![]() |
STAG C++
2.0.0
Spectral Toolkit of Algorithms for Graphs
|
Methods for computing approximate kernel density estimation.
Given some kernel function \(k: \mathbb{R}^d \times \mathbb{R}^d \rightarrow \mathbb{R}\), a set of data points \(x_1, \ldots, x_n \in \mathbb{R}^d\), and a query point \(q \in \mathbb{R}^d\), the kernel density of \(q\) is given by
\[ K(q) = \frac{1}{n} \sum_{i = 1}^n k(q, x_i). \]
A common kernel function is the Gaussian kernel function
\[ k(u, v) = \exp\left(- a \|u - v\|_2^2\right), \]
where \(a \geq 0\) is a parameter controlling the 'bandwidth' of the kernel.
Computing the kernel density for a query point exactly requires computing the distance from the query point to every data point, requiring \(\Omega(n d)\) time. This motivates the study of kernel density estimation, in which the goal is to estimate the kernel density within some error tolerance, in faster time than computing it exactly. Specifically, given some error parameter \(\epsilon\), a kernel density estimation algorithm will return \(\mathrm{KDE}(q)\) for some query point \(q\) such that
\[ (1 - \epsilon) K(q) \leq \mathrm{KDE}(q) \leq (1 + \epsilon) K(q). \]
This module provides the stag::CKNSGaussianKDE data structure which takes \(O(\epsilon^{-1} n^{1.25})\) time for initialisation, and can then provide KDE estimates in time \(O(\epsilon^{-2} n^{0.25})\) for each query.
Classes | |
class | stag::CKNSGaussianKDE |
A CKNS Gaussian KDE data structure. More... | |
Functions | |
StagReal | stag::gaussian_kernel (StagReal a, const stag::DataPoint &u, const stag::DataPoint &v) |
StagReal | stag::gaussian_kernel (StagReal a, StagReal c) |
StagReal stag::gaussian_kernel | ( | StagReal | a, |
const stag::DataPoint & | u, | ||
const stag::DataPoint & | v | ||
) |
Compute the Gaussian kernel similarity between the points u and v.
Given a parameter \(a \geq 0\) and points \(u, v \in \mathbb{R}^n\), the Gaussian kernel similarity between \(u\) and \(v\) is given by
\[ k(u, v) = \exp\left( - a \|u - v\|^2_2 \right). \]
Note that the Gaussian kernel is sometimes parameterised by \(\sigma^2\), which is related to our parameter \(a\) by
\[ a = \frac{1}{\sigma^2}. \]
a | the parameter a in the Gaussian kernel. |
u | a data point \(u\) |
v | a data point \(v\) |
Compute the Gaussian kernel similarity for two points at a squared distance \(c\).
Given a parameter \(a \geq 0\), the Gaussian kernel similarity between two points at distance \(c\) is given by
\[ \exp\left( - a c \right). \]
a | the parameter a in the Gaussian kernel. |
c | the squared distance between two points. |