![]() |
STAG Python
2.0.2
Spectral Toolkit of Algorithms for Graphs
|
A Euclidean locality sensitive hash table.
The E2LSH hash table is constructed with some set of data points, which are hashed with several copies of the stag::LSHFunction.
Then, for any query point, the data structure returns the points in the original dataset which are close to the query point. The probability that a given point \(x\) in the data set is returned for query \(q\) is dependent on the distance between \(q\) and \(x\).
The E2LSH hash table takes two parameters, K and L, which control the probability that two points will collide in the hash table. For query point \(q\), a data point \(x\) at distance \(c \in \mathbb{R}\) from \(q\) is returned with probability
\[ 1 - (1 - p(c)^K)^L, \]
where \(p(c)\) is the probability that a single stag::LSHFunction will hash \(q\) and \(x\) to the same value. This probability can be computed with the stag::E2LSH::collision_probability method.
Larger values of K and L will increase both the construction and query time of the hash table.
Public Member Functions | |
| def | __init__ (self, int K, int L, List[stag.data.DataPoint] dataset) |
| List[stag.data.DataPoint] | get_near_neighbors (self, stag.data.DataPoint query) |
| Query the LSH table to find the near neighbors of a given query point. | |
| float | collision_probability (self, float distance) |
| Compute the probability that a data point at a given distance from a query point will be returned by this hash table. | |
Public Attributes | |
| K | |
| L | |
| internal_e2lsh | |
| def stag.lsh.E2LSH.__init__ | ( | self, | |
| int | K, | ||
| int | L, | ||
| List[stag.data.DataPoint] | dataset | ||
| ) |
| List[stag.data.DataPoint] stag.lsh.E2LSH.get_near_neighbors | ( | self, | |
| stag.data.DataPoint | query | ||
| ) |
Query the LSH table to find the near neighbors of a given query point.
Each point in the dataset will be returned with some probability dependent on the distance to the query point and the parameters K and L.
| query | the data point to be queried |
| float stag.lsh.E2LSH.collision_probability | ( | self, | |
| float | distance | ||
| ) |
Compute the probability that a data point at a given distance from a query point will be returned by this hash table.
| distance | the distance between a query point and data point |
| stag.lsh.E2LSH.K |
| stag.lsh.E2LSH.L |
| stag.lsh.E2LSH.internal_e2lsh |