![]() |
STAG Python
2.0.2
Spectral Toolkit of Algorithms for Graphs
|
A Euclidean locality sensitive hash table.
The E2LSH hash table is constructed with some set of data points, which are hashed with several copies of the stag::LSHFunction.
Then, for any query point, the data structure returns the points in the original dataset which are close to the query point. The probability that a given point \(x\) in the data set is returned for query \(q\) is dependent on the distance between \(q\) and \(x\).
The E2LSH hash table takes two parameters, K and L, which control the probability that two points will collide in the hash table. For query point \(q\), a data point \(x\) at distance \(c \in \mathbb{R}\) from \(q\) is returned with probability
\[ 1 - (1 - p(c)^K)^L, \]
where \(p(c)\) is the probability that a single stag::LSHFunction will hash \(q\) and \(x\) to the same value. This probability can be computed with the stag::E2LSH::collision_probability method.
Larger values of K and L will increase both the construction and query time of the hash table.
Public Member Functions | |
def | __init__ (self, int K, int L, List[stag.data.DataPoint] dataset) |
List[stag.data.DataPoint] | get_near_neighbors (self, stag.data.DataPoint query) |
Query the LSH table to find the near neighbors of a given query point. | |
float | collision_probability (self, float distance) |
Compute the probability that a data point at a given distance from a query point will be returned by this hash table. | |
Public Attributes | |
K | |
L | |
internal_e2lsh | |
def stag.lsh.E2LSH.__init__ | ( | self, | |
int | K, | ||
int | L, | ||
List[stag.data.DataPoint] | dataset | ||
) |
List[stag.data.DataPoint] stag.lsh.E2LSH.get_near_neighbors | ( | self, | |
stag.data.DataPoint | query | ||
) |
Query the LSH table to find the near neighbors of a given query point.
Each point in the dataset will be returned with some probability dependent on the distance to the query point and the parameters K and L.
query | the data point to be queried |
float stag.lsh.E2LSH.collision_probability | ( | self, | |
float | distance | ||
) |
Compute the probability that a data point at a given distance from a query point will be returned by this hash table.
distance | the distance between a query point and data point |
stag.lsh.E2LSH.K |
stag.lsh.E2LSH.L |
stag.lsh.E2LSH.internal_e2lsh |