STAG Python  2.0.2
Spectral Toolkit of Algorithms for Graphs
Loading...
Searching...
No Matches
stag.lsh.E2LSH Class Reference

Description

A Euclidean locality sensitive hash table.

The E2LSH hash table is constructed with some set of data points, which are hashed with several copies of the stag::LSHFunction.

Then, for any query point, the data structure returns the points in the original dataset which are close to the query point. The probability that a given point \(x\) in the data set is returned for query \(q\) is dependent on the distance between \(q\) and \(x\).

The E2LSH hash table takes two parameters, K and L, which control the probability that two points will collide in the hash table. For query point \(q\), a data point \(x\) at distance \(c \in \mathbb{R}\) from \(q\) is returned with probability

\[ 1 - (1 - p(c)^K)^L, \]

where \(p(c)\) is the probability that a single stag::LSHFunction will hash \(q\) and \(x\) to the same value. This probability can be computed with the stag::E2LSH::collision_probability method.

Larger values of K and L will increase both the construction and query time of the hash table.

Public Member Functions

def __init__ (self, int K, int L, List[stag.data.DataPoint] dataset)
 
List[stag.data.DataPointget_near_neighbors (self, stag.data.DataPoint query)
 Query the LSH table to find the near neighbors of a given query point.
 
float collision_probability (self, float distance)
 Compute the probability that a data point at a given distance from a query point will be returned by this hash table.
 

Public Attributes

 K
 
 L
 
 internal_e2lsh
 

Constructor & Destructor Documentation

◆ __init__()

def stag.lsh.E2LSH.__init__ (   self,
int  K,
int  L,
List[stag.data.DataPoint dataset 
)

Member Function Documentation

◆ get_near_neighbors()

List[stag.data.DataPoint] stag.lsh.E2LSH.get_near_neighbors (   self,
stag.data.DataPoint  query 
)

Query the LSH table to find the near neighbors of a given query point.

Each point in the dataset will be returned with some probability dependent on the distance to the query point and the parameters K and L.

Parameters
querythe data point to be queried
Returns
a list of stag.data.DataPoint objects representing the colliding data points.

◆ collision_probability()

float stag.lsh.E2LSH.collision_probability (   self,
float  distance 
)

Compute the probability that a data point at a given distance from a query point will be returned by this hash table.

Parameters
distancethe distance between a query point and data point

Member Data Documentation

◆ K

stag.lsh.E2LSH.K

◆ L

stag.lsh.E2LSH.L

◆ internal_e2lsh

stag.lsh.E2LSH.internal_e2lsh