STAG C++  2.0.0
Spectral Toolkit of Algorithms for Graphs
Loading...
Searching...
No Matches
stag::E2LSH Class Reference

#include <lsh.h>

Description

A Euclidean locality sensitive hash table.

The E2LSH hash table is constructed with some set of data points, which are hashed with several copies of the stag::LSHFunction.

Then, for any query point, the data structure returns the points in the original dataset which are close to the query point. The probability that a given point \(x\) in the data set is returned for query \(q\) is dependent on the distance between \(q\) and \(x\).

The E2LSH hash table takes two parameters, K and L, which control the probability that two points will collide in the hash table. For query point \(q\), a data point \(x\) at distance \(c \in \mathbb{R}\) from \(q\) is returned with probability

\[ 1 - (1 - p(c)^K)^L, \]

where \(p(c)\) is the probability that a single stag::LSHFunction will hash \(q\) and \(x\) to the same value. This probability can be computed with the stag::E2LSH::collision_probability method.

Larger values of K and L will increase both the construction and query time of the hash table.

Public Member Functions

 E2LSH (StagUInt K, StagUInt L, std::vector< DataPoint > &dataSet)
 
std::vector< DataPointget_near_neighbors (const DataPoint &query)
 
StagReal collision_probability (StagReal distance)
 

Static Public Member Functions

static StagReal collision_probability (StagUInt K, StagUInt L, StagReal distance)
 

Constructor & Destructor Documentation

◆ E2LSH()

stag::E2LSH::E2LSH ( StagUInt  K,
StagUInt  L,
std::vector< DataPoint > &  dataSet 
)

Initialise the E2LSH hash table.

Parameters
Kparameter K of the hash table
Lparameter L of the hash table
dataSeta pointer to the dataSet to be hashed into the hash table. The actual data should be stored and controlled by the calling code, and this vector of data point pointers will be used by the LSH table.

Member Function Documentation

◆ get_near_neighbors()

std::vector< DataPoint > stag::E2LSH::get_near_neighbors ( const DataPoint query)

Query the LSH table to find the near neighbors of a given query point.

Each point in the dataset will be returned with some probability dependent on the distance to the query point and the parameters K and L.

Parameters
querythe data point to be queried.
Returns

◆ collision_probability() [1/2]

StagReal stag::E2LSH::collision_probability ( StagReal  distance)

Compute the probability that a data point at a given distance from a query point will be returned by this hash table.

Parameters
distancethe distance between a query point and data point
Returns

◆ collision_probability() [2/2]

static StagReal stag::E2LSH::collision_probability ( StagUInt  K,
StagUInt  L,
StagReal  distance 
)
static

For a given value of K and L, return the probability that a point at a given distance from a query point will be returned by an E2LSH table with the given parameters.

Parameters
Kthe parameter K of the ELSH table
Lthe parameter L of the E2LSH table
distancethe distance between a query point and data point
Returns