Reference similarity

similarity.calibration

LogisticCalibration()

Performs logistic regression calibration.

fit(scores, hits)

Fit the logistic regression model to calibrate raw scores.

Parameters:

Name Type Description Default
scores ndarray

Raw uncalibrated scores.

required
hits ndarray

Ground truth binary labels.

required

predict(scores)

Predict calibrated scores using a fitted calibration model.

Parameters:

Name Type Description Default
scores ndarray

Raw uncalibrated scores.

required

Returns:

Name Type Description
prediction ndarray

Calibrated scores.
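
As an illustration of what logistic calibration does, the sketch below fits a one-dimensional logistic model p = sigmoid(a * score + b) with plain gradient descent in NumPy. The helper names `fit_logistic` and `predict_logistic`, the learning rate, and the iteration count are assumptions made for this example; they are not the LogisticCalibration implementation.

```python
import numpy as np

def fit_logistic(scores, hits, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    scores = np.asarray(scores, dtype=float)
    hits = np.asarray(hits, dtype=float)
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        grad = p - hits  # derivative of the log loss w.r.t. the logit
        a -= lr * np.mean(grad * scores)
        b -= lr * np.mean(grad)
    return a, b

def predict_logistic(scores, a, b):
    """Map raw scores to calibrated probabilities with fitted (a, b)."""
    scores = np.asarray(scores, dtype=float)
    return 1.0 / (1.0 + np.exp(-(a * scores + b)))

# Toy data: higher raw scores are more often true matches.
raw = np.array([0.1, 0.2, 0.3, 0.6, 0.7, 0.9])
labels = np.array([0, 0, 1, 0, 1, 1])
a, b = fit_logistic(raw, labels)
calibrated = predict_logistic(raw, a, b)
```

Because the fitted slope is positive on this data, the calibrated scores keep the ordering of the raw scores and lie strictly between 0 and 1.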

IsotonicCalibration(interpolate=True, strict=True)

Performs isotonic regression calibration for ranking.

Compared to standard isotonic regression, this implementation uses spline interpolation to ensure that the calibration curve is strictly increasing, which is necessary for ranking.

Parameters:

Name Type Description Default
interpolate bool

If True, use spline interpolation for calibration.

True
strict bool

If True, apply strict adjustment to predictions.

True

fit(scores, hits)

Fit the isotonic regression model to calibrate the scores.

Parameters:

Name Type Description Default
scores ndarray

Raw uncalibrated scores.

required
hits ndarray

Ground truth binary labels.

required

predict(scores)

Predict calibrated scores using a fitted calibration model.

Parameters:

Name Type Description Default
scores ndarray

Raw uncalibrated scores.

required

Returns:

Name Type Description
calibrated_scores ndarray

Calibrated scores.
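
To show why a strictly increasing curve matters for ranking, the sketch below implements standard isotonic regression (pool adjacent violators) with linear interpolation. This is a simplified stand-in, not the spline-based method described above, and the helper name `pav` is an assumption of this example.

```python
import numpy as np

def pav(y):
    """Pool adjacent violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block is [sum of values, count]
    for value in y:
        blocks.append([float(value), 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return np.concatenate([np.full(count, total / count) for total, count in blocks])

raw = np.array([0.1, 0.2, 0.3, 0.6, 0.7, 0.9])
labels = np.array([0, 0, 1, 0, 1, 1])
order = np.argsort(raw)
steps = pav(labels[order])  # non-decreasing fit over the sorted scores

# Calibrate new scores by linear interpolation between the fitted points.
new_scores = np.array([0.15, 0.45, 0.8])
calibrated = np.interp(new_scores, raw[order], steps)
```

The fitted curve here contains flat segments (for example, raw scores 0.3 and 0.6 both map to 0.5), so distinct raw scores can receive identical calibrated values. Such ties are what a strictly increasing calibration curve avoids.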

reliability_diagram(scores, hits, ax=None, skip_plot=False, num_bins=10, title='Reliability Diagram')

Calculates the expected calibration error (ECE) and plots a reliability diagram for the given scores and hits.

Parameters:

Name Type Description Default
scores ndarray

Raw uncalibrated scores.

required
hits ndarray

Ground truth binary labels.

required
ax Axes

Axes to plot the diagram on. If None, a new figure is created.

None
skip_plot bool

If True, only return ECE value.

False
num_bins int

Number of bins to divide the scores into.

10
title str

Title of the plot.

'Reliability Diagram'

Returns:

Name Type Description
ece float

Expected Calibration Error.
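
A minimal sketch of how ECE can be computed, assuming equal-width bins on [0, 1] and a weighted average of the gap between mean score and hit rate in each bin. The binning details of reliability_diagram may differ; the function name `expected_calibration_error` is an assumption of this example.

```python
import numpy as np

def expected_calibration_error(scores, hits, num_bins=10):
    """Weighted mean of |hit rate - mean score| over equal-width bins."""
    scores = np.asarray(scores, dtype=float)
    hits = np.asarray(hits, dtype=float)
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    # Interior edges give bin ids 0..num_bins-1; a score of 1.0 lands in the last bin.
    bin_ids = np.digitize(scores, edges[1:-1])
    ece = 0.0
    for b in range(num_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        confidence = scores[mask].mean()
        accuracy = hits[mask].mean()
        ece += mask.mean() * abs(accuracy - confidence)
    return ece

ece = expected_calibration_error([0.05, 0.15, 0.85, 0.95], [0, 0, 1, 1], num_bins=2)
ece_perfect = expected_calibration_error([0.0, 0.0, 1.0, 1.0], [0, 0, 1, 1], num_bins=2)
```

In the first call each bin is off by 0.1 (mean scores 0.1 and 0.9 against hit rates 0 and 1), so the ECE is 0.1; the second call is perfectly calibrated and gives 0.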

similarity.cosine

CosineSimilarity

Wraps cosine similarity to be usable in SimilarityPipeline.

__call__(query, database, **kwargs)

Calculates cosine similarity given query and database feature datasets.

Parameters:

Name Type Description Default
query FeatureDataset

Query dataset of deep features.

required
database FeatureDataset

Database dataset of deep features.

required

Returns:

Name Type Description
similarity ndarray

2D numpy array with cosine similarity.

cosine_similarity(a, b)

Calculate cosine similarity between two sets of vectors. PyTorch equivalent of sklearn.metrics.pairwise.cosine_similarity.
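
For reference, the computation can be sketched in NumPy as row-wise L2 normalization followed by a matrix product. The name `cosine_similarity_np` and the `eps` guard against zero-length vectors are assumptions of this example, not part of the documented function.

```python
import numpy as np

def cosine_similarity_np(a, b, eps=1e-12):
    """Pairwise cosine similarity between rows of a (n, d) and rows of b (m, d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a_unit @ b_unit.T  # shape (n, m)

query = np.array([[1.0, 0.0], [1.0, 1.0]])
database = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
similarity = cosine_similarity_np(query, database)
```

The result has one row per query vector and one column per database vector, matching the 2D layout returned by CosineSimilarity.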

similarity.wildfusion

SimilarityPipeline(matcher=None, extractor=None, calibration=None, transform=None)

Implements a pipeline for matching and calculating similarity scores between two image datasets.

Given two (query and database) image datasets, the pipeline consists of the following steps:

1. Apply image transforms.
2. Extract features for both datasets.
3. Compute similarity scores between query and database images.
4. Calibrate similarity scores.

Parameters:

Name Type Description Default
matcher callable

A matcher that computes scores between two feature datasets.

None
extractor callable

A function to extract features from the image datasets. Not needed for some matchers.

None
calibration callable

A calibration model to refine similarity scores.

None
transform callable

Image transformation function applied before feature extraction.

None

fit_calibration(dataset0, dataset1)

Fit the calibration model using two given image datasets. Fitting uses all possible pairs of images from the two datasets. The input scores are the similarity scores calculated by the matcher. The binary input labels are derived from the ground truth: 1 if both images have the same identity, 0 otherwise.

Parameters:

Name Type Description Default
dataset0 ImageDataset

The first dataset (e.g., part of training set).

required
dataset1 ImageDataset

The second dataset (e.g., part of training set).

required

__call__(dataset0, dataset1, pairs=None)

Compute similarity scores between two image datasets, with optional calibration.

Parameters:

Name Type Description Default
dataset0 ImageDataset

The first dataset (e.g., query set).

required
dataset1 ImageDataset

The second dataset (e.g., database set).

required
pairs list of tuples

Specific pairs of images to compute similarity scores. If None, compute similarity scores for all pairs.

None

Returns:

Type Description
ndarray

2D array of similarity scores between the query and database images. If a calibration model is provided, the calibrated similarity scores are returned.
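
The four pipeline steps can be sketched with plain callables. `ToyPipeline` below is a hypothetical stand-in written for this example (it works on lists of NumPy vectors rather than ImageDataset objects) and is not the SimilarityPipeline implementation.

```python
import numpy as np

class ToyPipeline:
    """Hypothetical illustration of transform -> extract -> match -> calibrate."""

    def __init__(self, matcher, extractor=None, calibration=None, transform=None):
        self.matcher = matcher
        self.extractor = extractor
        self.calibration = calibration
        self.transform = transform

    def __call__(self, dataset0, dataset1):
        # 1. Apply transforms to every item, if a transform is given.
        if self.transform is not None:
            dataset0 = [self.transform(x) for x in dataset0]
            dataset1 = [self.transform(x) for x in dataset1]
        # 2. Extract features; some matchers work on raw inputs directly.
        if self.extractor is not None:
            dataset0 = self.extractor(dataset0)
            dataset1 = self.extractor(dataset1)
        # 3. Compute similarity scores for all query/database pairs.
        scores = self.matcher(dataset0, dataset1)
        # 4. Calibrate the scores, if a calibration callable is given.
        if self.calibration is not None:
            scores = self.calibration(scores)
        return scores

pipeline = ToyPipeline(
    matcher=lambda f0, f1: f0 @ f1.T,
    extractor=lambda items: np.stack(items),
    calibration=lambda s: 1.0 / (1.0 + np.exp(-s)),
    transform=lambda x: x / np.linalg.norm(x),
)
query_items = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
database_items = [np.array([3.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
scores = pipeline(query_items, database_items)  # shape (2, 3)
```

Every optional component is skipped when it is None, mirroring the None defaults in the constructor signature above.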

WildFusion(calibrated_pipelines, priority_pipeline=None)

WildFusion calculates fused scores as the mean of the scores from multiple calibrated SimilarityPipeline objects.

Since many local feature matching models require deep neural network inference for each query and database pair, the computation quickly becomes infeasible even for moderately sized datasets.

WildFusion can be used with a limited computational budget by applying it only B times per query image. It uses a fast-to-compute similarity score (e.g., cosine similarity of deep features) provided by the priority_pipeline to construct a shortlist of the most promising matches for a given query. Final ranking is then based on WildFusion scores calculated for the pairs in the shortlist.

Parameters:

Name Type Description Default
calibrated_pipelines list[SimilarityPipeline]

List of SimilarityPipeline objects.

required
priority_pipeline SimilarityPipeline

Fast-to-compute similarity matcher used for shortlisting.

None

fit_calibration(dataset0, dataset1)

Fit the calibration models for all matchers in calibrated_pipelines.

Parameters:

Name Type Description Default
dataset0 ImageDataset

The first dataset (e.g., part of training set).

required
dataset1 ImageDataset

The second dataset (e.g., part of training set).

required

__call__(dataset0, dataset1, pairs=None, B=None)

Compute fused similarity scores between two image datasets using multiple calibrated matchers. The WildFusion score is calculated as the mean of the calibrated similarity scores.

Optionally, a shortlist strategy can be used to select the most promising pairs and limit the number of pairs for which scores are computed.

Parameters:

Name Type Description Default
dataset0 ImageDataset

The first dataset (e.g., query set).

required
dataset1 ImageDataset

The second dataset (e.g., database set).

required
pairs list of tuples

Specific pairs of images to compute similarity scores for. If None, similarity scores are computed for all pairs. Ignored if B is provided.

None
B int

Number of pairs to compute similarity scores for. Requires priority_pipeline to be set. If None, similarity scores are computed for all pairs.

None

Returns:

Name Type Description
score_combined ndarray

2D array of similarity scores between the query and database images. If calibration is provided, returns the calibrated similarity scores.
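
The shortlist strategy can be sketched as follows: rank database items per query by a cheap priority score, keep the top B, and average the calibrated scores only for those pairs. The function `fuse_with_shortlist`, the callable interface for the expensive scorers, and filling pairs outside the shortlist with negative infinity are all assumptions of this example.

```python
import numpy as np

def fuse_with_shortlist(priority_scores, calibrated_score_fns, B):
    """Average calibrated scores over the top-B database items per query.

    priority_scores: (n_query, n_database) array of cheap scores.
    calibrated_score_fns: callables taking (query_idx, database_idx) index
        arrays and returning calibrated scores for those pairs.
    """
    n_query, n_database = priority_scores.shape
    # Indices of the B highest-priority database items for each query.
    shortlist = np.argsort(-priority_scores, axis=1)[:, :B]
    rows = np.repeat(np.arange(n_query), shortlist.shape[1])
    cols = shortlist.ravel()
    # Assumption: pairs outside the shortlist get -inf so they rank last.
    fused = np.full((n_query, n_database), -np.inf)
    fused[rows, cols] = np.mean([fn(rows, cols) for fn in calibrated_score_fns], axis=0)
    return fused

priority = np.array([[0.9, 0.1, 0.5], [0.2, 0.8, 0.7]])
expensive_a = np.array([[0.6, 0.9, 0.8], [0.1, 0.4, 0.9]])
expensive_b = np.array([[0.4, 0.9, 0.6], [0.1, 0.6, 0.7]])
# Precomputed matrices stand in for expensive matchers in this toy example.
scorers = [lambda r, c, m=m: m[r, c] for m in (expensive_a, expensive_b)]
fused = fuse_with_shortlist(priority, scorers, B=2)
best_match = fused.argmax(axis=1)
```

With B=2, only four of the six pairs are scored by the expensive matchers. Pair (0, 1) is never evaluated even though its expensive scores would have been high, which illustrates the trade-off between budget and recall.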

similarity.pairwise.base

MatchPairs(batch_size=128, num_workers=0, tqdm_silent=False, collector=None)

Base class for matching pairs from two datasets. Every child class must implement the get_matches method, which processes batches of pairs.

Parameters:

Name Type Description Default
batch_size int

Number of pairs processed in one batch.

128
num_workers int

Number of workers used for data loading.

0
tqdm_silent (int, bool)

If True, progress bar is disabled.

False
collector (int, CollectCounts)

Collector object used for storing results.

None

__call__(dataset0, dataset1, pairs=None)

Match pairs of features from two feature datasets. Output for each pair is stored and processed using the collector.

Parameters:

Name Type Description Default
dataset0 FeatureDataset

First dataset (e.g. query).

required
dataset1 FeatureDataset

Second dataset (e.g. database).

required
pairs ndarray | None

Numpy array with pairs of indexes. If None, all pairs are used.

None

Returns:

Name Type Description
results dict

Exact output is determined by the used collector.

get_matches(batch)

Process batch and get matches of pairs for the batch. Implemented in child classes.

Parameters:

Name Type Description Default
batch tuple

4-tuple with indexes and data from PairDataset.

required

Returns:

Name Type Description
results List[dict]

List of standardized dictionaries with keys: idx0, idx1, score, kpts0, kpts1. The length of the list is equal to the batch size.
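
The sketch below shows the shape of the output a get_matches implementation is expected to produce. The batch layout (idx0, idx1, data0, data1) and the toy matching rule are assumptions of this example; an actual child class would run a matching model on the batch instead.

```python
import numpy as np

def toy_get_matches(batch):
    """Return one standardized dictionary per pair in the batch."""
    idx0, idx1, data0, data1 = batch  # assumed 4-tuple layout
    results = []
    for i0, i1, kpts0, kpts1 in zip(idx0, idx1, data0, data1):
        # Toy rule: pair keypoints by position, score by their distance.
        n = min(len(kpts0), len(kpts1))
        dist = np.linalg.norm(kpts0[:n] - kpts1[:n], axis=1)
        results.append({
            "idx0": i0,
            "idx1": i1,
            "score": np.exp(-dist),  # confidence in (0, 1] per match
            "kpts0": kpts0[:n],
            "kpts1": kpts1[:n],
        })
    return results

kpts_a = np.array([[0.0, 0.0], [1.0, 1.0]])
kpts_b = kpts_a.copy()
kpts_c = np.array([[0.0, 1.0], [1.0, 1.0], [5.0, 5.0]])
batch = ([0, 0], [1, 2], [kpts_a, kpts_a], [kpts_b, kpts_c])
results = toy_get_matches(batch)
```

Each dictionary carries the pair indexes alongside per-match scores and keypoints, so a collector can place the result in the right cell of the query-by-database grid.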

similarity.pairwise.lightglue

MatchLightGlue(features, init_threshold=0.1, device=None, **kwargs)

Bases: MatchPairs

Implements matching using LightGlue model correspondences. Introduced in: "LightGlue: Local Feature Matching at Light Speed"

Parameters:

Name Type Description Default
features str

Features used for matching. Options: 'sift', 'superpoint', 'aliked', 'disk'. Must match extracted features from the dataset.

required
init_threshold float

Keep matches only over this threshold. Matches with lower values are not passed to the collector.

0.1
device str

Device used for inference. Defaults to None.

None

similarity.pairwise.loftr

MatchLOFTR(pretrained='outdoor', init_threshold=0.2, device=None, apply_fine=False, **kwargs)

Bases: MatchPairs

Implements matching pairs using LoFTR model correspondences. Introduced in: "LoFTR: Detector-Free Local Feature Matching with Transformers"

Parameters:

Name Type Description Default
pretrained str

LoFTR model used: 'outdoor' or 'indoor'.

'outdoor'
device str | None

Specifies device used for the inference.

None
init_threshold float

Keep matches only over this threshold.

0.2
apply_fine bool

Use LoFTR fine refinement of keypoint locations. Refinement has no effect on confidence, and matching is faster without it.

False

similarity.pairwise.collectors

CollectAll(**kwargs)

Collect the results without additional processing. The collected data is a list of matcher results, one for each pair. Useful for keypoint visualizations.

CollectCounts(grid_dtype='float16', thresholds=(0.5,), **kwargs)

Collect the count of significant matches for the given confidence thresholds. Output is stored in an [n_query x n_database] grid.

If multiple thresholds are provided, returns a dictionary with each threshold as a key and the corresponding grid as value.

Parameters:

Name Type Description Default
grid_dtype str

Data type of the output grid.

'float16'
thresholds tuple

Confidence thresholds for counting.

(0.5,)
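
A minimal sketch of the counting logic, assuming matcher results in the standardized dictionary form (idx0, idx1, score) and a strict "greater than" comparison against each threshold. The function `collect_counts` is hypothetical and only illustrates the grid layout.

```python
import numpy as np

def collect_counts(results, n_query, n_database, thresholds=(0.5,), grid_dtype="float16"):
    """Count matches above each threshold into an [n_query x n_database] grid."""
    grids = {t: np.zeros((n_query, n_database), dtype=grid_dtype) for t in thresholds}
    for r in results:
        for t in thresholds:
            # Assumption: a match is significant when its score exceeds t.
            grids[t][r["idx0"], r["idx1"]] = np.sum(r["score"] > t)
    # A single threshold yields one grid; several yield a dict keyed by threshold.
    return grids[thresholds[0]] if len(thresholds) == 1 else grids

results = [
    {"idx0": 0, "idx1": 1, "score": np.array([0.9, 0.6, 0.2])},
    {"idx0": 1, "idx1": 0, "score": np.array([0.4, 0.95])},
]
single = collect_counts(results, n_query=2, n_database=2)
multi = collect_counts(results, n_query=2, n_database=2, thresholds=(0.5, 0.8))
```

The resulting counts can serve directly as similarity scores: pairs with more confident correspondences receive larger values.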

CollectCountsRansac(grid_dtype='float16', ransacReprojThreshold=1.0, method=cv2.USAC_MAGSAC, confidence=0.999, maxIters=100, **kwargs)

Bases: CollectCounts

Collect the count of RANSAC inliers of the fundamental matrix estimate. Output is stored in an [n_query x n_database] grid.

Parameters:

Name Type Description Default
grid_dtype str

Data type of the output grid.

'float16'
ransacReprojThreshold float

OpenCV RANSAC reprojection threshold.

1.0
method Any

OpenCV RANSAC method.

cv2.USAC_MAGSAC
confidence float

OpenCV RANSAC confidence.

0.999
maxIters float

OpenCV RANSAC max iterations.

100