Reference similarity
similarity.calibration
LogisticCalibration()
Performs logistic regression calibration.
fit(scores, hits)
Fit the logistic regression model to calibrate raw scores.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scores | ndarray | Raw uncalibrated scores. | required |
| hits | ndarray | Ground truth binary labels. | required |
predict(scores)
Predict calibrated scores using a fitted calibration model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scores | ndarray | Raw uncalibrated scores. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| prediction | ndarray | Calibrated scores. |
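To illustrate what logistic calibration does, here is a minimal, dependency-free sketch. It is not the library's implementation: the helper names `fit_logistic` and `predict_logistic`, the plain gradient-descent fit, and the toy data are all assumptions made for illustration. It fits a sigmoid `p = 1 / (1 + exp(-(a * score + b)))` to binary hits and then maps raw scores to calibrated probabilities.

```python
import math

def fit_logistic(scores, hits, lr=0.5, epochs=2000):
    """Hypothetical helper: fit p = sigmoid(a * score + b) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, hits):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s  # derivative of log loss w.r.t. slope
            grad_b += (p - y)      # derivative of log loss w.r.t. intercept
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def predict_logistic(scores, a, b):
    """Hypothetical helper: map raw scores to calibrated probabilities."""
    return [1.0 / (1.0 + math.exp(-(a * s + b))) for s in scores]

# Toy data (made up): higher raw scores are more often true matches.
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
hits = [0, 0, 0, 1, 0, 1, 1, 1]
a, b = fit_logistic(scores, hits)
calibrated = predict_logistic(scores, a, b)
```

Because the fitted slope is positive for this data, the mapping preserves the ordering of the raw scores and only changes their scale.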
IsotonicCalibration(interpolate=True, strict=True)
Performs isotonic regression calibration for ranking.
Compared to standard isotonic regression, this implementation uses spline interpolation to ensure that the calibration curve is strictly increasing, which is necessary for ranking.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| interpolate | bool | If True, use spline interpolation for calibration. | True |
| strict | bool | If True, apply strict adjustment to predictions. | True |
fit(scores, hits)
Fit the isotonic regression model to calibrate the scores.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scores | ndarray | Raw uncalibrated scores. | required |
| hits | ndarray | Ground truth binary labels. | required |
predict(scores)
Predict calibrated scores using a fitted calibration model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scores | ndarray | Raw uncalibrated scores. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| calibrated_scores | ndarray | Calibrated scores. |
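For context, the sketch below shows standard isotonic regression using the pool-adjacent-violators algorithm. It deliberately leaves out the interpolation and strict adjustment that this class adds. The function name `pav` and the toy data are assumptions made for illustration, not part of the library.

```python
def pav(values):
    """Hypothetical helper: non-decreasing least-squares fit via pool adjacent violators."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Toy data (made up): hits must be ordered by increasing raw score before fitting.
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
hits = [0, 0, 0, 1, 0, 1, 1, 1]
order = sorted(range(len(scores)), key=lambda i: scores[i])
fitted = pav([hits[i] for i in order])
```

The result contains plateaus, so several distinct raw scores receive the same calibrated value. Such ties are what a strictly increasing calibration curve avoids when the output is used for ranking.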
reliability_diagram(scores, hits, ax=None, skip_plot=False, num_bins=10, title='Reliability Diagram')
Calculates the expected calibration error (ECE) and plots a reliability diagram for a given set of scores and hits.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scores | ndarray | Raw uncalibrated scores. | required |
| hits | ndarray | Ground truth binary labels. | required |
| ax | Axes | Axes to plot the diagram on. If None, a new figure is created. | None |
| skip_plot | bool | If True, only return ECE value. | False |
| num_bins | int | Number of bins to divide the scores into. | 10 |
| title | str | Title of the plot. | 'Reliability Diagram' |
Returns:
| Name | Type | Description |
|---|---|---|
| ece | float | Expected Calibration Error. |
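As a rough illustration of the quantity returned, the sketch below computes ECE as the bin-size-weighted average gap between the mean score and the hit rate in each bin. It assumes scores in [0, 1] and equal-width bins. The library's exact binning may differ, and the function name `expected_calibration_error` is hypothetical.

```python
def expected_calibration_error(scores, hits, num_bins=10):
    """Hypothetical helper: ECE with equal-width bins over [0, 1]."""
    n = len(scores)
    bins = [[] for _ in range(num_bins)]
    for s, h in zip(scores, hits):
        # A score of exactly 1.0 falls into the last bin.
        idx = min(int(s * num_bins), num_bins - 1)
        bins[idx].append((s, h))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        confidence = sum(s for s, _ in members) / len(members)  # mean score in bin
        accuracy = sum(h for _, h in members) / len(members)    # hit rate in bin
        ece += len(members) / n * abs(accuracy - confidence)
    return ece

# Toy data (made up): two bins, each off by 0.25.
ece = expected_calibration_error([0.25, 0.25, 0.75, 0.75], [0, 1, 1, 1], num_bins=2)
```

A value of 0 means that, within every bin, the mean score equals the observed hit rate.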
similarity.cosine
CosineSimilarity
Wraps cosine similarity to be usable in SimilarityPipeline.
__call__(query, database, **kwargs)
Calculates cosine similarity given query and database feature datasets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | FeatureDataset | Query dataset of deep features. | required |
| database | FeatureDataset | Database dataset of deep features. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| similarity | ndarray | 2D numpy array with cosine similarity. |
cosine_similarity(a, b)
Calculate cosine similarity between two sets of vectors.
PyTorch equivalent of sklearn.metrics.pairwise.cosine_similarity.
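The computation itself can be sketched without any dependencies: normalize every vector to unit length, then take all pairwise dot products. The function name `pairwise_cosine` is hypothetical, and the sketch assumes that no input vector is all zeros.

```python
import math

def pairwise_cosine(a, b):
    """Hypothetical helper: [len(a) x len(b)] matrix of cosine similarities."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))  # assumes norm > 0
        return [x / norm for x in v]

    a_unit = [normalize(v) for v in a]
    b_unit = [normalize(v) for v in b]
    # Dot product of unit vectors equals the cosine of the angle between them.
    return [[sum(x * y for x, y in zip(u, w)) for w in b_unit] for u in a_unit]

# Toy vectors (made up): two query vectors against three database vectors.
sim = pairwise_cosine([[1.0, 0.0], [1.0, 1.0]], [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
```

Row `i`, column `j` of the result compares query vector `i` with database vector `j`, which matches the 2D layout described above.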
similarity.wildfusion
SimilarityPipeline(matcher=None, extractor=None, calibration=None, transform=None)
Implements pipeline for matching and calculating similarity scores between two image datasets.
Given two (query and database) image datasets, the pipeline consists of the following steps:
1. Apply image transforms.
2. Extract features for both datasets.
3. Compute similarity scores between query and database images.
4. Calibrate similarity scores.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| matcher | callable | A matcher that computes scores between two feature datasets. | None |
| extractor | callable | A function to extract features from the image datasets. Not needed for some matchers. | None |
| calibration | callable | A calibration model to refine similarity scores. | None |
| transform | callable | Image transformation function applied before feature extraction. | None |
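The four steps listed above can be sketched as a plain function. This is only an illustration of the control flow. The function `run_pipeline` is hypothetical, datasets are plain lists instead of ImageDataset objects, and every callable is a toy stand-in rather than a real matcher or extractor.

```python
def run_pipeline(dataset0, dataset1, matcher, extractor=None, calibration=None, transform=None):
    """Hypothetical stand-in mirroring the four documented steps with plain lists."""
    # 1. Apply image transforms (optional).
    if transform is not None:
        dataset0 = [transform(x) for x in dataset0]
        dataset1 = [transform(x) for x in dataset1]
    # 2. Extract features for both datasets (optional, some matchers do not need it).
    if extractor is not None:
        dataset0 = extractor(dataset0)
        dataset1 = extractor(dataset1)
    # 3. Compute similarity scores between query and database items.
    scores = matcher(dataset0, dataset1)
    # 4. Calibrate similarity scores (optional).
    if calibration is not None:
        scores = [[calibration(s) for s in row] for row in scores]
    return scores

# Toy usage (made up): numbers act as "images", the product acts as a "matcher".
scores = run_pipeline(
    [-1.0, 2.0],
    [3.0],
    matcher=lambda q, d: [[a * b for b in d] for a in q],
    transform=abs,
    calibration=lambda s: s / 10.0,
)
```

Steps whose callable is left as None are simply skipped, which mirrors the None defaults in the parameter table.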
fit_calibration(dataset0, dataset1)
Fit the calibration model using two given image datasets. Fitting uses all possible pairs of images from the two datasets. The input scores are the similarity scores calculated by the matcher. The binary input labels are derived from the ground truth (whether the two images share the same identity).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset0 | ImageDataset | The first dataset (e.g., part of training set). | required |
| dataset1 | ImageDataset | The second dataset (e.g., part of training set). | required |
__call__(dataset0, dataset1, pairs=None)
Compute similarity scores between two image datasets, with optional calibration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset0 | ImageDataset | The first dataset (e.g., query set). | required |
| dataset1 | ImageDataset | The second dataset (e.g., database set). | required |
| pairs | list of tuples | Specific pairs of images to compute similarity scores. If None, compute similarity scores for all pairs. | None |
Returns:
| Type | Description |
|---|---|
| ndarray | 2D array of similarity scores between the query and database images. If |
WildFusion(calibrated_pipelines, priority_pipeline=None)
WildFusion uses the mean of multiple calibrated SimilarityPipeline scores to calculate fused scores.
Since many local feature matching models require deep neural network inference for each query and database pair, the computation quickly becomes infeasible even for moderately sized datasets.
WildFusion can be used with a limited computational budget by applying it only B times per query image. It uses a fast-to-compute similarity score (e.g., cosine similarity of deep features) provided by the priority_pipeline to construct a shortlist of the most promising matches for a given query. Final ranking is then based on WildFusion scores calculated for the pairs in the shortlist.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| calibrated_pipelines | list[SimilarityPipeline] | List of SimilarityPipeline objects. | required |
| priority_pipeline | SimilarityPipeline | Fast-to-compute similarity matcher used for shortlisting. | None |
fit_calibration(dataset0, dataset1)
Fit the calibration models for all matchers in calibrated_pipelines.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset0 | ImageDataset | The first dataset (e.g., part of training set). | required |
| dataset1 | ImageDataset | The second dataset (e.g., part of training set). | required |
__call__(dataset0, dataset1, pairs=None, B=None)
Compute fused similarity scores between two image datasets using multiple calibrated matchers. The WildFusion score is calculated as the mean of the calibrated similarity scores.
Optionally, a shortlist strategy can be used to select the most promising pairs and so limit the number of pairs to compute.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset0 | ImageDataset | The first dataset (e.g., query set). | required |
| dataset1 | ImageDataset | The second dataset (e.g., database set). | required |
| pairs | list of tuples | Specific pairs of images to compute similarity scores. If None, compute similarity scores for all pairs. Is ignored if | None |
| B | int | Number of pairs to compute similarity scores for. Required | None |
Returns:
| Name | Type | Description |
|---|---|---|
| score_combined | ndarray | 2D array of similarity scores between the query and database images. If |
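The shortlist idea described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code. The function `fuse_scores` is hypothetical, each pipeline is modelled as a callable `(i, j) -> calibrated score`, and marking pairs outside the shortlist with negative infinity is a choice made for this sketch only.

```python
def fuse_scores(n_query, n_database, pipelines, priority=None, B=None):
    """Hypothetical helper: mean of calibrated scores, optionally on a top-B shortlist."""
    fused = [[float("-inf")] * n_database for _ in range(n_query)]
    for i in range(n_query):
        candidates = list(range(n_database))
        if priority is not None and B is not None:
            # Rank database items by the cheap priority score and keep the best B.
            candidates.sort(key=lambda j: priority(i, j), reverse=True)
            candidates = candidates[:B]
        for j in candidates:
            # Expensive pipelines are evaluated only for shortlisted pairs.
            fused[i][j] = sum(p(i, j) for p in pipelines) / len(pipelines)
    return fused

# Toy score tables (made up) standing in for two calibrated pipelines and a priority score.
m1 = [[0.2, 0.8, 0.4]]
m2 = [[0.4, 0.6, 0.0]]
prio = [[0.9, 0.8, 0.1]]
fused = fuse_scores(
    1, 3,
    pipelines=[lambda i, j: m1[i][j], lambda i, j: m2[i][j]],
    priority=lambda i, j: prio[i][j],
    B=2,
)
```

With `B=2`, each expensive pipeline is called twice per query instead of three times. On realistic database sizes this difference is what keeps the budget bounded.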
similarity.pairwise.base
MatchPairs(batch_size=128, num_workers=0, tqdm_silent=False, collector=None)
Base class for matching pairs from two datasets.
Any child class needs to implement the get_matches method, which processes batches of pairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | int | Number of pairs processed in one batch. | 128 |
| num_workers | int | Number of workers used for data loading. | 0 |
| tqdm_silent | (int, bool) | If True, progress bar is disabled. | False |
| collector | (int, CollectCounts) | Collector object used for storing results. | None |
__call__(dataset0, dataset1, pairs=None)
Match pairs of features from two feature datasets. Output for each pair is stored and processed using the collector.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset0 | FeatureDataset | First dataset (e.g. query). | required |
| dataset1 | FeatureDataset | Second dataset (e.g. database). | required |
| pairs | ndarray \| None | Numpy array with pairs of indexes. If None, all pairs are used. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| results | dict | Exact output is determined by the used collector. |
get_matches(batch)
Process batch and get matches of pairs for the batch. Implemented in child classes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | tuple | 4-tuple with indexes and data from PairDataset. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| results | List[dict] | List of standardized dictionaries with keys: idx0, idx1, score, kpts0, kpts1. The length of the list equals the batch size. |
similarity.pairwise.lightglue
MatchLightGlue(features, init_threshold=0.1, device=None, **kwargs)
Bases: MatchPairs
Implements matching using LightGlue model correspondences. Introduced in: "LightGlue: Local Feature Matching at Light Speed"
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| features | str | Features used for matching. Options: 'sift', 'superpoint', 'aliked', 'disk'. Must match extracted features from the dataset. | required |
| init_threshold | float | Keep matches only over this threshold. Matches with lower values are not passed to the collector. | 0.1 |
| device | str | Device used for inference. Defaults to None. | None |
similarity.pairwise.loftr
MatchLOFTR(pretrained='outdoor', init_threshold=0.2, device=None, apply_fine=False, **kwargs)
Bases: MatchPairs
Implements matching pairs using LoFTR model correspondences. Introduced in: "LoFTR: Detector-Free Local Feature Matching with Transformers"
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pretrained | str | LoFTR model used. | 'outdoor' |
| device | str \| None | Specifies device used for the inference. | None |
| init_threshold | float | Keep matches only over this threshold. | 0.2 |
| apply_fine | bool | Use LoFTR fine refinement of keypoint locations. Refinement has no effect on confidence, and matching is faster without it. | False |
similarity.pairwise.collectors
CollectAll(**kwargs)
Collect the results without additional processing. The collected data is a list of matcher results, one for each pair. Useful for keypoint visualizations.
CollectCounts(grid_dtype='float16', thresholds=(0.5,), **kwargs)
Collect count of significant matches given confidence thresholds. Output is stored in [n_query x n_database] grid.
If multiple thresholds are provided, returns a dictionary with each threshold as a key and the corresponding grid as value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| grid_dtype | str | Data type of the output grid. | 'float16' |
| thresholds | tuple | Confidence thresholds for counting. | (0.5,) |
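A minimal sketch of the counting step follows. It is an assumption-laden illustration, not the collector's code. The function `count_matches` is hypothetical, each result is a dictionary with `idx0`, `idx1` and a list of per-match confidences under `score`, and a match is counted when its confidence is strictly above the threshold (the library's exact comparison is not stated here).

```python
def count_matches(results, n_query, n_database, thresholds=(0.5,)):
    """Hypothetical helper: per-pair count of matches above each confidence threshold."""
    grids = {t: [[0] * n_database for _ in range(n_query)] for t in thresholds}
    for r in results:
        for t in thresholds:
            # Assumption: strictly greater than the threshold counts as significant.
            grids[t][r["idx0"]][r["idx1"]] = sum(1 for s in r["score"] if s > t)
    if len(thresholds) == 1:
        return grids[thresholds[0]]  # single threshold: return the grid itself
    return grids                     # multiple thresholds: dictionary keyed by threshold

# Toy matcher output (made up) for two pairs.
results = [
    {"idx0": 0, "idx1": 1, "score": [0.9, 0.6, 0.3]},
    {"idx0": 1, "idx1": 0, "score": [0.2]},
]
grid = count_matches(results, n_query=2, n_database=2)
grids = count_matches(results, n_query=2, n_database=2, thresholds=(0.25, 0.5))
```

Pairs that never appear in `results` keep a count of zero, so the grid always has the full [n_query x n_database] shape.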
CollectCountsRansac(grid_dtype='float16', ransacReprojThreshold=1.0, method=cv2.USAC_MAGSAC, confidence=0.999, maxIters=100, **kwargs)
Bases: CollectCounts
Collect count of RANSAC inliers of fundamental matrix estimate. Output is stored in [n_query x n_database] grid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| grid_dtype | str | Data type of the output grid. | 'float16' |
| ransacReprojThreshold | float | OpenCV RANSAC reprojection threshold. | 1.0 |
| method | Any | OpenCV RANSAC method. | USAC_MAGSAC |
| confidence | float | OpenCV RANSAC confidence. | 0.999 |
| maxIters | float | OpenCV RANSAC max iterations. | 100 |