WildFusion - calibrated score fusion
The similarity.wildfusion module provides tools to combine any set of similarity scores using score calibration. For example, cosine similarity between deep features produces scores in the [-1, 1] interval, while scores obtained using local feature matching range from 0 to infinity. To combine them, calibration converts each raw similarity score into a probability that two images represent the same identity.
This functionality is implemented by the WildFusion class, which uses multiple SimilarityPipeline objects as building blocks. Each SimilarityPipeline implements a pipeline of matching and calculating calibrated similarity scores. In addition, the WildFusion class can significantly speed up the calculation of matching scores.
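To make the idea of calibration concrete, here is a minimal, self-contained sketch of isotonic calibration using the pool-adjacent-violators algorithm. It is not the library's IsotonicCalibration implementation; the helper names fit_isotonic and calibrate and the toy scores are assumptions made for illustration.

```python
from bisect import bisect_right


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function mapping raw scores to P(same identity).

    Uses the pool-adjacent-violators algorithm on (score, label) pairs.
    Returns a list of (threshold, probability) steps sorted by threshold.
    """
    pairs = sorted(zip(scores, labels))
    # Each block stores [sum of labels, count, smallest score in the block].
    blocks = []
    for score, label in pairs:
        blocks.append([float(label), 1, score])
        # Merge neighbouring blocks while their means violate monotonicity.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, _ = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return [(start, total / count) for total, count, start in blocks]


def calibrate(steps, score):
    """Look up the calibrated probability for a raw score."""
    thresholds = [t for t, _ in steps]
    index = max(bisect_right(thresholds, score) - 1, 0)
    return steps[index][1]


# Toy data: raw local-matching scores (unbounded) and same-identity labels.
raw_scores = [3.0, 10.0, 25.0, 40.0, 55.0, 80.0]
same_identity = [0, 0, 1, 0, 1, 1]

steps = fit_isotonic(raw_scores, same_identity)
probabilities = [calibrate(steps, s) for s in raw_scores]
```

After calibration, every score lies in [0, 1] and larger raw scores never map to smaller probabilities, which is what makes scores from different matchers comparable.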
similarity.wildfusion
SimilarityPipeline(matcher=None, extractor=None, calibration=None, transform=None)
Implements a pipeline for matching and calculating similarity scores between two image datasets.
Given two (query and database) image datasets, the pipeline consists of the following steps:
1. Apply image transforms.
2. Extract features for both datasets.
3. Compute similarity scores between query and database images.
4. Calibrate similarity scores.
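The four steps above can be sketched as a plain function. This is an illustrative outline, not the SimilarityPipeline source: run_pipeline and the toy callables are hypothetical stand-ins for a transform, an extractor, a matcher, and a calibration model.

```python
def run_pipeline(query, database, transform, extractor, matcher, calibration=None):
    """Hypothetical outline of the four pipeline steps."""
    # 1. Apply the image transform to every item in both datasets.
    query = [transform(image) for image in query]
    database = [transform(image) for image in database]

    # 2. Extract features for both datasets.
    query_features = [extractor(image) for image in query]
    database_features = [extractor(image) for image in database]

    # 3. Compute a similarity score for every (query, database) pair.
    scores = [
        [matcher(q, d) for d in database_features]
        for q in query_features
    ]

    # 4. Optionally map raw scores to calibrated probabilities.
    if calibration is not None:
        scores = [[calibration(s) for s in row] for row in scores]
    return scores


# Toy stand-ins: "images" are numbers, features are the numbers themselves.
scores = run_pipeline(
    query=[1, 2],
    database=[1, 2, 4],
    transform=lambda image: float(image),
    extractor=lambda image: image,
    matcher=lambda q, d: -abs(q - d),        # higher means more similar
    calibration=lambda s: 1.0 / (1.0 - s),   # squashes (-inf, 0] into (0, 1]
)
```

The result has one row per query image and one column per database image, matching the 2D layout described for the pipeline output.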
Parameters:
Name | Type | Description | Default |
---|---|---|---|
matcher | callable | A matcher that computes scores between two feature datasets. | None |
extractor | callable | A function to extract features from the image datasets. Not needed for some matchers. | None |
calibration | callable | A calibration model to refine similarity scores. | None |
transform | callable | Image transformation function applied before feature extraction. | None |
fit_calibration(dataset0, dataset1)
Fit the calibration model using two image datasets. Fitting uses all possible pairs of images from the two datasets. The input scores are the similarity scores calculated by the matcher, and the binary labels are derived from the ground truth: whether the two images share the same identity or not.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset0 | ImageDataset | The first dataset (e.g., part of training set). | required |
dataset1 | ImageDataset | The second dataset (e.g., part of training set). | required |
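The training data for calibration described above (one score and one binary label per image pair) can be assembled as in the following sketch. The identity labels and raw scores are toy values, and the variable names are assumptions; how the library stores labels internally is not shown here.

```python
import numpy as np

# Toy identity labels for two datasets (assumed, for illustration only).
labels0 = np.array(["a", "b", "c"])
labels1 = np.array(["a", "a", "c", "d"])

# Raw matcher scores for every (dataset0, dataset1) pair, shape (3, 4).
raw_scores = np.array([
    [42.0, 35.0,  3.0, 1.0],
    [ 2.0,  4.0,  5.0, 0.0],
    [ 1.0,  6.0, 51.0, 2.0],
])

# Binary ground truth: True where both images share an identity.
same_identity = labels0[:, None] == labels1[None, :]

# Flatten to one (score, label) training example per pair.
calibration_inputs = raw_scores.ravel()
calibration_targets = same_identity.ravel().astype(int)
```

Because all pairs are used, the number of training examples grows with the product of the two dataset sizes, and negative pairs typically far outnumber positive ones.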
__call__(dataset0, dataset1, pairs=None)
Compute similarity scores between two image datasets, with optional calibration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset0 | ImageDataset | The first dataset (e.g., query set). | required |
dataset1 | ImageDataset | The second dataset (e.g., database set). | required |
pairs | list of tuples | Specific pairs of images to compute similarity scores. If None, compute similarity scores for all pairs. | None |
Returns:
Type | Description |
---|---|
ndarray | 2D array of similarity scores between the query and database images. If |
WildFusion(calibrated_pipelines, priority_pipeline=None)
WildFusion uses the mean of the calibrated scores from multiple SimilarityPipeline objects to calculate fused scores.
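As a small illustration of mean fusion, the sketch below averages calibrated score matrices element-wise. The three matrices are made-up values standing in for the outputs of three calibrated pipelines; this shows the arithmetic only, not the class internals.

```python
import numpy as np

# Calibrated score matrices from three hypothetical pipelines,
# each of shape (n_query, n_database) with values in [0, 1].
calibrated = [
    np.array([[0.9, 0.2], [0.1, 0.6]]),
    np.array([[0.7, 0.4], [0.3, 0.8]]),
    np.array([[0.8, 0.0], [0.2, 0.7]]),
]

# Fused score: element-wise mean over the pipelines.
fused = np.mean(calibrated, axis=0)

# Rank database images per query by fused score (best match first).
ranking = np.argsort(-fused, axis=1)
```

Averaging is meaningful here only because calibration has already put every pipeline's scores on the same probability scale.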
Since many local feature matching models require deep neural network inference for each query and database pair, the computation quickly becomes infeasible even for moderately sized datasets.
WildFusion can be used with a limited computational budget by applying it only B times per query image. It uses a fast-to-compute similarity score (e.g., cosine similarity of deep features) provided by the priority_pipeline to construct a shortlist of the most promising matches for a given query. Final ranking is then based on WildFusion scores calculated for the pairs in the shortlist.
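The shortlisting idea can be sketched as follows: for each query, keep only the B database images with the highest priority score. The function shortlist_pairs and the toy scores are assumptions for illustration; the exact selection logic inside WildFusion may differ.

```python
import numpy as np


def shortlist_pairs(priority_scores, budget):
    """Return (query_idx, database_idx) pairs for the top-`budget` database
    images of each query, ranked by the fast priority score."""
    budget = min(budget, priority_scores.shape[1])
    # Indices of the `budget` highest-scoring database images per query.
    top = np.argsort(-priority_scores, axis=1)[:, :budget]
    query_idx = np.repeat(np.arange(priority_scores.shape[0]), budget)
    return np.stack([query_idx, top.ravel()], axis=1)


# Fast-to-compute scores (e.g., cosine similarity): 2 queries, 4 database images.
priority_scores = np.array([
    [0.1, 0.9, 0.3, 0.7],
    [0.8, 0.2, 0.6, 0.4],
])

pairs = shortlist_pairs(priority_scores, budget=2)
```

In this toy case the expensive matchers would run on 4 of the 8 possible pairs, so the cost scales with B per query rather than with the database size.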
Parameters:
Name | Type | Description | Default |
---|---|---|---|
calibrated_pipelines | list[SimilarityPipeline] | List of SimilarityPipeline objects. | required |
priority_pipeline | SimilarityPipeline | Fast-to-compute similarity matcher used for shortlisting. | None |
fit_calibration(dataset0, dataset1)
Fit the calibration models of all pipelines in calibrated_pipelines.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset0 | ImageDataset | The first dataset (e.g., part of training set). | required |
dataset1 | ImageDataset | The second dataset (e.g., part of training set). | required |
__call__(dataset0, dataset1, pairs=None, B=None)
Compute fused similarity scores between two image datasets using multiple calibrated matchers. The WildFusion score is calculated as the mean of the calibrated similarity scores.
Optionally, a shortlist strategy can be used to select the most promising pairs and so limit the number of pairs for which scores are computed.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset0 | ImageDataset | The first dataset (e.g., query set). | required |
dataset1 | ImageDataset | The second dataset (e.g., database set). | required |
pairs | list of tuples | Specific pairs of images to compute similarity scores. If None, compute similarity scores for all pairs. Is ignored if | None |
B | int | Number of pairs to compute similarity scores for. Required | None |
Returns:
Type | Description |
---|---|
ndarray | 2D array of similarity scores between the query and database images. If |
Examples
Example - SimilarityPipeline
We use LightGlue matching with SuperPoint descriptors and keypoints extracted from images resized to 512x512. The scores are calibrated using isotonic regression.
import torchvision.transforms as T
from wildlife_tools.features import SuperPointExtractor
from wildlife_tools.similarity import MatchLightGlue
from wildlife_tools.similarity.wildfusion import SimilarityPipeline
from wildlife_tools.similarity.calibration import IsotonicCalibration

# LightGlue matching on SuperPoint features, calibrated with isotonic regression.
pipeline = SimilarityPipeline(
    matcher = MatchLightGlue(features='superpoint'),
    extractor = SuperPointExtractor(),
    transform = T.Compose([
        T.Resize([512, 512]),
        T.ToTensor()
    ]),
    calibration = IsotonicCalibration()
)

# calibration_dataset1, calibration_dataset2, query and database are
# ImageDataset objects prepared beforehand.
pipeline.fit_calibration(calibration_dataset1, calibration_dataset2)
scores = pipeline(query, database)
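Once a 2D score array is available, a simple way to use it is a top-1 prediction per query. The sketch below uses a made-up array in place of the pipeline output and hypothetical database labels; it assumes rows correspond to query images and columns to database images.

```python
import numpy as np

# Stand-in for the `scores` array returned by the pipeline:
# 2 query images (rows) against 3 database images (columns).
scores = np.array([
    [0.10, 0.85, 0.30],
    [0.60, 0.05, 0.20],
])
# Hypothetical identity labels of the database images.
database_labels = np.array(["turtle_07", "turtle_12", "turtle_31"])

# Top-1 prediction: label of the highest-scoring database image per query.
best_match = scores.argmax(axis=1)
predictions = database_labels[best_match]
```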
Example - WildFusion
import timm
import torchvision.transforms as T
from wildlife_tools.features import (
    AlikedExtractor,
    DeepFeatures,
    DiskExtractor,
    SiftExtractor,
    SuperPointExtractor,
)
from wildlife_tools.similarity import CosineSimilarity, MatchLOFTR, MatchLightGlue
from wildlife_tools.similarity.wildfusion import SimilarityPipeline, WildFusion
from wildlife_tools.similarity.calibration import IsotonicCalibration

# One calibrated pipeline per matcher; WildFusion averages their calibrated scores.
matchers = [
    SimilarityPipeline(
        matcher = MatchLightGlue(features='superpoint'),
        extractor = SuperPointExtractor(),
        transform = T.Compose([
            T.Resize([512, 512]),
            T.ToTensor()
        ]),
        calibration = IsotonicCalibration()
    ),
    SimilarityPipeline(
        matcher = MatchLightGlue(features='aliked'),
        extractor = AlikedExtractor(),
        transform = T.Compose([
            T.Resize([512, 512]),
            T.ToTensor()
        ]),
        calibration = IsotonicCalibration()
    ),
    SimilarityPipeline(
        matcher = MatchLightGlue(features='disk'),
        extractor = DiskExtractor(),
        transform = T.Compose([
            T.Resize([512, 512]),
            T.ToTensor()
        ]),
        calibration = IsotonicCalibration()
    ),
    SimilarityPipeline(
        matcher = MatchLightGlue(features='sift'),
        extractor = SiftExtractor(),
        transform = T.Compose([
            T.Resize([512, 512]),
            T.ToTensor()
        ]),
        calibration = IsotonicCalibration()
    ),
    # LoFTR matches image pairs directly, so no separate extractor is needed.
    SimilarityPipeline(
        matcher = MatchLOFTR(pretrained='outdoor'),
        extractor = None,
        transform = T.Compose([
            T.Resize([512, 512]),
            T.Grayscale(),
            T.ToTensor(),
        ]),
        calibration = IsotonicCalibration()
    ),
    # Cosine similarity of MegaDescriptor deep features.
    SimilarityPipeline(
        matcher = CosineSimilarity(),
        extractor = DeepFeatures(
            model = timm.create_model(
                'hf-hub:BVRA/wildlife-mega-L-384',
                num_classes=0,
                pretrained=True
            )
        ),
        transform = T.Compose([
            T.Resize(size=(384, 384)),
            T.ToTensor(),
            T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ]),
        calibration = IsotonicCalibration()
    ),
]

wildfusion = WildFusion(calibrated_pipelines = matchers)
wildfusion.fit_calibration(calibration_dataset1, calibration_dataset2)
similarity = wildfusion(query, database)
Example - WildFusion with shortlist
Cosine similarity of MegaDescriptor features is used to construct the shortlist. Then, after calibration, WildFusion is run with a budget of 100 score calculations per query image.
# Reuses the imports and the `matchers` list from the previous example.
# The priority pipeline only ranks candidates, so it has no calibration model.
priority_matcher = SimilarityPipeline(
    matcher = CosineSimilarity(),
    extractor = DeepFeatures(
        model = timm.create_model(
            'hf-hub:BVRA/wildlife-mega-L-384',
            num_classes=0,
            pretrained=True
        )
    ),
    transform = T.Compose([
        T.Resize(size=(384, 384)),
        T.ToTensor(),
        T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ]),
)

wildfusion = WildFusion(calibrated_pipelines = matchers, priority_pipeline = priority_matcher)
wildfusion.fit_calibration(calibration_dataset1, calibration_dataset2)
similarity = wildfusion(query, database, B=100)