
Reference utils

This file describes methods associated with dataset analysis and loading.

Analysis

compute_span(df, col_label='identity')

Compute the time span of the dataset.

The span is defined as the time at which the latest image was taken minus the time at which the earliest image was taken. These times are computed separately for each individual.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | DataFrame | A full dataframe of the data. | required |
| col_label | str | Column name containing individual animal names (labels). | 'identity' |

Returns:

| Type | Description |
| --- | --- |
| float | The span of the dataset in seconds. |
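
As an illustration of the per-individual computation described above, the following is a minimal sketch, not the library's implementation. It assumes that the timestamps are stored in a column named `date` and that the per-individual spans are combined by taking the largest one; neither detail is stated in this reference, and `span_in_seconds` is a hypothetical name.

```python
import pandas as pd


def span_in_seconds(df: pd.DataFrame, col_label: str = "identity", col_date: str = "date") -> float:
    """Return the longest per-individual time span in seconds.

    Assumptions (not confirmed by this reference): timestamps live in
    `col_date`, and per-individual spans are combined with a maximum.
    """
    dates = pd.to_datetime(df[col_date])
    # Latest minus earliest timestamp, computed separately for each individual.
    grouped = dates.groupby(df[col_label])
    spans = grouped.max() - grouped.min()
    return spans.max().total_seconds()


df = pd.DataFrame(
    {
        "identity": ["a", "a", "b", "b"],
        "date": ["2020-01-01", "2020-01-03", "2020-06-01", "2020-06-02"],
    }
)
print(span_in_seconds(df))  # 172800.0 (two days, from individual "a")
```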

display_statistics(df, unknown_name='', col_label='identity')

Prints statistics about the dataframe.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | DataFrame | A full dataframe of the data. | required |
| unknown_name | str | Name of the unknown class. | '' |
| col_label | str | Column name containing individual animal names (labels). | 'identity' |
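
This reference does not list which statistics are printed. The sketch below only shows how `unknown_name` and `col_label` could be used to separate known individuals from the unknown class; the function name `summarize` and the three counts it returns are hypothetical choices made for illustration.

```python
import pandas as pd


def summarize(df: pd.DataFrame, unknown_name: str = "", col_label: str = "identity") -> dict:
    """Collect simple counts; the chosen statistics are illustrative only."""
    # Rows whose label equals `unknown_name` belong to the unknown class.
    is_unknown = df[col_label] == unknown_name
    known = df.loc[~is_unknown, col_label]
    return {
        "n_images": len(df),
        "n_unknown_images": int(is_unknown.sum()),
        "n_identities": int(known.nunique()),
    }


df = pd.DataFrame({"identity": ["a", "a", "b", "unknown"]})
for name, value in summarize(df, unknown_name="unknown").items():
    print(f"{name}: {value}")
```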

Loading

get_dataframe_path(root_dataframe, class_dataset)

Creates the path to the pickled dataframe.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| root_dataframe | str | Path where all dataframes are stored. | required |
| class_dataset | type | Type of WildlifeDataset. | required |

Returns:

| Type | Description |
| --- | --- |
| str | Path to the dataframe. |

get_dataset_folder(root_dataset, class_dataset)

Creates the path to the dataset data.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| root_dataset | str | Path where all datasets are stored. | required |
| class_dataset | type | Type of WildlifeDataset. | required |

Returns:

| Type | Description |
| --- | --- |
| str | Path to the stored data. |
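
Both path helpers take a root directory and a dataset class and return a string path. A plausible way to derive such paths is from the class name, as sketched below. The naming scheme (`<ClassName>.pkl` and `<ClassName>`) is an assumption, and `ExampleDataset`, `dataframe_path`, and `dataset_folder` are hypothetical names, not part of the library.

```python
import os


class ExampleDataset:
    """Hypothetical stand-in for a WildlifeDataset subclass."""


def dataframe_path(root_dataframe: str, class_dataset: type) -> str:
    # Assumed naming scheme: <root_dataframe>/<ClassName>.pkl
    return os.path.join(root_dataframe, class_dataset.__name__ + ".pkl")


def dataset_folder(root_dataset: str, class_dataset: type) -> str:
    # Assumed naming scheme: <root_dataset>/<ClassName>
    return os.path.join(root_dataset, class_dataset.__name__)


print(dataframe_path("dataframes", ExampleDataset))
print(dataset_folder("data", ExampleDataset))
```

Passing the class itself rather than a string keeps the path tied to the dataset type, so one root directory can hold many datasets without name clashes.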

load_dataset(class_dataset, root_dataset, root_dataframe, overwrite=False, **kwargs)

Loads dataset from a pickled dataframe or creates it.

If the dataframe is already saved in a pkl file, it is loaded from that file. Otherwise, the dataframe is created and saved to a pkl file. Setting overwrite to True re-creates the dataframe and overwrites the saved file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| class_dataset | type | Type of WildlifeDataset to load. | required |
| root_dataset | str | Path where all datasets are stored. | required |
| root_dataframe | str | Path where all dataframes are stored. | required |
| overwrite | bool | Whether the pickled dataframe should be overwritten. | False |

Returns:

| Type | Description |
| --- | --- |
| WildlifeDataset | The loaded dataset. |
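
The load-or-create behaviour described above is a common caching pattern. The following is a minimal sketch of that pattern using only the standard library, not the library's own code. The helper `load_or_create` and the `create` callable are hypothetical, and a plain dictionary stands in for the dataframe.

```python
import os
import pickle
import tempfile
from typing import Any, Callable


def load_or_create(path: str, create: Callable[[], Any], overwrite: bool = False) -> Any:
    """Return the object cached at `path`, building and caching it if needed."""
    # Reuse the cached file unless the caller asks for it to be overwritten.
    if os.path.exists(path) and not overwrite:
        with open(path, "rb") as f:
            return pickle.load(f)
    obj = create()
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return obj


calls = []


def create() -> dict:
    # Stand-in for the expensive step of building the dataframe.
    calls.append(1)
    return {"rows": 3}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache", "Example.pkl")
    first = load_or_create(path, create)                   # builds and saves
    second = load_or_create(path, create)                  # loads from disk
    third = load_or_create(path, create, overwrite=True)   # rebuilds and saves

print(len(calls))  # 2: the second call was served from the cache
```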

load_datasets(class_datasets, root_dataset, root_dataframe, **kwargs)

Loads multiple datasets as described in load_dataset.

Parameters:

Name Type Description Default
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| class_datasets | List[type] | List of types of WildlifeDataset to load. | required |
| root_dataset | str | Path where all datasets are stored. | required |
| root_dataframe | str | Path where all dataframes are stored. | required |

Returns:

| Type | Description |
| --- | --- |
| list[WildlifeDataset] | The list of loaded datasets. |