Tools

Shared utilities used across pipeline stages.

Label handling

Unified syllable label handling for the song phenotyping pipeline.

All pipeline stages that read or write syllable labels should import LabelType and LabelHandler from here so that label normalisation is consistent end-to-end.

Label conventions

Manual labels

Single characters 'a' through 'z', with 's' as the song-start token and 'z' as the song-end token.

Auto (HDBSCAN) labels

Integers produced by HDBSCAN clustering, with -5 as the song-start token and -3 as the song-end token.

class song_phenotyping.tools.label_handler.LabelHandler(label_type)[source]

Bases: object

Normalise and tokenise syllable labels for a given LabelType.

Use this wherever labels are read from HDF5 files or passed between pipeline stages so that manual and automated labels are handled identically.

Parameters:

label_type (LabelType) – Whether labels are human-annotated (LabelType.MANUAL) or HDBSCAN-generated (LabelType.AUTO).

Examples

>>> handler = LabelHandler(LabelType.MANUAL)
>>> handler.normalize_labels([b'a', b's', b'b'])
['a', 's', 'b']
>>> handler.add_sequence_tokens(['a', 'b'])
['s', 'a', 'b', 'z']
add_sequence_tokens(labels)[source]

Wrap a label sequence with song-boundary tokens.

Parameters:

labels (list of str or int) – Syllable labels for a single song, without boundary tokens.

Returns:

[start_token, *labels, end_token].

Return type:

list of str or int
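
A minimal sketch of the wrapping behaviour described above, using the token values from the label conventions. The names TOKENS, wrap_with_tokens and strip_tokens are illustrative stand-ins, not part of the package, and the real method may differ in detail.

```python
# Boundary tokens taken from the label conventions in this section.
TOKENS = {
    "manual": ("s", "z"),
    "hdbscan": (-5, -3),
}


def wrap_with_tokens(labels, label_type):
    """Return [start_token, *labels, end_token] for one song."""
    start, end = TOKENS[label_type]
    return [start, *labels, end]


def strip_tokens(labels, label_type):
    """Drop boundary tokens, keeping only syllable labels."""
    boundary = set(TOKENS[label_type])
    return [lab for lab in labels if lab not in boundary]


manual = wrap_with_tokens(["a", "b"], "manual")
auto = wrap_with_tokens([0, 2, 1], "hdbscan")
print(manual)  # ['s', 'a', 'b', 'z']
print(auto)    # [-5, 0, 2, 1, -3]
```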

property end_token: str | int

Song-end boundary token ('z' for manual, -3 for auto).

property non_syl_tokens: List[str | int]

Tokens that mark song boundaries rather than syllable identity.

Returns:

['s', 'z', '\r'] for manual labels; [-5, -3] for auto labels.

Return type:

list of str or int

normalize_labels(raw_labels)[source]

Convert raw labels (possibly bytes) to a consistent Python type.

Parameters:

raw_labels (list) – Labels as read from HDF5 — may be bytes, numpy.bytes_, str, or int.

Returns:

String labels for LabelType.MANUAL; integer labels for LabelType.AUTO.

Return type:

list of str or int
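
One way the normalisation above could work is sketched below. The function name normalize and its manual flag are hypothetical; the sketch only assumes that byte labels are UTF-8 encoded.

```python
def normalize(raw_labels, manual=True):
    """Decode bytes and coerce to str (manual) or int (auto)."""
    out = []
    for lab in raw_labels:
        # numpy.bytes_ subclasses bytes, so this check covers both.
        if isinstance(lab, bytes):
            lab = lab.decode("utf-8")
        out.append(str(lab) if manual else int(lab))
    return out


manual_labels = normalize([b"a", "s", b"b"])
auto_labels = normalize([b"3", 7, "-1"], manual=False)
print(manual_labels)  # ['a', 's', 'b']
print(auto_labels)    # [3, 7, -1]
```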

property start_token: str | int

Song-start boundary token ('s' for manual, -5 for auto).

class song_phenotyping.tools.label_handler.LabelType(value)[source]

Bases: Enum

Enumeration of syllable label sources.

MANUAL

Human-annotated labels stored as single characters.

Type:

str

AUTO

Automated labels produced by HDBSCAN clustering.

Type:

str

AUTO = 'hdbscan'
MANUAL = 'manual'
song_phenotyping.tools.label_handler.has_manual_labels(syllable_data)[source]

Return True if syllable_data contains non-empty manual labels.

Parameters:

syllable_data (dict) – Dictionary returned by the syllable-data loader, expected to contain the key 'manual_syllables'.

Returns:

True when syllable_data['manual_syllables'] is present and non-empty; False otherwise.

Return type:

bool
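
A sketch of the check described above, under the assumption that the labels may be a list or an array-like. The name has_manual_labels_sketch is hypothetical.

```python
def has_manual_labels_sketch(syllable_data):
    """True when 'manual_syllables' is present and non-empty."""
    labels = syllable_data.get("manual_syllables")
    # len() works for lists and NumPy arrays alike, whereas bool() on a
    # multi-element NumPy array raises ValueError.
    return labels is not None and len(labels) > 0


print(has_manual_labels_sketch({"manual_syllables": ["a", "b"]}))  # True
print(has_manual_labels_sketch({"manual_syllables": []}))          # False
print(has_manual_labels_sketch({}))                                # False
```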

Spectrogram parameters

Parameter dataclasses for spectrogram computation.

class song_phenotyping.tools.spectrogram_configs.SpectrogramParams(nfft=512, hop=1, target_shape=(257, 300), min_freq=400.0, max_freq=10000.0, max_dur=0.08, fs=32000.0, padding=0.0, slice_length=None, songs_per_bird=30, overwrite_existing=False, use_warping=True, downsample=True, save_inst_freq=False, save_group_delay=False, duration_feature_weight=0.0, warp_freq_sum=True, shift_lambdas=<factory>, slope_lambdas=<factory>)[source]

Bases: object

Parameters controlling spectrogram computation and syllable extraction.

Parameters:
  • nfft (int, optional) – FFT window size in samples. Determines frequency resolution. Default is 512.

  • hop (int, optional) – Hop size between successive FFT windows in samples. Default is 1.

  • target_shape (tuple of int, optional) – (n_freq_bins, n_time_bins) to which each syllable spectrogram is resized. Defaults to (nfft // 2 + 1, 300).

  • min_freq (float, optional) – Low-frequency cutoff in Hz applied when cropping spectrograms. Default is 400.0.

  • max_freq (float, optional) – High-frequency cutoff in Hz. Default is 10000.0.

  • max_dur (float or None, optional) – Maximum syllable duration in seconds. Syllables longer than this are discarded. None disables the duration filter. Default is 0.080.

  • fs (float, optional) – Expected audio sample rate in Hz. Default is 32000.0.

  • padding (float, optional) – Padding added around each syllable boundary in seconds. Default is 0.0.

  • slice_length (float or None, optional) – Fixed window length in milliseconds for slice-based extraction (Stage A1). None disables fixed-window mode. Default is None.

  • songs_per_bird (int, optional) – Maximum number of songs to process per bird when using SpectrogramParams as the sole source of this limit. Default is 30.

  • overwrite_existing (bool, optional) – If True, recompute and overwrite previously saved spectrogram files. Default is False.

  • use_warping (bool, optional) – Apply dynamic time-warping alignment before saving. Default is True.

  • downsample (bool, optional) – Downsample audio to fs before computing spectrograms. Default is True.

  • save_inst_freq (bool, optional) – Compute and store instantaneous frequency (IF) alongside the magnitude spectrogram. IF is the temporal derivative of unwrapped phase, capturing pitch-modulation structure. Shape per syllable: (n_freq, n_time - 1) = (257, 299) with default settings. Default is False.

  • save_group_delay (bool, optional) – Compute and store group delay (GD) alongside the magnitude spectrogram. GD is the negative frequency-derivative of unwrapped phase, capturing spectral dispersion. Shape per syllable: (n_freq - 1, n_time) = (256, 300) with default settings. Default is False.

  • duration_feature_weight (float, optional) – When non-zero, the normalised syllable duration (relative to max_dur) is tiled into n_freq extra features and appended to the flattened feature vector at Stage B. Start from a value roughly comparable to the magnitude feature weight and tune experimentally. 0.0 (default) disables the feature entirely.

  • warp_freq_sum (bool, optional) – Sum over frequency bins before DTW (deprecated). Default is True.

  • shift_lambdas (list of float, optional) – DTW shift penalty grid (deprecated). Default is [100, 10, 1, 0].

  • slope_lambdas (list of float, optional) – DTW slope penalty grid (deprecated). Default is [inf, inf, 10, 1].

Raises:

ValueError – If slice_length is not positive, or songs_per_bird is not positive.
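
The array shapes quoted in the parameter list follow from nfft and the time dimension of target_shape. The arithmetic below is a worked sketch using the defaults shown in the class signature (nfft=512, 300 time bins); the variable names are illustrative only.

```python
nfft = 512    # default from the class signature
n_time = 300  # time dimension of the default target_shape

# A one-sided FFT of length nfft yields nfft // 2 + 1 frequency bins.
n_freq = nfft // 2 + 1

target_shape = (n_freq, n_time)
# Differencing along time removes one time bin.
inst_freq_shape = (n_freq, n_time - 1)
# Differencing along frequency removes one frequency bin.
group_delay_shape = (n_freq - 1, n_time)

print(target_shape)       # (257, 300)
print(inst_freq_shape)    # (257, 299)
print(group_delay_shape)  # (256, 300)
```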

Examples

Default parameters for syllable-based extraction:

>>> params = SpectrogramParams()

Smoke-test configuration (3 songs, no warping):

>>> params = SpectrogramParams(songs_per_bird=3, use_warping=False)

Fixed-window slice extraction at 50 ms:

>>> params = SpectrogramParams(slice_length=50.0)
downsample: bool = True
duration_feature_weight: float = 0.0

Weight for duration token appended to flattened feature vector. Zero (default) disables the feature entirely.

from_dict(params_dict)[source]

Update fields in-place from a dictionary.

Parameters:

params_dict (dict) – Keys matching field names on this dataclass; unrecognised keys are silently ignored. After updating, validate_params() is called.
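
The update-then-validate behaviour described above can be sketched on a toy dataclass. ToyParams is a hypothetical two-field stand-in, not the real SpectrogramParams, and its validation rule is an assumption chosen for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyParams:
    nfft: int = 512
    hop: int = 1

    def validate_params(self):
        if self.nfft <= 0 or self.hop <= 0:
            raise ValueError("nfft and hop must be positive")

    def from_dict(self, params_dict):
        known = {f.name for f in fields(self)}
        for key, value in params_dict.items():
            if key in known:  # unrecognised keys are silently skipped
                setattr(self, key, value)
        self.validate_params()


params = ToyParams()
params.from_dict({"nfft": 1024, "not_a_field": 99})
print(params)  # ToyParams(nfft=1024, hop=1)
```

Because validation runs after the update, an invalid value such as hop=0 raises ValueError, although in this sketch the field has already been overwritten by then.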

fs: float = 32000.0
hop: int = 1
classmethod load_from_hdf5(hdf5_path)[source]

Load parameters from an HDF5 file written by save_to_hdf5().

Parameters:

hdf5_path (str) – Path to the HDF5 file.

Returns:

New instance populated from the stored values.

Return type:

SpectrogramParams

max_dur: float | None = 0.08
max_freq: float = 10000.0
min_freq: float = 400.0
nfft: int = 512
overwrite_existing: bool = False
padding: float = 0.0
run_hash()[source]

Return an 8-character hex digest of the parameters that affect Stage A output.

Used to auto-generate a unique run name when RUN_NAME is not set in run_pipeline.py. Different param combinations produce different hashes, ensuring stale HDF5 files are never silently reused.

Return type:

str
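
A short digest of this kind can be derived from a canonical serialisation of the parameters. The sketch below assumes a JSON encoding with sorted keys and SHA-256; the helper short_hash is hypothetical, and the real method may select or encode fields differently.

```python
import hashlib
import json


def short_hash(params):
    """8-character hex digest of a JSON-serialisable parameter dict."""
    # sort_keys makes the digest independent of dict insertion order.
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:8]


a = short_hash({"nfft": 512, "hop": 1})
b = short_hash({"hop": 1, "nfft": 512})
c = short_hash({"nfft": 1024, "hop": 1})
print(len(a), a == b)  # 8 True
```

Changing any value (as in c) changes the serialised payload, so the digest differs except in the rare case of a collision on the truncated hash.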

save_group_delay: bool = False

Save group delay alongside magnitude spectrograms.

save_inst_freq: bool = False

Save instantaneous frequency alongside magnitude spectrograms.

save_to_hdf5(hdf5_path)[source]

Persist parameters to an HDF5 file.

Parameters:

hdf5_path (str) – Destination file path. The parameters are stored under the key spectrogram_params.

shift_lambdas: List[float]
slice_length: float | None = None
slope_lambdas: List[float]
songs_per_bird: int = 30
target_shape: tuple = (257, 300)
to_dict()[source]

Return core parameters as a plain dictionary.

Returns:

Keys: nfft, hop, target_shape, min_freq, max_freq, max_dur, fs, padding.

Return type:

dict

use_warping: bool = True
validate_params()[source]

Raise ValueError if any parameter is out of range.

Checks nfft, hop, min_freq, max_freq, max_dur, fs, and padding.

warp_freq_sum: bool = True

Run configuration and registry

Experiment metadata tracking for the song phenotyping pipeline.

RunConfig captures the full parameter set for one pipeline run (spectrogram generation through phenotyping) so that any output file can be traced back to the exact inputs and settings that produced it.

RunRegistry is a thin SQLite wrapper for storing and querying RunConfig records across runs.

Examples

Create and save a run record:

>>> cfg = RunConfig.create(spec_mode="syllable", bird_ids=["or18or24"])
>>> cfg.save("run_or18or24.json")

Register in the project database and query later:

>>> registry = RunRegistry("db.sqlite3")
>>> registry.register(cfg)
>>> df = registry.query(bird_id="or18or24")
class song_phenotyping.tools.run_config.RunConfig(run_id, created_at, spec_mode, bird_ids, spec_params, umap_params, hdbscan_params, phenotype_params, notes='')[source]

Bases: object

Complete parameter record for one pipeline run.

One RunConfig is created at the start of a run (or reconstructed from a saved JSON file). The run_id is intended to be embedded in every HDF5/pickle output file so results can always be traced back to the exact settings that produced them.

Parameters:
  • run_id (str) – UUID-4 string that uniquely identifies this run.

  • created_at (str) – ISO 8601 timestamp (UTC) of when the run was created.

  • spec_mode (str) – "syllable" for Stage A syllable-based extraction or "slice" for Stage A1 fixed-window extraction.

  • bird_ids (list of str) – Bird identifiers processed in this run.

  • spec_params (dict) – Serialised SpectrogramParams.

  • umap_params (dict) – Serialised UMAP parameter object.

  • hdbscan_params (dict) – Serialised HDBSCANParams (winning configuration after grid search).

  • phenotype_params (dict) – Serialised PhenotypingConfig.

  • notes (str, optional) – Free-text annotation. Default is "".

See also

RunConfig.create

Factory method that accepts typed config objects.

RunRegistry

SQLite store for querying runs across parameter settings.

bird_ids: List[str]
classmethod create(spec_mode, bird_ids, spec_params=None, umap_params=None, hdbscan_params=None, phenotype_params=None, notes='')[source]

Create a new RunConfig with a fresh UUID and timestamp.

Parameters:
  • spec_mode (str) – "syllable" or "slice".

  • bird_ids (list of str) – Birds to associate with this run.

  • spec_params (SpectrogramParams, optional) – Defaults to SpectrogramParams().

  • umap_params (UMAPParams, optional) – Defaults to UMAPParams().

  • hdbscan_params (HDBSCANParams, optional) – Defaults to HDBSCANParams().

  • phenotype_params (PhenotypingConfig, optional) – Defaults to PhenotypingConfig().

  • notes (str, optional) – Free-text annotation.

Returns:

New instance with a generated run_id.

Return type:

RunConfig

created_at: str
hdbscan_params: dict
property hdbscan_params_obj

Return hdbscan_params as an HDBSCANParams instance.

classmethod load(path)[source]

Deserialise a RunConfig from a JSON file.

Parameters:

path (str) – Path to a JSON file written by save().

Returns:

Reconstructed instance with range objects restored.

Return type:

RunConfig

notes: str = ''
phenotype_params: dict
property phenotype_params_obj

Return phenotype_params as a PhenotypingConfig instance.

run_id: str
save(path)[source]

Serialise this RunConfig to a JSON file.

Parameters:

path (str) – Destination file path.
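
The create, save and load cycle can be sketched with a reduced record. ToyRunConfig is a hypothetical stand-in with only a few of the documented fields; the sketch assumes a flat JSON layout and does not handle the range objects mentioned under load().

```python
import json
import tempfile
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class ToyRunConfig:
    run_id: str
    created_at: str
    spec_mode: str
    bird_ids: list
    notes: str = ""

    @classmethod
    def create(cls, spec_mode, bird_ids, notes=""):
        return cls(
            run_id=str(uuid.uuid4()),                           # fresh UUID-4
            created_at=datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
            spec_mode=spec_mode,
            bird_ids=list(bird_ids),
            notes=notes,
        )

    def save(self, path):
        Path(path).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path):
        return cls(**json.loads(Path(path).read_text()))


cfg = ToyRunConfig.create("syllable", ["or18or24"])
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "run.json"
    cfg.save(json_path)
    restored = ToyRunConfig.load(json_path)
print(restored == cfg)  # True
```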

spec_mode: str
spec_params: dict
property spec_params_obj: SpectrogramParams

Return spec_params as a SpectrogramParams instance.

to_dict()[source]

Return all fields as a JSON-serialisable dictionary.

Returns:

All RunConfig fields as plain Python types.

Return type:

dict

umap_params: dict
property umap_params_obj

Return umap_params as a UMAPParams instance.

class song_phenotyping.tools.run_config.RunRegistry(db_path)[source]

Bases: object

SQLite-backed store for RunConfig records.

Parameters:

db_path (str) – Path to the SQLite database file. Created automatically if it does not exist.

Examples

>>> registry = RunRegistry("db.sqlite3")
>>> registry.register(run_config)
>>> df = registry.query(spec_mode="syllable")
>>> df = registry.query(bird_id="or18or24")
all()[source]

Return all registered runs as a DataFrame.

Returns:

Every run in the registry, ordered by created_at descending.

Return type:

pandas.DataFrame

get(run_id)[source]

Retrieve a single RunConfig by run_id.

Parameters:

run_id (str) – UUID of the run to retrieve.

Returns:

The matching record, or None if not found.

Return type:

RunConfig or None

query(bird_id=None, **kwargs)[source]

Return a DataFrame of matching runs.

Parameters:
  • bird_id (str, optional) – Filter to runs whose bird_ids list contains this identifier (substring match on the serialised JSON).

  • **kwargs

    Exact-match filters on top-level columns: spec_mode, run_id, notes. To filter on nested parameters (e.g. n_neighbors=20), filter the returned DataFrame directly:

    df[df['umap_params'].apply(lambda p: p['n_neighbors']) == 20]
    

Returns:

One row per matching run, ordered by created_at descending. The spec_params, umap_params, hdbscan_params, and phenotype_params columns contain plain dicts.

Return type:

pandas.DataFrame

register(config)[source]

Insert a RunConfig. Silently replaces if run_id exists.

Parameters:

config (RunConfig) – Run record to store.
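
The replace-on-conflict and substring-filter behaviour described for register() and query() can be sketched with the standard sqlite3 module. The table layout, column set and function names below are assumptions for illustration, and the sketch returns plain lists of run IDs rather than a DataFrame.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs ("
    "run_id TEXT PRIMARY KEY, created_at TEXT, spec_mode TEXT, bird_ids TEXT)"
)


def register(run_id, created_at, spec_mode, bird_ids):
    # INSERT OR REPLACE overwrites any existing row with the same run_id.
    conn.execute(
        "INSERT OR REPLACE INTO runs VALUES (?, ?, ?, ?)",
        (run_id, created_at, spec_mode, json.dumps(bird_ids)),
    )


def query(bird_id=None, **filters):
    clauses, args = [], []
    if bird_id is not None:
        # Substring match on the serialised JSON list.
        clauses.append("bird_ids LIKE ?")
        args.append(f'%"{bird_id}"%')
    for column, value in filters.items():
        # Whitelist column names, since they are interpolated into the SQL.
        if column not in {"spec_mode", "run_id"}:
            raise ValueError(f"unsupported filter: {column}")
        clauses.append(f"{column} = ?")
        args.append(value)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    sql = f"SELECT run_id FROM runs{where} ORDER BY created_at DESC"
    return [row[0] for row in conn.execute(sql, args)]


register("r1", "2024-01-01T00:00:00+00:00", "syllable", ["or18or24"])
register("r2", "2024-02-01T00:00:00+00:00", "slice", ["or18or24", "bk1bk2"])
# Same run_id again: the earlier r1 row is replaced, not duplicated.
register("r1", "2024-03-01T00:00:00+00:00", "syllable", ["or18or24"])

print(query(bird_id="or18or24"))  # ['r1', 'r2']
print(query(spec_mode="slice"))   # ['r2']
```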

Project configuration

Machine-local path configuration for the song phenotyping pipeline.

Reads config.yaml from the project root (or a path you specify). Each machine keeps its own config.yaml (gitignored); config.yaml.example is committed as a template.

Examples

>>> from song_phenotyping.tools.project_config import ProjectConfig
>>> cfg = ProjectConfig.load()
>>> # Resolve a bird directory on the local cache drive
>>> bird_path = cfg.bird_dir('or18or24', experiment='evsong test')
>>> # → /Volumes/Extreme SSD/evsong test/or18or24
>>> # Get the Macaw server root (auto-detected if not set in config)
>>> macaw = cfg.macaw_root
>>> # Access pipeline run settings
>>> pipe = cfg.pipeline
>>> pipe.save_path        # where outputs are written
>>> pipe.evsong_source    # parent dir of evsonganaly bird folders
>>> pipe.birds            # None = all, or list of bird IDs to process
class song_phenotyping.tools.project_config.PipelineConfig(save_path, evsong_source, wseg_metadata, birds, songs_per_bird, songs_seed, copy_locally, wav_root, spectrogram_params, embedding_params, labelling_params, phenotyping_params, generate_catalog)[source]

Bases: object

Pipeline run settings loaded from the pipeline: block in config.yaml.

All fields have sensible defaults so that a minimal config.yaml (one that only sets paths:) still works — the pipeline will fall back to writing outputs next to the source data and processing all discovered birds.

Sub-section dicts (spectrogram_params, embedding_params, labelling_params, phenotyping_params) are kept as plain dicts so each stage can merge them into its own typed dataclass.

Parameters:
  • save_path (Path | None)

  • evsong_source (Path | None)

  • wseg_metadata (Path | None)

  • birds (List[str] | None)

  • songs_per_bird (int | None)

  • songs_seed (int | None)

  • copy_locally (bool)

  • wav_root (Path | None)

  • spectrogram_params (dict | None)

  • embedding_params (dict | None)

  • labelling_params (dict | None)

  • phenotyping_params (dict | None)

  • generate_catalog (bool)

birds: List[str] | None
copy_locally: bool
embedding_params: dict | None
classmethod empty()[source]

Return a default PipelineConfig for use when config.yaml lacks a pipeline: block.

Return type:

PipelineConfig

evsong_source: Path | None
classmethod from_dict(d)[source]
Parameters:

d (dict)

Return type:

PipelineConfig

generate_catalog: bool
labelling_params: dict | None
phenotyping_params: dict | None
save_path: Path | None
songs_per_bird: int | None
songs_seed: int | None
spectrogram_params: dict | None
wav_root: Path | None
wseg_metadata: Path | None
class song_phenotyping.tools.project_config.ProjectConfig(local_cache, macaw_root, run_registry, pipeline)[source]

Bases: object

Machine-local path settings loaded from config.yaml.

Parameters:
  • local_cache (Path)

  • macaw_root (Path | None)

  • run_registry (Path)

  • pipeline (PipelineConfig)

bird_dir(bird_id, experiment=None)[source]

Return the local cache directory for a bird.

Parameters:
  • bird_id (str) – Bird identifier, e.g. 'or18or24'.

  • experiment (str, optional) – Experiment subdirectory name, e.g. 'evsong test' or 'wseg test'. If None, returns local_cache / bird_id directly.

Return type:

Path

Examples

>>> cfg.bird_dir('or18or24', 'evsong test')  # → /Volumes/Extreme SSD/evsong test/or18or24

classmethod load(config_path=None)[source]

Load config from a YAML file.

If config_path is None, searches upward from cwd for config.yaml, then falls back to config.yaml.example.

Raises FileNotFoundError if neither file is found. Raises ImportError if PyYAML is not installed.

Parameters:

config_path (str | Path | None)

Return type:

ProjectConfig
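
The upward search described above can be sketched with pathlib. The helper find_config is hypothetical, and the sketch assumes that every parent directory is checked for config.yaml before falling back to config.yaml.example; the real lookup order may differ.

```python
import tempfile
from pathlib import Path


def find_config(start, names=("config.yaml", "config.yaml.example")):
    """Search `start` and its parents for the first existing config file."""
    start = Path(start).resolve()
    for name in names:
        # Nearest directory first, then each parent up to the filesystem root.
        for directory in (start, *start.parents):
            candidate = directory / name
            if candidate.is_file():
                return candidate
    raise FileNotFoundError(f"none of {names} found at or above {start}")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    nested = root / "a" / "b"
    nested.mkdir(parents=True)
    (root / "config.yaml.example").write_text("paths: {}\n")
    found_example = find_config(nested)  # only the template exists so far
    (root / "config.yaml").write_text("paths: {}\n")
    found_real = find_config(nested)     # the machine-local file now wins
    print(found_real == root / "config.yaml")  # True
```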

local_cache: Path
macaw_bird_dir(*parts)[source]

Join parts onto macaw_root, or return None if macaw is not mounted.

Example: cfg.macaw_bird_dir('annietaylor', 'x-foster')

Parameters:

parts (str)

Return type:

Path | None

macaw_root: Path | None
pipeline: PipelineConfig
run_registry: Path

Pipeline paths

Central path constants for the song phenotyping pipeline output tree.

All pipeline modules import from here so that directory names can be changed in one place. Use stage_path() to build absolute paths.

Output tree layout

<save_path>/<bird>/
    <run_name>/              ← e.g. "d5dfde49" (SHA256[:8] of config)
        stages/
            01_specs/        ← Stage A  spectrogram HDF5 files
            02_features/     ← Stage B  flattened feature HDF5 files
            03_embeddings/   ← Stage C  UMAP HDF5 + pkl models
            04_labels/       ← Stage D  cluster label HDF5 files
            syllable_database/  ← syllable_features.{csv,h5} + feature_params.json
            05_phenotype/    ← Stage E  detailed phenotype pkl files
        results/
            master_summary.csv
            phenotype_results.csv
            run_config.json
            catalog/         ← HTML song catalogs
            plots/           ← PDFs / images

Each unique combination of computational parameters produces a different <run_name> directory, isolating run artifacts completely. The stages/ subtree contains internal computation artifacts; results/ contains outputs intended for human inspection.
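
The layout above maps onto simple path joins. In the sketch below only SPECS_DIR is a name used in this section, with its value implied by the run_stage_path() example; the other constant names and the function sketch_run_stage_path are hypothetical.

```python
from pathlib import PurePosixPath

SPECS_DIR = "stages/01_specs"
FEATURES_DIR = "stages/02_features"      # hypothetical name
EMBEDDINGS_DIR = "stages/03_embeddings"  # hypothetical name
LABELS_DIR = "stages/04_labels"          # hypothetical name
PHENOTYPE_DIR = "stages/05_phenotype"    # hypothetical name
RESULTS_DIR = "results"                  # hypothetical name


def sketch_run_stage_path(bird_root, run_name, subdir):
    """Join <bird_root>/<run_name>/<subdir>.

    PurePosixPath keeps the printed output identical on every platform.
    """
    return PurePosixPath(bird_root) / run_name / subdir


specs = sketch_run_stage_path("/data/or18or24", "d5dfde49", SPECS_DIR)
results = sketch_run_stage_path("/data/or18or24", "d5dfde49", RESULTS_DIR)
print(specs)    # /data/or18or24/d5dfde49/stages/01_specs
print(results)  # /data/or18or24/d5dfde49/results
```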

song_phenotyping.tools.pipeline_paths.run_root(bird_root, run_name)[source]

Return <bird_root>/<run_name> as a Path.

Parameters:
  • bird_root – Root directory for a single bird, e.g. /data/pipeline_runs/or18or24.

  • run_name (str) – Human-readable or hash-derived run identifier, e.g. "baseline" or "a1b2c3d4".

Return type:

Path

Example

>>> run_root("/data/or18or24", "baseline")
PosixPath('/data/or18or24/baseline')
song_phenotyping.tools.pipeline_paths.run_stage_path(bird_root, run_name, subdir)[source]

Return <bird_root>/<run_name>/<subdir> as a Path.

Parameters:
  • bird_root – Root directory for a single bird.

  • run_name (str) – Run identifier (see run_root()).

  • subdir (str) – One of the stage *_DIR constants, e.g. SPECS_DIR.

Return type:

Path

Example

>>> run_stage_path("/data/or18or24", "baseline", SPECS_DIR)
PosixPath('/data/or18or24/baseline/stages/01_specs')
song_phenotyping.tools.pipeline_paths.stage_path(bird_root, subdir)[source]

Return bird_root / subdir as an absolute Path.

Parameters:
  • bird_root (str | Path) – Root directory for a single bird, e.g. E:/pipeline_runs/or18or24.

  • subdir (str) – One of the *_DIR constants defined in this module, or any relative path string.

Return type:

Path

Example

>>> from song_phenotyping.tools.pipeline_paths import stage_path, SPECS_DIR
>>> stage_path("/data/pipeline_runs/or18or24", SPECS_DIR)
PosixPath('/data/pipeline_runs/or18or24/stages/01_specs')

File records

class song_phenotyping.tools.filerecords.FileRecord(metadata_path, audio_path=None, audio_is_local=False, server_audio_path=None, bird=None, day=None, time=None, parse_source=None, extras=<factory>)[source]

Bases: object

Resolved paths and parsed metadata for a single audio recording.

Populated by audio_paths_txt_to_filerecords() or constructed directly. Holds the canonical metadata path, the resolved audio path (local copy or server path), and parsed filename fields (bird, day, time).

Parameters:
  • metadata_path (Path)

  • audio_path (Path | None)

  • audio_is_local (bool)

  • server_audio_path (Path | None)

  • bird (str | None)

  • day (str | None)

  • time (str | None)

  • parse_source (str | None)

  • extras (Dict[str, Any])

audio_is_local: bool = False
audio_path: Path | None = None
basename_key()[source]

A stable 'bird_day_time' key if available; otherwise falls back to the metadata basename.

Return type:

str

bird: str | None = None
day: str | None = None
extras: Dict[str, Any]
classmethod from_paths(metadata_path, audio_path=None, **kwargs)[source]

Convenience constructor accepting strings.

Parameters:
  • metadata_path (str)

  • audio_path (str | None)

metadata_path: Path
parse_source: str | None = None
server_audio_path: Path | None = None
time: str | None = None
to_dict()[source]

Convert to plain dict with string paths (safe for JSON/logs).

Return type:

Dict[str, Any]

song_phenotyping.tools.filerecords.audio_paths_txt_to_filerecords(audio_paths_txt, metadata_ext='.wav.not.mat', bird_subset=None)[source]

Read audio_paths.txt lines of the form bird_id|local_path|server_path and return Dict[bird_id, List[FileRecord]].

Parameters:
  • audio_paths_txt (str) – Path to the audio_paths.txt file.

  • metadata_ext (str) – Extension used to derive the metadata filename from an audio path.

  • bird_subset (list of str, optional) – If given, only records for these bird IDs are returned.

Return type:

Dict[str, List[FileRecord]]
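
Parsing the bird_id|local_path|server_path format can be sketched as follows. The function parse_audio_paths is hypothetical and returns plain dicts rather than FileRecord objects. It also assumes the metadata file sits next to the local audio file and is named by replacing the audio suffix with metadata_ext, which may not match the real derivation.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def parse_audio_paths(lines, metadata_ext=".wav.not.mat", bird_subset=None):
    """Group 'bird_id|local_path|server_path' lines by bird ID."""
    records = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        bird_id, local_path, server_path = line.split("|")
        if bird_subset is not None and bird_id not in bird_subset:
            continue
        audio = PurePosixPath(local_path)
        records[bird_id].append({
            "audio_path": audio,
            "server_audio_path": PurePosixPath(server_path),
            # Assumption: metadata name = audio stem + metadata_ext.
            "metadata_path": audio.with_name(audio.stem + metadata_ext),
        })
    return dict(records)


lines = [
    "or18or24|/cache/or18or24/a_010120_0900.wav|/server/or18or24/a_010120_0900.wav",
    "",
    "bk1bk2|/cache/bk1bk2/b_020120_1000.wav|/server/bk1bk2/b_020120_1000.wav",
]
records = parse_audio_paths(lines, bird_subset=["or18or24"])
print(sorted(records))                               # ['or18or24']
print(records["or18or24"][0]["metadata_path"].name)  # a_010120_0900.wav.not.mat
```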