Tools

Shared utilities used across pipeline stages.

Label handling

Unified syllable label handling for the song phenotyping pipeline.

All pipeline stages that read or write syllable labels should import LabelType and LabelHandler from here so that label normalisation is consistent end-to-end.

Label conventions

Manual labels: Single characters 'a'–'z', with 's' as the song-start token and 'z' as the song-end token.
Auto (HDBSCAN) labels: Integers produced by HDBSCAN clustering, with -5 as the song-start token and -3 as the song-end token.
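
The two conventions can be summarised as a small lookup table. The sketch below is illustrative only: BOUNDARY_TOKENS and wrap_song are hypothetical names, not part of label_handler, and the token values are simply the ones listed above.

```python
# Hypothetical lookup of boundary tokens, keyed by label source.
BOUNDARY_TOKENS = {
    "manual": {"start": "s", "end": "z"},
    "hdbscan": {"start": -5, "end": -3},
}


def wrap_song(labels, source):
    """Return labels wrapped in the start/end tokens for the given source."""
    tokens = BOUNDARY_TOKENS[source]
    return [tokens["start"], *labels, tokens["end"]]


print(wrap_song(["a", "b"], "manual"))   # ['s', 'a', 'b', 'z']
print(wrap_song([0, 2, 1], "hdbscan"))   # [-5, 0, 2, 1, -3]
```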

class song_phenotyping.tools.label_handler.LabelHandler(label_type)[source]

Bases: object

Normalise and tokenise syllable labels for a given LabelType.

Use this wherever labels are read from HDF5 files or passed between pipeline stages so that manual and automated labels are handled identically.

Parameters:: label_type (LabelType) – Whether labels are human-annotated (LabelType.MANUAL) or HDBSCAN-generated (LabelType.AUTO).

Examples

>>> handler = LabelHandler(LabelType.MANUAL)
>>> handler.normalize_labels([b'a', b's', b'b'])
['a', 's', 'b']
>>> handler.add_sequence_tokens(['a', 'b'])
['s', 'a', 'b', 'z']

add_sequence_tokens(labels)[source]

Wrap a label sequence with song-boundary tokens.

Parameters:: labels (list of str or int) – Syllable labels for a single song, without boundary tokens.
Returns:: [start_token, *labels, end_token].
Return type:: list of str or int

property end_token: str | int: Song-end boundary token ('z' for manual, -3 for auto).

property non_syl_tokens: List[str | int]

Tokens that mark song boundaries rather than syllable identity.

Returns:: ['s', 'z', '\r'] for manual labels; [-5, -3] for auto labels.
Return type:: list of str or int

normalize_labels(raw_labels)[source]

Convert raw labels (possibly bytes) to a consistent Python type.

Parameters:: raw_labels (list) – Labels as read from HDF5 — may be bytes, numpy.bytes_, str, or int.
Returns:: String labels for LabelType.MANUAL; integer labels for LabelType.AUTO.
Return type:: list of str or int
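
A minimal sketch of the kind of conversion described here, not the library's implementation. The function name normalize and the manual flag are hypothetical, and the sketch assumes bytes labels are UTF-8 encoded and that auto labels can be parsed by int().

```python
def normalize(raw_labels, manual=True):
    """Decode bytes, then coerce to str (manual) or int (auto)."""
    out = []
    for label in raw_labels:
        # numpy.bytes_ subclasses bytes, so this check covers both
        if isinstance(label, bytes):
            label = label.decode("utf-8")
        out.append(str(label) if manual else int(label))
    return out


print(normalize([b"a", "s", b"b"]))              # ['a', 's', 'b']
print(normalize([b"-5", 3, "7"], manual=False))  # [-5, 3, 7]
```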

property start_token: str | int: Song-start boundary token ('s' for manual, -5 for auto).

class song_phenotyping.tools.label_handler.LabelType(value)[source]

Bases: Enum

Enumeration of syllable label sources.

MANUAL

Human-annotated labels stored as single characters.

Type:: str

AUTO

Automated labels produced by HDBSCAN clustering.

Type:: str

AUTO = 'hdbscan'

MANUAL = 'manual'

song_phenotyping.tools.label_handler.has_manual_labels(syllable_data)[source]

Return True if syllable_data contains non-empty manual labels.

Parameters:: syllable_data (dict) – Dictionary returned by the syllable-data loader, expected to contain the key 'manual_syllables'.
Returns:: True when syllable_data['manual_syllables'] is present and non-empty; False otherwise.
Return type:: bool
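
The documented behaviour amounts to a presence-and-length check. The sketch below restates it under a hypothetical name (has_manual) and is not the library's code; using len() rather than truthiness keeps it safe for array-like values.

```python
def has_manual(syllable_data):
    """True only if 'manual_syllables' exists and holds at least one label."""
    labels = syllable_data.get("manual_syllables")
    return labels is not None and len(labels) > 0


print(has_manual({"manual_syllables": ["a", "b"]}))  # True
print(has_manual({"manual_syllables": []}))          # False
print(has_manual({}))                                # False
```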

Spectrogram parameters

Parameter dataclasses for spectrogram computation.

class song_phenotyping.tools.spectrogram_configs.SpectrogramParams(nfft=512, hop=1, target_shape=(257, 300), min_freq=400.0, max_freq=10000.0, max_dur=0.08, fs=32000.0, padding=0.0, slice_length=None, songs_per_bird=30, overwrite_existing=False, use_warping=True, downsample=True, save_inst_freq=False, save_group_delay=False, duration_feature_weight=0.0, warp_freq_sum=True, shift_lambdas=<factory>, slope_lambdas=<factory>)[source]

Bases: object

Parameters controlling spectrogram computation and syllable extraction.

Parameters:

nfft (int, optional) – FFT window size in samples. Determines frequency resolution. Default is 512.
hop (int, optional) – Hop size between successive FFT windows in samples. Default is 1.
target_shape (tuple of int, optional) – (n_freq_bins, n_time_bins) to which each syllable spectrogram is resized. Defaults to (nfft // 2 + 1, 300).
min_freq (float, optional) – Low-frequency cutoff in Hz applied when cropping spectrograms. Default is 400.0.
max_freq (float, optional) – High-frequency cutoff in Hz. Default is 10000.0.
max_dur (float or None, optional) – Maximum syllable duration in seconds. Syllables longer than this are discarded. None disables the duration filter. Default is 0.080.
fs (float, optional) – Expected audio sample rate in Hz. Default is 32000.0.
padding (float, optional) – Padding added around each syllable boundary in seconds. Default is 0.0.
slice_length (float or None, optional) – Fixed window length in milliseconds for slice-based extraction (Stage A1). None disables fixed-window mode. Default is None.
songs_per_bird (int, optional) – Maximum number of songs to process per bird when using SpectrogramParams as the sole source of this limit. Default is 30.
overwrite_existing (bool, optional) – If True, recompute and overwrite previously saved spectrogram files. Default is False.
use_warping (bool, optional) – Apply dynamic time-warping alignment before saving. Default is True.
downsample (bool, optional) – Downsample audio to fs before computing spectrograms. Default is True.
save_inst_freq (bool, optional) – Compute and store instantaneous frequency (IF) alongside the magnitude spectrogram. IF is the temporal derivative of unwrapped phase, capturing pitch-modulation structure. Shape per syllable: (n_freq, n_time - 1) = (257, 299) with default settings. Default is False.
save_group_delay (bool, optional) – Compute and store group delay (GD) alongside the magnitude spectrogram. GD is the negative frequency-derivative of unwrapped phase, capturing spectral dispersion. Shape per syllable: (n_freq - 1, n_time) = (256, 300) with default settings. Default is False.
duration_feature_weight (float, optional) – When non-zero, the normalised syllable duration (relative to max_dur) is tiled into n_freq extra features and appended to the flattened feature vector at Stage B. Set to a value roughly comparable to the magnitude feature weight (experiment to tune). 0.0 (default) disables the feature entirely.
warp_freq_sum (bool, optional) – Sum over frequency bins before DTW (deprecated). Default is True.
shift_lambdas (list of float, optional) – DTW shift penalty grid (deprecated). Default is [100, 10, 1, 0].
slope_lambdas (list of float, optional) – DTW slope penalty grid (deprecated). Default is [inf, inf, 10, 1].
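
The array shapes quoted for the magnitude, IF, and GD features follow from nfft by simple arithmetic. The sketch below works them out; phase_feature_shapes is a hypothetical helper, and it assumes target_shape follows the (nfft // 2 + 1, 300) rule stated above.

```python
def phase_feature_shapes(nfft, n_time=300):
    """Shapes of the per-syllable arrays for a given FFT size."""
    n_freq = nfft // 2 + 1  # one-sided FFT bin count
    return {
        "magnitude": (n_freq, n_time),
        "inst_freq": (n_freq, n_time - 1),    # derivative along time drops one column
        "group_delay": (n_freq - 1, n_time),  # derivative along frequency drops one row
    }


print(phase_feature_shapes(512))
# {'magnitude': (257, 300), 'inst_freq': (257, 299), 'group_delay': (256, 300)}
print(phase_feature_shapes(1024)["magnitude"])  # (513, 300)
```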

Raises:

ValueError – If slice_length is not positive, or songs_per_bird is not positive.

Examples

Default parameters for syllable-based extraction:

>>> params = SpectrogramParams()

Smoke-test configuration (3 songs, no warping):

>>> params = SpectrogramParams(songs_per_bird=3, use_warping=False)

Fixed-window slice extraction at 50 ms:

>>> params = SpectrogramParams(slice_length=50.0)

downsample: bool = True

duration_feature_weight: float = 0.0: Weight for duration token appended to flattened feature vector. Zero (default) disables the feature entirely.

from_dict(params_dict)[source]

Update fields in-place from a dictionary.

Parameters:: params_dict (dict) – Keys matching field names on this dataclass; unrecognised keys are silently ignored. After updating, validate_params() is called.
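
One way to get this behaviour on a dataclass is sketched below. DemoParams, update_from, and validate are hypothetical stand-ins, and the check inside validate is only an example; the real SpectrogramParams may implement from_dict differently.

```python
from dataclasses import dataclass, fields


@dataclass
class DemoParams:  # hypothetical stand-in for a parameter dataclass
    nfft: int = 512
    hop: int = 1

    def validate(self):
        if self.nfft <= 0 or self.hop <= 0:
            raise ValueError("nfft and hop must be positive")

    def update_from(self, params_dict):
        """Set known fields in place, skip unknown keys, then validate."""
        known = {f.name for f in fields(self)}
        for key, value in params_dict.items():
            if key in known:
                setattr(self, key, value)
        self.validate()


p = DemoParams()
p.update_from({"nfft": 1024, "not_a_field": 42})
print(p)  # DemoParams(nfft=1024, hop=1)
```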

fs: float = 32000.0

hop: int = 1

classmethod load_from_hdf5(hdf5_path)[source]

Load parameters from an HDF5 file written by save_to_hdf5().

Parameters:: hdf5_path (str) – Path to the HDF5 file.
Returns:: New instance populated from the stored values.
Return type:: SpectrogramParams

max_dur: float | None = 0.08

max_freq: float = 10000.0

min_freq: float = 400.0

nfft: int = 512

overwrite_existing: bool = False

padding: float = 0.0

run_hash()[source]

Return an 8-character hex digest of the parameters that affect Stage A output.

Used to auto-generate a unique run name when RUN_NAME is not set in run_pipeline.py. Different param combinations produce different hashes, ensuring stale HDF5 files are never silently reused.

Return type:: str
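
A short digest of this kind can be derived as sketched below. This is an assumption about the general approach (SHA-256 over a canonical JSON dump, truncated to 8 characters), not the actual run_hash() code; short_hash is a hypothetical name, and which fields the real method includes is not documented here.

```python
import hashlib
import json


def short_hash(params):
    """8-character hex digest of a parameter dict, independent of key order."""
    payload = json.dumps(params, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:8]


a = short_hash({"nfft": 512, "hop": 1})
b = short_hash({"hop": 1, "nfft": 512})   # same values, different key order
c = short_hash({"nfft": 1024, "hop": 1})  # one value changed
print(a == b, a == c)
```

Sorting the keys before hashing is what makes the digest depend only on the values, so two runs with identical settings map to the same run name.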

save_group_delay: bool = False: Save group delay alongside magnitude spectrograms.

save_inst_freq: bool = False: Save instantaneous frequency alongside magnitude spectrograms.

save_to_hdf5(hdf5_path)[source]

Persist parameters to an HDF5 file.

Parameters:: hdf5_path (str) – Destination file path. The parameters are stored under the key spectrogram_params.

shift_lambdas: List[float]

slice_length: float | None = None

slope_lambdas: List[float]

songs_per_bird: int = 30

target_shape: tuple = (257, 300)

to_dict()[source]

Return core parameters as a plain dictionary.

Returns:: Keys: nfft, hop, target_shape, min_freq, max_freq, max_dur, fs, padding.
Return type:: dict

use_warping: bool = True

validate_params()[source]

Raise ValueError if any parameter is out of range.

Checks nfft, hop, min_freq, max_freq, max_dur, fs, and padding.

warp_freq_sum: bool = True

Run configuration and registry

Experiment metadata tracking for the song phenotyping pipeline.

RunConfig captures the full parameter set for one pipeline run (spectrogram generation through phenotyping) so that any output file can be traced back to the exact inputs and settings that produced it.

RunRegistry is a thin SQLite wrapper for storing and querying RunConfig records across runs.

Examples

Create and save a run record:

>>> cfg = RunConfig.create(spec_mode="syllable", bird_ids=["or18or24"])
>>> cfg.save("run_or18or24.json")

Register in the project database and query later:

>>> registry = RunRegistry("db.sqlite3")
>>> registry.register(cfg)
>>> df = registry.query(bird_id="or18or24")
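
To make the "thin SQLite wrapper" idea concrete, here is a self-contained sketch using only the standard library. The table layout, the register and runs_for_bird helpers, and the second bird ID are all hypothetical; the real RunRegistry schema is not documented here, and its query() may return a different type than the plain list used in this sketch.

```python
import json
import sqlite3

# Hypothetical single-table schema, held in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (run_id TEXT PRIMARY KEY, bird_ids TEXT, config TEXT)"
)


def register(conn, run_id, bird_ids, config):
    """Insert (or overwrite) one run record, storing nested data as JSON text."""
    conn.execute(
        "INSERT OR REPLACE INTO runs VALUES (?, ?, ?)",
        (run_id, json.dumps(bird_ids), json.dumps(config)),
    )
    conn.commit()


def runs_for_bird(conn, bird_id):
    """Return the run_ids whose bird list contains bird_id."""
    rows = conn.execute(
        "SELECT run_id, bird_ids FROM runs ORDER BY run_id"
    ).fetchall()
    return [run_id for run_id, birds in rows if bird_id in json.loads(birds)]


register(conn, "run-1", ["or18or24"], {"spec_mode": "syllable"})
register(conn, "run-2", ["bk1bk2"], {"spec_mode": "slice"})
print(runs_for_bird(conn, "or18or24"))  # ['run-1']
```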

class song_phenotyping.tools.run_config.RunConfig(run_id, created_at, spec_mode, bird_ids, spec_params, umap_params, hdbscan_params, phenotype_params, notes='')[source]

Bases: object

Complete parameter record for one pipeline run.

One RunConfig is created at the start of a run (or reconstructed from a saved JSON file). The run_id is intended to be embedded in every HDF5/pickle output file so results can always be traced back to the exact settings that produced them.

Parameters:

run_id (str) – UUID-4 string that uniquely identifies this run.
created_at (str) – ISO 8601 timestamp (UTC) of when the run was created.
spec_mode (str) – "syllable" for Stage A syllable-based extraction or "slice" for Stage A1 fixed-window extraction.
bird_ids (list of str) – Bird identifiers processed in this run.
spec_params (dict) – Serialised SpectrogramParams.
umap_params (dict) – Serialised UMAP parameter object.
hdbscan_params (dict) – Serialised HDBSCANParams (winning configuration after grid search).
phenotype_params (dict) – Serialised PhenotypingConfig.
notes (str, optional) – Free-text annotation. Default is "".
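
The run_id and created_at formats described above can be produced with the standard library. The sketch below builds a plain dict with a subset of the fields and round-trips it through JSON; it illustrates the formats only and does not reproduce RunConfig.create().

```python
import json
import uuid
from datetime import datetime, timezone

record = {
    "run_id": str(uuid.uuid4()),                           # UUID-4 string
    "created_at": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
    "spec_mode": "syllable",
    "bird_ids": ["or18or24"],
    "notes": "",
}

text = json.dumps(record, indent=2)  # what would be written to disk
restored = json.loads(text)
print(restored == record)  # True
```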

Project configuration

Machine-local path configuration for the song phenotyping pipeline.

Reads config.yaml from the project root (or a path you specify). Each machine keeps its own config.yaml (gitignored); config.yaml.example is committed as a template.

Examples

>>> from song_phenotyping.tools.project_config import ProjectConfig
>>> cfg = ProjectConfig.load()

>>> # Resolve a bird directory on the local cache drive
>>> bird_path = cfg.bird_dir('or18or24', experiment='evsong test')
>>> # → /Volumes/Extreme SSD/evsong test/or18or24

>>> # Get the Macaw server root (auto-detected if not set in config)
>>> macaw = cfg.macaw_root

>>> # Access pipeline run settings
>>> pipe = cfg.pipeline
>>> pipe.save_path        # where outputs are written
>>> pipe.evsong_source    # parent dir of evsonganaly bird folders
>>> pipe.birds            # None = all, or list of bird IDs to process

class song_phenotyping.tools.project_config.PipelineConfig(save_path, evsong_source, wseg_metadata, birds, songs_per_bird, songs_seed, copy_locally, wav_root, spectrogram_params, embedding_params, labelling_params, phenotyping_params, generate_catalog)[source]

Bases: object

Pipeline run settings loaded from the pipeline: block in config.yaml.

All fields have sensible defaults so that a minimal config.yaml (one that only sets paths:) still works — the pipeline will fall back to writing outputs next to the source data and processing all discovered birds.

Sub-section dicts (spectrogram_params, embedding_params, labelling_params, phenotyping_params) are kept as plain dicts so each stage can merge them into its own typed dataclass.
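
A stage might fold such a plain dict into its own dataclass roughly as follows. DemoEmbeddingParams, its field names, and merge_overrides are hypothetical, and dropping unknown keys is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class DemoEmbeddingParams:  # hypothetical typed config for one stage
    n_neighbors: int = 15
    min_dist: float = 0.1


def merge_overrides(defaults, overrides):
    """Return a copy of defaults with recognised override keys applied."""
    overrides = overrides or {}  # the config field may be None
    known = {f.name for f in fields(defaults)}
    return replace(defaults, **{k: v for k, v in overrides.items() if k in known})


print(merge_overrides(DemoEmbeddingParams(), {"n_neighbors": 30}))
# DemoEmbeddingParams(n_neighbors=30, min_dist=0.1)
print(merge_overrides(DemoEmbeddingParams(), None))
# DemoEmbeddingParams(n_neighbors=15, min_dist=0.1)
```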

Parameters:

save_path (Path | None)
evsong_source (Path | None)
wseg_metadata (Path | None)
birds (List[str] | None)
songs_per_bird (int | None)
songs_seed (int | None)
copy_locally (bool)
wav_root (Path | None)
spectrogram_params (dict | None)
embedding_params (dict | None)
labelling_params (dict | None)
phenotyping_params (dict | None)
generate_catalog (bool)

birds: List[str] | None

copy_locally: bool

embedding_params: dict | None

classmethod empty()[source]

Return a default PipelineConfig for use when config.yaml lacks a pipeline: block.

Return type:: PipelineConfig

evsong_source: Path | None

classmethod from_dict(d)[source]

Parameters:: d (dict)
Return type:: PipelineConfig

generate_catalog: bool

labelling_params: dict | None

phenotyping_params: dict | None

save_path: Path | None

songs_per_bird: int | None

songs_seed: int | None

spectrogram_params: dict | None

wav_root: Path | None

wseg_metadata: Path | None

class song_phenotyping.tools.project_config.ProjectConfig(local_cache, macaw_root, run_registry, pipeline)[source]

Bases: object

Machine-local path settings loaded from config.yaml.

Parameters:

local_cache (Path)
macaw_root (Path | None)
run_registry (Path)
pipeline (PipelineConfig)

bird_dir(bird_id, experiment=None)[source]

Return the local cache directory for a bird.

Parameters:

bird_id (str) – Bird identifier, e.g. 'or18or24'.
experiment (str, optional) – Experiment subdirectory name, e.g. 'evsong test' or 'wseg test'. If None, returns local_cache / bird_id directly.

Return type:

Path

Examples

>>> cfg.bird_dir('or18or24', 'evsong test')
>>> # → /Volumes/Extreme SSD/evsong test/or18or24
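
The path logic described above is a two-way join. The standalone function below sketches it with pathlib; demo_bird_dir is a hypothetical name that takes local_cache as an explicit argument instead of reading it from a ProjectConfig.

```python
from pathlib import Path


def demo_bird_dir(local_cache, bird_id, experiment=None):
    """Join local_cache / [experiment /] bird_id."""
    base = Path(local_cache)
    return base / experiment / bird_id if experiment else base / bird_id


print(demo_bird_dir("/Volumes/Extreme SSD", "or18or24", "evsong test"))
print(demo_bird_dir("/Volumes/Extreme SSD", "or18or24"))
```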

classmethod load(config_path=None)[source]

Load config from a YAML file.

If config_path is None, searches upward from cwd for config.yaml, then falls back to config.yaml.example.

Raises FileNotFoundError if neither file is found. Raises ImportError if PyYAML is not installed.

Parameters:: config_path (str | Path | None)
Return type:: ProjectConfig
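
An upward search with a fallback file name can be sketched as below. find_config is a hypothetical helper, not the code behind load(); in particular, searching every ancestor for config.yaml before trying config.yaml.example is an assumption about the lookup order.

```python
import tempfile
from pathlib import Path


def find_config(start, names=("config.yaml", "config.yaml.example")):
    """Search start and its ancestors for each name in turn."""
    start = Path(start).resolve()
    for name in names:  # preferred file name first, fallback second
        for folder in (start, *start.parents):
            candidate = folder / name
            if candidate.is_file():
                return candidate
    raise FileNotFoundError(f"none of {names} found at or above {start}")


# Demo in a throwaway directory tree: <tmp>/src/pkg is the starting point.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "src" / "pkg"
    nested.mkdir(parents=True)
    (Path(tmp) / "config.yaml.example").write_text("paths: {}\n")
    fallback_name = find_config(nested).name  # only the template exists
    (Path(tmp) / "config.yaml").write_text("paths: {}\n")
    primary_name = find_config(nested).name   # the real config now wins

print(fallback_name, primary_name)
```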

local_cache: Path

macaw_bird_dir(*parts)[source]

Join parts onto macaw_root, or return None if macaw is not mounted.

Example: cfg.macaw_bird_dir('annietaylor', 'x-foster')

Parameters:: parts (str)
Return type:: Path | None

macaw_root: Path | None

pipeline: PipelineConfig

run_registry: Path

Pipeline paths

Central path constants for the song phenotyping pipeline output tree.

All pipeline modules import from here so that directory names can be changed in one place. Use stage_path() to build absolute paths.

Output tree layout

<save_path>/<bird>/
    <run_name>/              ← e.g. "d5dfde49" (SHA256[:8] of config)
        stages/
            01_specs/        ← Stage A  spectrogram HDF5 files
            02_features/     ← Stage B  flattened feature HDF5 files
            03_embeddings/   ← Stage C  UMAP HDF5 + pkl models
            04_labels/       ← Stage D  cluster label HDF5 files
            syllable_database/  ← syllable_features.{csv,h5} + feature_params.json
            05_phenotype/    ← Stage E  detailed phenotype pkl files
        results/
            master_summary.csv
            phenotype_results.csv
            run_config.json
            catalog/         ← HTML song catalogs
            plots/           ← PDFs / images

Each unique combination of computational parameters produces a different <run_name> directory, isolating run artifacts completely. The stages/ subtree contains internal computation artifacts; results/ contains outputs intended for human inspection.
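
As a rough illustration of how the layout maps onto paths, the sketch below joins a bird root, a run name, and a stage subdirectory. The constant values are inferred from the tree above, LABELS_DIR and RESULTS_DIR are assumed names, and demo_run_stage_path is a hypothetical stand-in for the module's own helpers.

```python
from pathlib import Path

# Assumed values, read off the layout above; the real constants may differ.
SPECS_DIR = "stages/01_specs"
LABELS_DIR = "stages/04_labels"   # hypothetical constant name
RESULTS_DIR = "results"           # hypothetical constant name


def demo_run_stage_path(bird_root, run_name, subdir):
    """Build <bird_root>/<run_name>/<subdir>."""
    return Path(bird_root) / run_name / subdir


for subdir in (SPECS_DIR, LABELS_DIR, RESULTS_DIR):
    print(demo_run_stage_path("/data/or18or24", "d5dfde49", subdir))
```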

song_phenotyping.tools.pipeline_paths.run_root(bird_root, run_name)[source]

Return <bird_root>/<run_name> as a Path.

Parameters:

bird_root – Root directory for a single bird, e.g. /data/pipeline_runs/or18or24.
run_name (str) – Human-readable or hash-derived run identifier, e.g. "baseline" or "a1b2c3d4".

Return type:

Path

Example

>>> run_root("/data/or18or24", "baseline")
PosixPath('/data/or18or24/baseline')

song_phenotyping.tools.pipeline_paths.run_stage_path(bird_root, run_name, subdir)[source]

Return <bird_root>/<run_name>/<subdir> as a Path.

Parameters:

bird_root – Root directory for a single bird.
run_name (str) – Run identifier (see run_root()).
subdir (str) – One of the stage *_DIR constants, e.g. SPECS_DIR.

Return type:

Path

Example

>>> run_stage_path("/data/or18or24", "baseline", SPECS_DIR)
PosixPath('/data/or18or24/baseline/stages/01_specs')

song_phenotyping.tools.pipeline_paths.stage_path(bird_root, subdir)[source]

Return bird_root / subdir as an absolute Path.

Parameters:

bird_root (str | Path) – Root directory for a single bird, e.g. E:/pipeline_runs/or18or24.
subdir (str) – One of the *_DIR constants defined in this module, or any relative path string.

Return type:

Path

Example

>>> from song_phenotyping.tools.pipeline_paths import stage_path, SPECS_DIR
>>> stage_path("/data/pipeline_runs/or18or24", SPECS_DIR)
PosixPath('/data/pipeline_runs/or18or24/stages/01_specs')

File records

class song_phenotyping.tools.filerecords.FileRecord(metadata_path, audio_path=None, audio_is_local=False, server_audio_path=None, bird=None, day=None, time=None, parse_source=None, extras=<factory>)[source]

Bases: object

Resolved paths and parsed metadata for a single audio recording.

Populated by audio_paths_txt_to_filerecords() or constructed directly. Holds the canonical metadata path, the resolved audio path (local copy or server path), and parsed filename fields (bird, day, time).

Parameters:

metadata_path (Path)
audio_path (Path | None)
audio_is_local (bool)
server_audio_path (Path | None)
bird (str | None)
day (str | None)
time (str | None)
parse_source (str | None)
extras (Dict[str, Any])

audio_is_local: bool = False

audio_path: Path | None = None

basename_key()[source]

Return a stable 'bird_day_time' key if the parsed fields are available; otherwise fall back to the metadata basename.

Return type:: str
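
The fallback rule can be sketched as a free function. demo_basename_key is a hypothetical name, the example file name is made up, and using the full file name (rather than the stem) as the fallback is an assumption.

```python
from pathlib import Path


def demo_basename_key(metadata_path, bird=None, day=None, time=None):
    """Prefer 'bird_day_time'; otherwise use the metadata file name."""
    if bird and day and time:
        return f"{bird}_{day}_{time}"
    return Path(metadata_path).name


print(demo_basename_key("/data/rec_001.wav.not.mat", "or18or24", "010120", "0930"))
# or18or24_010120_0930
print(demo_basename_key("/data/rec_001.wav.not.mat"))
# rec_001.wav.not.mat
```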

bird: str | None = None

day: str | None = None

extras: Dict[str, Any]

classmethod from_paths(metadata_path, audio_path=None, **kwargs)[source]

Convenience constructor accepting strings.

Parameters:

metadata_path (str)
audio_path (str | None)

metadata_path: Path

parse_source: str | None = None

server_audio_path: Path | None = None

time: str | None = None

to_dict()[source]

Convert to plain dict with string paths (safe for JSON/logs).

Return type:: Dict[str, Any]

song_phenotyping.tools.filerecords.audio_paths_txt_to_filerecords(audio_paths_txt, metadata_ext='.wav.not.mat', bird_subset=None)[source]

Read audio_paths.txt lines of the form bird_id|local_path|server_path and return Dict[bird_id, List[FileRecord]].

Parameters:

audio_paths_txt (str) – Path to the audio_paths.txt file.
metadata_ext (str) – Extension used to derive the metadata filename from an audio path.
bird_subset (list of str, optional) – If given, only records for these bird IDs are returned.

Return type:

Dict[str, List[FileRecord]]
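
Parsing the bird_id|local_path|server_path line format is sketched below. parse_audio_paths is a hypothetical helper that returns plain tuples instead of FileRecord objects, does no metadata-path derivation, and uses made-up example paths.

```python
def parse_audio_paths(lines, bird_subset=None):
    """Group (local_path, server_path) pairs by bird ID."""
    records = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        bird_id, local_path, server_path = line.split("|", 2)
        if bird_subset is not None and bird_id not in bird_subset:
            continue
        records.setdefault(bird_id, []).append((local_path, server_path))
    return records


sample = [
    "or18or24|/cache/or18or24/a.wav|/server/or18or24/a.wav",
    "or18or24|/cache/or18or24/b.wav|/server/or18or24/b.wav",
    "",
    "bk1bk2|/cache/bk1bk2/c.wav|/server/bk1bk2/c.wav",
]
print(sorted(parse_audio_paths(sample)))                       # ['bk1bk2', 'or18or24']
print(len(parse_audio_paths(sample, ["or18or24"])["or18or24"]))  # 2
```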