Flattening (Stage B)

Flatten 2-D syllable spectrograms into 1-D feature vectors (Stage B).

Each syllable spectrogram saved by Stage A is an (n_freq, n_time) array. This module reshapes the full set of spectrograms for one song into a 2-D matrix of shape (n_features, n_syllables) — where n_features = n_freq × n_time — and writes the result to a paired HDF5 file under <bird>/syllable_data/flattened/.
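The reshape can be sketched with NumPy as follows. This is a minimal illustration rather than the module's implementation: the helper name flatten_stack is hypothetical, the input is assumed to be a stack of shape (n_syllables, n_freq, n_time), and row-major (C-order) flattening of each spectrogram is an assumption.

```python
import numpy as np


def flatten_stack(stack: np.ndarray) -> np.ndarray:
    """Flatten (n_syllables, n_freq, n_time) to (n_freq * n_time, n_syllables).

    Hypothetical helper; C-order flattening is an assumption.
    """
    if stack.ndim != 3:
        raise ValueError(f"expected a 3-D stack, got {stack.ndim}-D")
    n_syllables = stack.shape[0]
    # One row per syllable first, then transpose so each column is one syllable.
    return stack.reshape(n_syllables, -1).T


# Toy stack: 4 syllables, 3 frequency bins, 5 time bins.
stack = np.arange(4 * 3 * 5, dtype=float).reshape(4, 3, 5)
flat = flatten_stack(stack)
print(flat.shape)  # (15, 4)
```

Under these assumptions, column j of the result is the flattened spectrogram of syllable j, so flat[:, j].reshape(n_freq, n_time) recovers the original array.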

Public API

flatten_bird_spectrograms() — run Stage B for a single bird

song_phenotyping.flattening.create_flattened_output_path(bird_root, song_id, run_name='default')

Build the output path for a flattened HDF5 file, creating the directory.

Parameters:

bird_root (str) – Bird root directory (e.g. <save_path>/<bird>).
song_id (str) – Song identifier (from extract_song_id()).
run_name (str, optional) – Run identifier; output goes under runs/<run_name>/.

Returns:

Full path <bird_root>/runs/<run_name>/stages/02_features/flattened_<song_id>.h5.

Return type:

str
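The documented path layout can be reproduced with the standard library, as in the sketch below. The function build_flattened_path is a hypothetical stand-in, not the library's code, and the song ID "song001" is made up for the demonstration.

```python
import os
import tempfile


def build_flattened_path(bird_root: str, song_id: str, run_name: str = "default") -> str:
    """Hypothetical stand-in for create_flattened_output_path()."""
    out_dir = os.path.join(bird_root, "runs", run_name, "stages", "02_features")
    # exist_ok=True keeps repeated calls from failing once the directory exists.
    os.makedirs(out_dir, exist_ok=True)
    return os.path.join(out_dir, f"flattened_{song_id}.h5")


# Demonstrate in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    bird_root = os.path.join(tmp, "or18or24")
    out_path = build_flattened_path(bird_root, "song001")  # "song001" is made up
    dir_created = os.path.isdir(os.path.dirname(out_path))
    rel_path = os.path.relpath(out_path, bird_root)

print(rel_path)
```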

song_phenotyping.flattening.extract_song_id(filepath)

Return the song ID embedded in a syllable HDF5 filename.

Parameters: filepath (str) – Path to a file named syllables_<song_id>.h5.
Returns: The <song_id> portion of the filename.
Return type: str
Raises: ValueError – If 'syllables_' is not found in the filename stem.
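The behaviour described above could be implemented roughly as follows. The function song_id_from_path is a hypothetical re-implementation based only on this description, and the example filenames are invented.

```python
from pathlib import Path


def song_id_from_path(filepath: str) -> str:
    """Hypothetical re-implementation of the documented behaviour."""
    prefix = "syllables_"
    stem = Path(filepath).stem  # filename without directory or extension
    idx = stem.find(prefix)
    if idx == -1:
        raise ValueError(f"'{prefix}' not found in filename: {filepath}")
    # Everything after the prefix is treated as the song ID.
    return stem[idx + len(prefix):]


song_id = song_id_from_path("specs/syllables_song001.h5")
print(song_id)  # song001
```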

song_phenotyping.flattening.find_syllable_files(syllables_dir)

Return all Stage A HDF5 files in syllables_dir.

Parameters: syllables_dir (str) – Directory to search (typically <bird>/syllable_data/specs/).
Returns: Absolute paths to syllables_*.h5 files; empty list if the directory does not exist.
Return type: list of str
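A lookup with these semantics can be sketched with glob. The name list_syllable_files is hypothetical, and sorting the result is an assumption, since the ordering of the returned list is not documented.

```python
import glob
import os
import tempfile
from typing import List


def list_syllable_files(syllables_dir: str) -> List[str]:
    """Hypothetical sketch: absolute paths of syllables_*.h5, or [] if no directory."""
    if not os.path.isdir(syllables_dir):
        return []
    # Escape the directory part so glob metacharacters in the path are literal.
    base = glob.escape(os.path.abspath(syllables_dir))
    # Sorting is an assumption; it makes the order deterministic.
    return sorted(glob.glob(os.path.join(base, "syllables_*.h5")))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("syllables_a.h5", "syllables_b.h5", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in list_syllable_files(tmp)]
    missing = list_syllable_files(os.path.join(tmp, "no_such_dir"))

print(found)    # ['syllables_a.h5', 'syllables_b.h5']
print(missing)  # []
```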

song_phenotyping.flattening.flatten_bird_spectrograms(directory, bird, params=None, run_name='default')

Flatten all Stage A spectrograms for one bird (Stage B entry point).

Reads every syllables_*.h5 file from <directory>/<bird>/syllable_data/specs/, flattens each spectrogram stack from (n_syllables, n_freq, n_time) to (n_freq × n_time, n_syllables), and writes flattened_<song_id>.h5 files to <directory>/<bird>/syllable_data/flattened/.

Already-flattened files are skipped, so re-running is safe.
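One way such skipping could work is to check whether the output file already exists before processing an input. The sketch below assumes exactly that; the actual criterion used by the module is not stated here, and pending_inputs is a hypothetical helper.

```python
import os
import tempfile


def pending_inputs(input_paths, output_dir):
    """Return the inputs whose flattened output does not exist yet (hypothetical)."""
    pending = []
    for path in input_paths:
        stem = os.path.splitext(os.path.basename(path))[0]
        song_id = stem.replace("syllables_", "", 1)
        out_path = os.path.join(output_dir, f"flattened_{song_id}.h5")
        # Assumption: an existing output file means the input was already flattened.
        if os.path.exists(out_path):
            continue
        pending.append(path)
    return pending


with tempfile.TemporaryDirectory() as tmp:
    # Pretend song "a" was flattened on an earlier run.
    open(os.path.join(tmp, "flattened_a.h5"), "w").close()
    inputs = ["specs/syllables_a.h5", "specs/syllables_b.h5"]
    todo = pending_inputs(inputs, tmp)

print(todo)  # ['specs/syllables_b.h5']
```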

Parameters:

directory (str) – Project root directory containing bird subdirectories.
bird (str) – Bird identifier (e.g. 'or18or24').
params (SpectrogramParams or None) – Pipeline parameters. When provided, params.duration_feature_weight controls whether a duration block is appended to the feature vector. None uses defaults (duration feature disabled).
run_name (str, optional) – Run identifier; defaults to 'default'. See create_flattened_output_path().

Returns:

True if at least one file was processed successfully (or if there were no files to process); False if all files failed.

Return type:

bool
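The return-value rule can be written out as a small function. This is only a sketch of the rule as stated: stage_b_status is a hypothetical name, and it assumes each file yields one boolean outcome.

```python
from typing import Sequence


def stage_b_status(per_file_ok: Sequence[bool]) -> bool:
    """Hypothetical summary of per-file outcomes following the documented rule."""
    # Nothing to process counts as success.
    if len(per_file_ok) == 0:
        return True
    # Otherwise one successful file is enough.
    return any(per_file_ok)


print(stage_b_status([]))              # True
print(stage_b_status([True, False]))   # True
print(stage_b_status([False, False]))  # False
```

Under this rule, True does not guarantee that every file succeeded, so a caller that needs complete output may want to inspect the flattened files as well.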

Examples

>>> from song_phenotyping.flattening import flatten_bird_spectrograms
>>> flatten_bird_spectrograms("/Volumes/Extreme SSD/pipeline_runs", "or18or24")
True