Flattening (Stage B)

Flatten 2-D syllable spectrograms into 1-D feature vectors (Stage B).

Each syllable spectrogram saved by Stage A is an (n_freq, n_time) array. This module reshapes the full set of spectrograms for one song into a 2-D matrix of shape (n_features, n_syllables) — where n_features = n_freq × n_time — and writes the result to a paired HDF5 file under <bird>/syllable_data/flattened/.
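The reshape described above can be sketched with plain NumPy. This is an illustration only, not the module's code: the toy array sizes are made up, and row-major (C-order) unrolling of each spectrogram is an assumption.

```python
import numpy as np

# Toy stack: 4 syllables, each an (n_freq=3, n_time=5) spectrogram.
n_syllables, n_freq, n_time = 4, 3, 5
specs = np.arange(n_syllables * n_freq * n_time, dtype=np.float32).reshape(
    n_syllables, n_freq, n_time
)

# Unroll each spectrogram into one row, then transpose so that
# every syllable becomes a column: (n_features, n_syllables).
flat = specs.reshape(n_syllables, n_freq * n_time).T

print(flat.shape)  # (15, 4)

# Column j holds syllable j unrolled in row-major order (assumed order).
column_matches = np.array_equal(flat[:, 1], specs[1].ravel())
```

Each column of `flat` is one syllable, so downstream code that expects one sample per row would need to transpose it back.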

Public API

song_phenotyping.flattening.create_flattened_output_path(bird_root, song_id, run_name='default')

Build the output path for a flattened HDF5 file, creating the directory.

Parameters:
  • bird_root (str) – Bird root directory (e.g. <save_path>/<bird>).

  • song_id (str) – Song identifier (from extract_song_id()).

  • run_name (str, optional) – Run identifier; output goes under runs/<run_name>/.

Returns:

Full path <bird_root>/runs/<run_name>/stages/02_features/flattened_<song_id>.h5.

Return type:

str
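A minimal sketch of the documented path layout, using only the standard library. `sketch_flattened_output_path` is a hypothetical stand-in, not the library function, and the song ID `song_001` is invented for the demo.

```python
import os
import tempfile


def sketch_flattened_output_path(bird_root, song_id, run_name="default"):
    """Hypothetical stand-in that mirrors the documented path layout."""
    out_dir = os.path.join(bird_root, "runs", run_name, "stages", "02_features")
    os.makedirs(out_dir, exist_ok=True)  # create the directory if it is missing
    return os.path.join(out_dir, f"flattened_{song_id}.h5")


with tempfile.TemporaryDirectory() as tmp:
    demo_path = sketch_flattened_output_path(tmp, "song_001")
    # The directory exists after the call; the file itself is not created.
    demo_dir_exists = os.path.isdir(os.path.dirname(demo_path))
    demo_relative = os.path.relpath(demo_path, tmp)

print(demo_relative)
```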

song_phenotyping.flattening.extract_song_id(filepath)

Return the song ID embedded in a syllable HDF5 filename.

Parameters:

filepath (str) – Path to a file named syllables_<song_id>.h5.

Returns:

The <song_id> portion of the filename.

Return type:

str

Raises:

ValueError – If 'syllables_' is not found in the filename stem.
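The documented behaviour could be reproduced roughly as follows. `sketch_extract_song_id` is a hypothetical re-implementation; taking everything after the first `syllables_` marker is an assumption about how the ID is cut out, and the file names are invented.

```python
from pathlib import Path


def sketch_extract_song_id(filepath):
    """Hypothetical re-implementation of the documented behaviour."""
    stem = Path(filepath).stem  # filename without directory or extension
    marker = "syllables_"
    if marker not in stem:
        raise ValueError(f"'{marker}' not found in filename stem: {stem!r}")
    # Keep everything after the first occurrence of the marker (assumption).
    return stem.split(marker, 1)[1]


song_id = sketch_extract_song_id("/data/or18or24/syllables_song_001.h5")
print(song_id)  # song_001

try:
    sketch_extract_song_id("/data/or18or24/notes.h5")
    raised = False
except ValueError:
    raised = True
```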

song_phenotyping.flattening.find_syllable_files(syllables_dir)

Return all Stage A HDF5 files in syllables_dir.

Parameters:

syllables_dir (str) – Directory to search (typically <bird>/syllable_data/specs/).

Returns:

Absolute paths to syllables_*.h5 files; empty list if the directory does not exist.

Return type:

list of str
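A sketch of the lookup described above, using `glob` from the standard library. `sketch_find_syllable_files` is a hypothetical stand-in; sorting the result is an assumption, since the ordering of the real function is not documented here.

```python
import glob
import os
import tempfile


def sketch_find_syllable_files(syllables_dir):
    """Hypothetical stand-in: absolute paths to syllables_*.h5, or []."""
    if not os.path.isdir(syllables_dir):
        return []  # a missing directory yields an empty list, not an error
    base = glob.escape(os.path.abspath(syllables_dir))
    pattern = os.path.join(base, "syllables_*.h5")
    return sorted(glob.glob(pattern))  # sorting is an assumption


with tempfile.TemporaryDirectory() as tmp:
    for name in ("syllables_b.h5", "syllables_a.h5", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = sketch_find_syllable_files(tmp)
    found_names = [os.path.basename(p) for p in found]
    all_absolute = all(os.path.isabs(p) for p in found)
    missing = sketch_find_syllable_files(os.path.join(tmp, "does_not_exist"))

print(found_names)  # ['syllables_a.h5', 'syllables_b.h5']
print(missing)      # []
```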

song_phenotyping.flattening.flatten_bird_spectrograms(directory, bird, params=None, run_name='default')

Flatten all Stage A spectrograms for one bird (Stage B entry point).

Reads every syllables_*.h5 file from <directory>/<bird>/syllable_data/specs/, flattens each spectrogram stack from (n_syllables, n_freq, n_time) to (n_freq × n_time, n_syllables), and writes flattened_<song_id>.h5 files to <directory>/<bird>/syllable_data/flattened/.

Already-flattened files are skipped, so re-running is safe.

Parameters:
  • directory (str) – Project root directory containing bird subdirectories.

  • bird (str) – Bird identifier (e.g. 'or18or24').

  • params (SpectrogramParams or None) – Pipeline parameters. When provided, params.duration_feature_weight controls whether a duration block is appended to the feature vector. None uses defaults (duration feature disabled).

  • run_name (str, optional) – Run identifier, as described for create_flattened_output_path().

Returns:

True if at least one file was processed successfully (or if there were no files to process); False if all files failed.

Return type:

bool

Examples

>>> from song_phenotyping.flattening import flatten_bird_spectrograms
>>> flatten_bird_spectrograms("/Volumes/Extreme SSD/pipeline_runs", "or18or24")
True
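
The return-value rule for this function can be read as a small reduction over per-file results. The helper below is a hypothetical illustration of that rule, not code from the module.

```python
def sketch_overall_status(results):
    """results: per-file booleans, True meaning processed or skipped."""
    if not results:
        return True  # nothing to process counts as success
    return any(results)  # False only when every file failed


print(sketch_overall_status([]))              # True
print(sketch_overall_status([False, True]))   # True
print(sketch_overall_status([False, False]))  # False
```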

song_phenotyping.flattening.flatten_spectrograms(specs, inst_freq=None, group_delay=None, durations=None, duration_feature_weight=0.0)

Reshape a stack of 2-D spectrograms into a feature matrix.

Optionally concatenates instantaneous-frequency and group-delay channels, and a duration token, before transposing to column-per-syllable form.

Parameters:
  • specs (numpy.ndarray, shape (n_syllables, n_freq, n_time)) – Stack of spectrogram arrays.

  • inst_freq (numpy.ndarray or None, shape (n_syllables, n_freq, n_time-1)) – Instantaneous-frequency channel; appended when provided.

  • group_delay (numpy.ndarray or None, shape (n_syllables, n_freq-1, n_time)) – Group-delay channel; appended when provided.

  • durations (numpy.ndarray or None, shape (n_syllables,)) – Normalised syllable durations in [0, 1]. Only used when duration_feature_weight is non-zero.

  • duration_feature_weight (float) – Scale factor applied to the duration block before concatenation. Zero (default) disables the duration feature entirely.

Returns:

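Column-per-syllable feature matrix suitable for UMAP input. n_features equals n_freq × n_time, plus n_freq × (n_time-1) when inst_freq is provided, (n_freq-1) × n_time when group_delay is provided, and the size of the duration block when duration_feature_weight is non-zero.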

Return type:

numpy.ndarray, shape (n_features, n_syllables), dtype float32

Raises:

ValueError – If specs is empty or not 3-D.
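How the optional channels add to n_features can be sketched as follows. `sketch_flatten` is a hypothetical stand-in written from the parameter descriptions above. In particular, modelling the duration block as a single weighted row is an assumption; the real block may be laid out differently.

```python
import numpy as np


def sketch_flatten(specs, inst_freq=None, group_delay=None,
                   durations=None, duration_feature_weight=0.0):
    """Hypothetical stand-in built from the documented shapes."""
    specs = np.asarray(specs)
    if specs.ndim != 3 or specs.size == 0:
        raise ValueError("specs must be a non-empty 3-D array")
    n_syllables = specs.shape[0]

    # One row per syllable for now; transposed at the end.
    blocks = [specs.reshape(n_syllables, -1)]
    for channel in (inst_freq, group_delay):
        if channel is not None:
            blocks.append(np.asarray(channel).reshape(n_syllables, -1))
    if durations is not None and duration_feature_weight != 0.0:
        # Assumption: the duration block is one weighted value per syllable.
        dur = np.asarray(durations, dtype=np.float32).reshape(n_syllables, 1)
        blocks.append(duration_feature_weight * dur)

    return np.concatenate(blocks, axis=1).T.astype(np.float32)


n_syllables, n_freq, n_time = 2, 3, 4
features = sketch_flatten(
    np.ones((n_syllables, n_freq, n_time)),
    inst_freq=np.ones((n_syllables, n_freq, n_time - 1)),
    group_delay=np.ones((n_syllables, n_freq - 1, n_time)),
    durations=np.array([0.25, 0.75]),
    duration_feature_weight=0.5,
)
# 12 (specs) + 9 (inst_freq) + 8 (group_delay) + 1 (duration) = 30 features
print(features.shape)  # (30, 2)
```

With a weight of zero the duration row is simply omitted, which matches the documented default of a disabled duration feature.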

song_phenotyping.flattening.load_syllable_data(filepath)

Load spectrograms and metadata from a Stage A HDF5 file.

Parameters:

filepath (str) – Path to syllables_<song_id>.h5 as written by Stage A.

Returns:

  • specs (numpy.ndarray, shape (n_syllables, n_freq, n_time)) – Raw spectrogram array.

  • labels (numpy.ndarray) – Syllable label for each entry.

  • position_idxs (numpy.ndarray) – Position indices within the original recording.

  • hashes (numpy.ndarray) – Unique hash per syllable for cross-stage tracking.

  • durations (numpy.ndarray or None) – Syllable durations in seconds; None if the node is absent.

  • inst_freq (numpy.ndarray or None) – Instantaneous-frequency array, shape (n_syllables, n_freq, n_time-1); None if the node is absent.

  • group_delay (numpy.ndarray or None) – Group-delay array, shape (n_syllables, n_freq-1, n_time); None if the node is absent.

Raises:
  • ValueError – If required HDF5 nodes are missing or array lengths are inconsistent.

  • OSError – If the file cannot be opened.

Return type:

tuple
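The length check mentioned under Raises might look like the sketch below. `sketch_check_lengths` is a hypothetical helper; it assumes "consistent" means that every per-syllable array has one entry per spectrogram, which the source does not spell out.

```python
import numpy as np


def sketch_check_lengths(specs, **per_syllable):
    """Hypothetical check: each given array has len(specs) entries."""
    n = len(specs)
    for name, arr in per_syllable.items():
        if arr is not None and len(arr) != n:
            raise ValueError(f"{name} has {len(arr)} entries, expected {n}")
    return n


specs = np.zeros((3, 4, 5), dtype=np.float32)
labels = np.array(["a", "b", "a"])
position_idxs = np.array([0, 1, 2])

n_checked = sketch_check_lengths(
    specs, labels=labels, position_idxs=position_idxs, durations=None
)
print(n_checked)  # 3

try:
    sketch_check_lengths(specs, labels=labels[:2])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```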

song_phenotyping.flattening.process_single_syllable_file(filepath, bird_root, duration_feature_weight=0.0, run_name='default')

Flatten one Stage A HDF5 file and write the result.

Skips files whose output already exists.

Parameters:
  • filepath (str) – Path to syllables_<song_id>.h5.

  • bird_root (str) – Bird root directory (e.g. <save_path>/<bird>).

  • duration_feature_weight (float) – Forwarded to flatten_spectrograms(). Zero disables the duration feature block (default).

  • run_name (str, optional) – Run identifier, as described for create_flattened_output_path().

Returns:

True on success or if the output already existed; False if an error occurred.

Return type:

bool
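The skip-if-present behaviour and the boolean return can be sketched with a generic wrapper. `sketch_process`, `fake_writer`, and `failing_writer` are hypothetical names, and the file written here is a text placeholder rather than real HDF5.

```python
import os
import tempfile


def sketch_process(output_path, write_fn):
    """Hypothetical wrapper: skip existing output, report failure as False."""
    if os.path.exists(output_path):
        return True  # already flattened, nothing to do
    try:
        write_fn(output_path)
        return True
    except Exception:
        return False  # a real implementation would presumably log the error


calls = []


def fake_writer(path):
    calls.append(path)
    with open(path, "w") as fh:
        fh.write("placeholder")  # stand-in for the real HDF5 write


def failing_writer(path):
    raise OSError("simulated write failure")


with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "flattened_song_001.h5")
    first = sketch_process(out, fake_writer)   # writes the file
    second = sketch_process(out, fake_writer)  # skipped, file already exists
    failed = sketch_process(os.path.join(tmp, "flattened_song_002.h5"), failing_writer)

print(first, second, failed, len(calls))  # True True False 1
```

Because an existing output counts as success, calling the wrapper repeatedly on the same input is safe.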

song_phenotyping.flattening.save_flattened_data(output_path, flattened_specs, labels, position_idxs, hashes, durations=None)

Write flattened spectrograms and metadata to an HDF5 file.

Parameters:
  • output_path (str) – Destination file path (flattened_<song_id>.h5).

  • flattened_specs (numpy.ndarray, shape (n_features, n_syllables)) – Column-per-syllable feature matrix.

  • labels (numpy.ndarray) – Syllable labels (same length as number of columns).

  • position_idxs (numpy.ndarray) – Position indices within the original recording.

  • hashes (numpy.ndarray) – Per-syllable hash values.

  • durations (numpy.ndarray or None) – Syllable durations in seconds; written when provided.

Raises:

Exception – Re-raises any PyTables error after logging it.

Return type:

None