Ingestion (Stage A)
Syllable spectrogram extraction from raw audio and segmentation files (Stage A).
This module handles the first stage of the song phenotyping pipeline: reading
segmentation metadata (evsonganaly .wav.not.mat batch files or WhisperSeg
.wav.not.mat metadata files), locating the corresponding audio, computing
short-time Fourier transform spectrograms for each labelled syllable, and
saving the results to HDF5 files.
Two segmentation formats are supported:
- evsonganaly
Produced by the EvSongAnaly MATLAB package. Audio and metadata (.wav.not.mat) files are co-located under dated subdirectories; a batch.txt.keep file lists the valid recordings.
- wseg / WhisperSeg
Metadata (.wav.not.mat) files live in a <bird>/song/ hierarchy separate from the audio. The fname field inside each metadata file points to the original audio path.
Public API
- filepaths_from_evsonganaly(): discover file paths for evsonganaly birds
- filepaths_from_wseg(): discover file paths for WhisperSeg birds
- save_specs_for_evsonganaly_birds(): run Stage A for evsonganaly data
- save_specs_for_wseg_birds(): run Stage A for WhisperSeg data
- song_phenotyping.ingestion.copy_audio_and_partner_rec(audio_path, copied_data_dir)[source]
Copy the audio file to copied_data_dir and also find/copy its .rec partner.
An existing local copy is reused if it appears identical to the source (same size and source mtime not newer than local mtime by more than 1 s). If the local copy looks stale or is missing it is overwritten.
Returns
(local_audio_path, local_rec_path). local_rec_path is None when no matching .rec file can be found.
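The reuse rule described for copy_audio_and_partner_rec (same size, and source mtime not newer than the local mtime by more than 1 s) can be sketched with the standard library alone. This is a minimal illustration, not the module's actual code: the helper names local_copy_is_fresh and copy_if_stale are hypothetical, and handling of the .rec partner is left out.

```python
import os
import shutil

MTIME_TOLERANCE_S = 1.0  # tolerance quoted in the description above


def local_copy_is_fresh(src, dst, tolerance=MTIME_TOLERANCE_S):
    """Return True when dst exists and appears identical to src."""
    if not os.path.exists(dst):
        return False
    src_stat, dst_stat = os.stat(src), os.stat(dst)
    same_size = src_stat.st_size == dst_stat.st_size
    # The source may be at most `tolerance` seconds newer than the local copy.
    not_newer = (src_stat.st_mtime - dst_stat.st_mtime) <= tolerance
    return same_size and not_newer


def copy_if_stale(src, dst_dir):
    """Copy src into dst_dir unless a fresh local copy is already there."""
    os.makedirs(dst_dir, exist_ok=True)
    dst = os.path.join(dst_dir, os.path.basename(src))
    if not local_copy_is_fresh(src, dst):
        shutil.copy2(src, dst)  # copy2 also carries over the source mtime
    return dst
```

Using shutil.copy2 rather than shutil.copy matters in this sketch: because the modification time is preserved, a second call compares equal timestamps and skips the copy.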
- song_phenotyping.ingestion.create_empty_segmented_data()[source]
Create empty segmented data structure.
- song_phenotyping.ingestion.create_segmented_audio_data(specs, wavs, ts, onsets, offsets, labels, tempos, valid_indices, file_identifier, inst_freq_list=None, group_delay_list=None)[source]
Create organized segmented audio data structure from processing results.
- Args:
specs: List of spectrogram arrays
wavs: List of waveform arrays
ts: List of time reference arrays
onsets: Array of onset times
offsets: Array of offset times
labels: Array of syllable labels
tempos: Dict from tempo_estimates (or None); saved as-is into the HDF5 file
valid_indices: Indices of successfully processed syllables
file_identifier: Base string for generating unique hashes
- Returns:
Dictionary with organized segmented audio data
- song_phenotyping.ingestion.filepaths_from_evsonganaly(wav_directory=None, save_path=None, batch_file_naming='batch.txt.keep', bird_subset=None, copy_locally=False, preferred_subdirs=None)[source]
Discover file paths from evsonganaly batch.txt.keep files.
Walks wav_directory recursively, finds batch.txt.keep files, and extracts paired metadata (.wav.not.mat) and audio (.wav) paths for each bird. Bird IDs are detected via a letter-digit pattern applied to the directory path components.
When copy_locally is True, the function first checks whether a populated local cache already exists under save_path. If so, it uses that directly (no server access needed). Otherwise it scans wav_directory, copies audio and metadata to save_path, and returns the local paths. Existing local files are only overwritten when the source appears to have changed (different size or newer mtime).
- Parameters:
wav_directory (str) – Root directory containing dated subdirectories with audio and .wav.not.mat files (e.g. or18or24/18-08-2023/).
save_path (str, optional) – If provided, bird output subdirectories are created here.
batch_file_naming (str, optional) – Name (or substring) of the batch file. Default is 'batch.txt.keep'.
bird_subset (list of str, optional) – Restrict discovery to these bird IDs. None returns all birds.
copy_locally (bool, optional) – If True, copy audio and metadata files to save_path and use those local paths for all downstream processing. On subsequent runs with the same save_path the local cache is reused automatically. Default is False.
preferred_subdirs (list of str, optional) – If given, only scan directories whose name matches one of these values. None scans all subdirectories.
- Returns:
metadata_file_paths (dict mapping str to list of str) – {bird_id: [path_to_not_mat_file, ...]}.
audio_file_paths (dict mapping str to list of str) – {bird_id: [path_to_wav_file, ...]}.
- Return type:
See also
save_specs_for_evsonganaly_birds: Run Stage A using paths returned by this function.
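As an illustration of detecting a bird ID from path components with a letter-digit pattern, the sketch below scans each component of a path and returns the first match. The regular expression is an assumption chosen to accept IDs shaped like the or18or24 example above; the pattern the module really uses is not documented here, and bird_id_from_path is a hypothetical helper name.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: one or more (1-2 letters + 1-3 digits) groups, e.g. "or18or24".
BIRD_ID_PATTERN = re.compile(r"^(?:[a-z]{1,2}\d{1,3})+$", re.IGNORECASE)


def bird_id_from_path(path):
    """Return the first path component that looks like a bird ID, else None."""
    for part in PurePosixPath(path).parts:
        if BIRD_ID_PATTERN.match(part):
            return part
    return None


# Example: the dated subdirectory and the filename do not match the pattern.
print(bird_id_from_path("or18or24/18-08-2023/song_001.wav.not.mat"))
```

Anchoring the pattern to the whole component keeps dated folders such as 18-08-2023 and ordinary filenames from being mistaken for bird IDs.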
- song_phenotyping.ingestion.filepaths_from_local_cache(save_path, bird_subset=None)[source]
Discover metadata file paths from local cache by reading audio_paths.txt files. Returns LOCAL metadata paths for truly offline operation.
- song_phenotyping.ingestion.filepaths_from_wseg(seg_directory, save_path=None, song_or_call='song', file_ext='.wav.not.mat', bird_subset=None, copy_locally=False, wav_root=None)[source]
Discover WhisperSeg metadata file paths organised by bird ID.
Walks seg_directory recursively, collecting .wav.not.mat metadata files from subdirectories whose path contains song_or_call. Bird IDs are inferred from the directory directly above the song/ folder, two levels above each metadata file (i.e. the structure <seg_directory>/<bird>/song/*.wav.not.mat).
When copy_locally is True, the function first checks whether a populated local cache already exists under save_path. If so, it uses that directly (no server access needed). Otherwise it scans seg_directory, copies audio and metadata to save_path, and returns the local paths. Existing local files are only overwritten when the source appears to have changed (different size or newer mtime).
- Parameters:
seg_directory (str) – Root directory to scan (e.g. metadata/). Must follow the layout <seg_directory>/<bird>/song/.
save_path (str, optional) – If provided, bird subdirectories are created here. Required when copy_locally is True.
song_or_call (str, optional) – Subdirectory name to match: 'song' (default) or 'call'.
file_ext (str, optional) – Metadata file extension. Default is '.wav.not.mat'.
bird_subset (list of str, optional) – Restrict discovery to these bird IDs. None returns all birds.
copy_locally (bool, optional) – If True, copy audio and metadata files to save_path and use those local paths for all downstream processing. On subsequent runs with the same save_path the local cache is reused automatically. Default is False.
wav_root (str)
- Returns:
metadata_file_paths (dict mapping str to list of str) – {bird_id: [path_to_metadata_file, ...]}.
audio_file_paths (dict mapping str to list of str) – {bird_id: [path_to_audio_file, ...]}. Populated only when copy_locally is True; otherwise an empty list per bird.
- Return type:
See also
save_specs_for_wseg_birds: Run Stage A using paths returned by this function.
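The <seg_directory>/<bird>/song/ layout lends itself to a short pathlib walk. The sketch below is only an approximation of the discovery step: collect_wseg_metadata is a hypothetical name, it matches the song_or_call folder by exact name rather than by substring, and it does no copying or caching.

```python
from collections import defaultdict
from pathlib import Path


def collect_wseg_metadata(seg_directory, song_or_call="song",
                          file_ext=".wav.not.mat", bird_subset=None):
    """Group metadata files by bird for <seg_directory>/<bird>/<song_or_call>/."""
    found = defaultdict(list)
    for path in sorted(Path(seg_directory).rglob("*" + file_ext)):
        if path.parent.name != song_or_call:
            continue  # e.g. skip call/ when collecting songs
        bird_id = path.parent.parent.name  # the directory that holds song/
        if bird_subset is not None and bird_id not in bird_subset:
            continue
        found[bird_id].append(str(path))
    return dict(found)
```

The returned dict has the same {bird_id: [path, ...]} shape as the metadata_file_paths value described above, so it shows what a caller should expect to receive.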
- song_phenotyping.ingestion.process_and_save_audio(audio_file_path, output_path, metadata, params, split_syllables=False, verbose=False, save_manual=True)[source]
Process an audio file and save the segmented data with progress tracking. Uses the consolidated ProcessingResult.
- song_phenotyping.ingestion.process_pipeline(pipeline_name, settings)[source]
Process a single pipeline (evsonganaly or wseg).
- song_phenotyping.ingestion.process_single_file(metadata_file_path, audio_file_path, save_path, params, read_songpath_from_metadata, verbose, prefer_local=True, run_name='default', save_manual=True, bird_name=None)[source]
Process a single metadata file and save spectrograms if conditions are met.
- Args:
prefer_local: If True, prefer local audio files over server files
bird_name: If provided, use this as the bird ID for output path construction instead of parsing it from the audio filename. Needed when audio files are named after tutor birds rather than the foster bird being processed.
- song_phenotyping.ingestion.reconstruct_server_path(stored_path)[source]
Reconstruct full server path from stored relative path using current platform.
- song_phenotyping.ingestion.resolve_audio_file_path(metadata_file_path, metadata_matfile, read_songpath_from_metadata, bird_folder=None, prefer_local=True)[source]
Resolve the path to the audio file and return offset.
- Args:
bird_folder: Path to bird folder for audio path mapping (optional)
prefer_local: If True and bird_folder provided, prefer local files
- Returns:
tuple: (audio_file_path, wseg_offset) or (None, offset) if file not found
- song_phenotyping.ingestion.save_data_specs(candidate_files, save_path, params, verbose=False, read_songpath_from_metadata=True, prefer_local=True, run_name='default', save_manual=True, bird_name=None, max_workers=None)[source]
Process metadata files and save spectrograms to HDF5 files with detailed progress tracking.
- bird_name : str, optional
If provided, use this name for output path construction instead of parsing it from the audio filename. Required for cross-foster data where audio files are named after tutor birds rather than the foster bird being processed.
- max_workers : int, optional
Number of parallel worker processes. Defaults to half of available CPU cores (conservative, since each worker does I/O plus FFT computation). Pass 1 to disable parallelism.
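The max_workers default described above (half of the available CPU cores, with 1 disabling parallelism) could be resolved roughly as follows. This is a sketch under assumptions: resolve_worker_count and process_files are hypothetical names, and the use of concurrent.futures.ProcessPoolExecutor is a guess at how the worker processes are managed.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def resolve_worker_count(max_workers=None):
    """Half of the available cores by default, never fewer than one."""
    if max_workers is not None:
        return max(1, int(max_workers))
    return max(1, (os.cpu_count() or 1) // 2)


def process_files(worker, files, max_workers=None):
    """Apply worker to each file, in parallel unless one worker is requested."""
    n_workers = resolve_worker_count(max_workers)
    if n_workers == 1:
        return [worker(f) for f in files]  # serial path, easier to debug
    # worker must be a picklable, module-level function for process pools
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, files))
```

Passing max_workers=1 takes the serial branch, which is useful when a traceback from inside a worker process is hard to read.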
- song_phenotyping.ingestion.save_specs_for_evsonganaly_birds(metadata_file_paths, audio_file_paths, save_path=None, songs_per_bird=5, params=None, verbose=False, songs_seed=None, run_name='default')[source]
Run Stage A for evsonganaly birds: extract and save syllable spectrograms.
For each bird in metadata_file_paths, selects up to songs_per_bird unprocessed recordings, computes syllable spectrograms, and saves them as HDF5 files under <save_path>/<bird>/syllable_data/specs/.
- Parameters:
metadata_file_paths (dict mapping str to list of str) – {bird_id: [path_to_not_mat_file, ...]}, as returned by filepaths_from_evsonganaly().
audio_file_paths (dict mapping str to list of str or None) – {bird_id: [path_to_wav_file, ...]}. If None, audio paths are resolved from the metadata files directly.
save_path (str) – Root output directory. Bird subdirectories are created automatically.
songs_per_bird (int or None, optional) – Maximum number of songs to process per bird. None processes all available recordings. Default is 5.
params (SpectrogramParams, optional) – Spectrogram computation parameters. Defaults to SpectrogramParams.
verbose (bool, optional) – Enable verbose per-file logging. Default is False.
songs_seed (int or None, optional) – Random seed for song subset selection. None (default) gives non-deterministic selection; set an integer for reproducible subsets.
run_name (str)
Notes
Already-processed songs are detected by counting files in the output specs/ directory; only the remaining quota is processed. Re-running is therefore safe and incremental.
See also
filepaths_from_evsonganaly: Discover input file paths.
save_specs_for_wseg_birds: Equivalent function for WhisperSeg data.
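The incremental behaviour described in the Notes (count what is already in specs/, then process only the remaining quota, optionally with a seeded random subset) can be illustrated as below. The helper names remaining_quota and pick_songs are hypothetical, and the real function may count or sample differently.

```python
import os
import random


def remaining_quota(specs_dir, songs_per_bird):
    """How many more songs to process; None means no limit."""
    if songs_per_bird is None:
        return None
    already = 0
    if os.path.isdir(specs_dir):
        # Count regular files only, ignoring any subdirectories.
        already = sum(
            os.path.isfile(os.path.join(specs_dir, name))
            for name in os.listdir(specs_dir)
        )
    return max(0, songs_per_bird - already)


def pick_songs(candidates, quota, seed=None):
    """Choose up to `quota` candidates; an integer seed makes the choice repeatable."""
    candidates = list(candidates)
    if quota is None or quota >= len(candidates):
        return candidates
    return random.Random(seed).sample(candidates, quota)
```

With songs_per_bird=5 and two files already present, remaining_quota returns 3, so a second run processes only three more recordings.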
- song_phenotyping.ingestion.save_specs_for_wseg_birds(metadata_file_paths, audio_file_paths, save_path, songs_per_bird=20, params=None, verbose=False, copy_locally=False, songs_seed=None, run_name='default')[source]
Run Stage A for WhisperSeg birds: extract and save syllable spectrograms.
For each bird in metadata_file_paths, resolves audio paths from the embedded fname field in each .wav.not.mat file, selects up to songs_per_bird unprocessed recordings, computes syllable spectrograms, and saves them as HDF5 files under <save_path>/<bird>/syllable_data/specs/.
- Parameters:
metadata_file_paths (dict mapping str to list of str) – {bird_id: [path_to_not_mat_file, ...]}, as returned by filepaths_from_wseg().
audio_file_paths (dict mapping str to list of str) – {bird_id: [path_to_wav_file, ...]}. Populated when copy_locally was True in filepaths_from_wseg(); otherwise pass an empty-list dict and audio is resolved from metadata.
save_path (str) – Root output directory. Bird subdirectories are created automatically.
songs_per_bird (int or None, optional) – Maximum number of songs to process per bird. None processes all available recordings. Default is 20.
params (SpectrogramParams, optional) – Spectrogram computation parameters. Defaults to SpectrogramParams.
verbose (bool, optional) – Enable verbose per-file logging. Default is False.
copy_locally (bool, optional) – If True, audio_file_paths contains local copies (as populated by filepaths_from_wseg() with copy_locally=True) and those paths are used directly. Default is False.
songs_seed (int or None, optional) – Random seed for song subset selection. None (default) gives non-deterministic selection; set an integer for reproducible subsets.
run_name (str)
Notes
Already-processed songs are detected by counting files in the output specs/ directory; only the remaining quota is processed.
See also
filepaths_from_wseg: Discover input file paths.
save_specs_for_evsonganaly_birds: Equivalent function for evsonganaly data.
- song_phenotyping.ingestion.select_new_file_pairs(available_metadata_files, available_audio_files, already_saved_files, needed_count, seed=None)[source]
Return up to needed_count (metadata_path, audio_path) pairs whose base names are not present in already_saved_files. Matching is done by filename stem (filename without extension). If a metadata file has no matching audio file, it is skipped and a warning is logged.
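A rough sketch of the stem-based pairing described for select_new_file_pairs follows. Several details are assumptions: the set of extensions stripped to obtain a stem (including .h5 for saved files), the shuffle-then-truncate selection, and the names stem and select_pairs are illustrative rather than taken from the module.

```python
import logging
import os
import random

logger = logging.getLogger(__name__)

# Assumed extensions; the compound one is checked first so "a.wav.not.mat" -> "a".
_KNOWN_EXTS = (".wav.not.mat", ".wav", ".h5")


def stem(path):
    """Filename without its (possibly compound) extension."""
    name = os.path.basename(path)
    for ext in _KNOWN_EXTS:
        if name.endswith(ext):
            return name[: -len(ext)]
    return os.path.splitext(name)[0]


def select_pairs(metadata_files, audio_files, already_saved, needed_count, seed=None):
    """Return up to needed_count (metadata, audio) pairs not yet saved."""
    audio_by_stem = {stem(p): p for p in audio_files}
    saved = {stem(p) for p in already_saved}
    pairs = []
    for meta in metadata_files:
        key = stem(meta)
        if key in saved:
            continue  # already processed on an earlier run
        audio = audio_by_stem.get(key)
        if audio is None:
            logger.warning("No audio file matches %s; skipping", meta)
            continue
        pairs.append((meta, audio))
    # seed=None draws from system entropy, so the order differs between runs.
    random.Random(seed).shuffle(pairs)
    return pairs[:needed_count]
```

Building the audio lookup as a dict keyed by stem keeps the matching linear in the number of files instead of comparing every metadata file against every audio file.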
- song_phenotyping.ingestion.select_new_files(available_metadata_files, already_saved_files, needed_count)[source]
Select files that haven’t been processed yet.