song_phenotyping

Getting started

  • Installation
    • Requirements
    • Install from source
    • Optional extras
    • Configuration
      • Step 1 — create your local config
      • Step 2 — set required paths
      • Step 3 — optional pipeline settings
      • Spectrogram parameters (Stage A)
      • Embedding parameters (Stage C)
      • Clustering parameters (Stage D)
      • Phenotyping parameters (Stage E)
    • Macaw server configuration
  • Usage guide
    • Quick start
    • Stage A — Spectrogram ingestion
      • Stage A1 — Fixed-length slicing (optional)
    • Stage B — Flattening
    • Stage C — UMAP embedding
    • Stage D — Clustering and labelling
      • Custom HDBSCAN grid
    • Stage E — Phenotyping
    • Visual inspection — HTML catalogs
    • Configuration

API reference

  • API Reference
    • Signal processing
      • ProcessingResult
      • add_tempo_to_h5()
      • backfill_tempo_for_bird()
      • backfill_tempo_for_run()
      • create_output_paths()
      • define_slice_on_off()
      • downsample_spec()
      • extract_compact_phase_features()
      • extract_minimal_phase_feature()
      • extract_phase_features()
      • generate_syllable_hashes()
      • get_memory_usage()
      • get_song_spec()
      • get_song_specs()
      • label_slices()
      • load_and_validate_metadata()
      • pad_waveforms_to_same_length()
      • parse_audio_filename()
      • read_metadata()
      • rms_norm()
      • save_segmented_audio_data()
      • save_spec_slices()
      • setup_logging()
      • split_long_syllables_with_mapping()
      • tempo_estimates()
      • verify_save()
  • Ingestion (Stage A)
    • Public API
    • copy_audio_and_partner_rec()
    • create_empty_segmented_data()
    • create_segmented_audio_data()
    • filepaths_from_evsonganaly()
    • filepaths_from_local_cache()
    • filepaths_from_wseg()
    • main()
    • process_and_save_audio()
    • process_pipeline()
    • process_single_file()
    • reconstruct_server_path()
    • resolve_audio_file_path()
    • save_data_specs()
    • save_specs_for_evsonganaly_birds()
    • save_specs_for_wseg_birds()
    • select_new_file_pairs()
    • select_new_files()
    • select_wseg_file_pairs_from_metadata()
    • standardize_bird_band()
  • Flattening (Stage B)
    • Public API
    • create_flattened_output_path()
    • extract_song_id()
    • find_syllable_files()
    • flatten_bird_spectrograms()
    • flatten_spectrograms()
    • load_syllable_data()
    • process_single_syllable_file()
    • save_flattened_data()
  • Embedding (Stage C)
    • Public API
    • UMAPParams
      • UMAPParams.from_dict()
      • UMAPParams.metric
      • UMAPParams.min_dist
      • UMAPParams.n_components
      • UMAPParams.n_epochs
      • UMAPParams.n_neighbors
      • UMAPParams.to_dict()
      • UMAPParams.validate_params()
    • calculate_adaptive_workers_improved()
    • calculate_safe_batch_size()
    • check_embedding_compatibility()
    • compare_umap_embeddings_plot()
    • complex_spectrogram_distance()
    • compute_and_save_umap_memory_aware()
    • compute_embedding_grid_parallel_robust()
    • compute_single_umap_worker_safe()
    • estimate_umap_memory_usage()
    • explore_embedding_parameters_robust()
    • generate_embedding_paths()
    • group_delay_distance()
    • inspect_existing_embeddings()
    • instantaneous_freq_distance()
    • load_embedding_from_file()
    • load_flattened_specs()
    • main()
    • monitor_memory_usage()
    • phase_aware_spectrogram_distance()
    • save_umap_embeddings()
    • save_umap_model()
    • subsample_by_song()
    • subsample_data()
  • Labelling (Stage D)
    • Supported quality metrics
    • Public API
    • DEFAULT_HDBSCAN_GRID
    • HDBSCANParams
      • HDBSCANParams.from_dict()
      • HDBSCANParams.min_cluster_size
      • HDBSCANParams.min_samples
      • HDBSCANParams.to_dict()
    • aggregate_raw_scores_across_birds()
    • analyze_parameter_performance_by_sample_size()
    • clear_clustering_outputs()
    • cluster_embeddings()
    • compute_composite_score()
    • compute_cross_bird_composite_scores()
    • compute_metric_ranking()
    • compute_scores()
    • create_cluster_summary_pdf()
    • dunn_index()
    • identify_optimal_parameters_by_sample_size()
    • information_criterion()
    • label_bird()
    • load_labels()
    • load_master_summary()
    • load_umap_embeddings()
    • main()
    • parse_embedding_filename()
    • plot_summary_matrix()
    • plot_umap()
    • remove_directory()
    • reorder_columns()
    • save_cross_bird_analysis()
    • save_labels()
    • save_master_summary()
    • score_cluster_penalty()
    • search_cluster_params()
    • select_best_params()
  • Phenotyping (Stage E)
    • Public API
    • PhenotypingConfig
      • PhenotypingConfig.adaptive_repeat_factor
      • PhenotypingConfig.dyad_threshold
      • PhenotypingConfig.figure_dpi
      • PhenotypingConfig.generate_plots
      • PhenotypingConfig.heatmap_annotation_size
      • PhenotypingConfig.intro_note_position_threshold
      • PhenotypingConfig.min_syllable_proportion
      • PhenotypingConfig.repeat_candidate_range
      • PhenotypingConfig.repeat_significance_threshold
      • PhenotypingConfig.use_top_n_clusterings
    • analyze_repeats()
    • analyze_transitions()
    • analyze_vocabulary_and_entropy()
    • calculate_phenotypes_for_label_type()
    • create_unified_phenotype_row()
    • detect_intro_notes()
    • generate_manual_umap_plot()
    • load_bird_syllable_data()
    • load_clustering_labels_for_syllables()
    • load_clustering_results()
    • load_tempo_stats()
    • main()
    • phenotype_bird()
    • plot_repeat_patterns()
    • plot_transition_matrices()
    • plot_vocabulary_comparison()
    • save_detailed_phenotype_data()
  • Catalog (HTML visualization)
    • Public API
    • CatalogConfig
      • CatalogConfig.dpi
      • CatalogConfig.grid_cols
      • CatalogConfig.n_per_type
      • CatalogConfig.n_songs
      • CatalogConfig.overwrite
      • CatalogConfig.song_duration
      • CatalogConfig.song_fig_height
      • CatalogConfig.song_fig_width
      • CatalogConfig.syl_fig_size
    • generate_all_catalogs()
    • generate_cluster_quality_catalog()
    • generate_sequencing_catalog()
    • generate_song_catalog()
    • generate_syllable_type_catalog()
  • Slicing (Stage A1)
    • Public API
    • main()
    • process_slicing_pipeline()
    • slice_syllable_files_from_evsonganaly()
    • slice_syllable_files_from_wseg()
  • Tools
    • Label handling
      • Label conventions
      • LabelHandler
      • LabelType
      • has_manual_labels()
    • Spectrogram parameters
      • SpectrogramParams
    • Run configuration and registry
      • RunConfig
      • RunRegistry
    • Project configuration
      • PipelineConfig
      • ProjectConfig
    • Pipeline paths
      • Output tree layout
      • run_root()
      • run_stage_path()
      • stage_path()
    • File records
      • FileRecord
      • audio_paths_txt_to_filerecords()

All modules for which code is available

  • song_phenotyping.catalog
  • song_phenotyping.embedding
  • song_phenotyping.flattening
  • song_phenotyping.ingestion
  • song_phenotyping.labelling
  • song_phenotyping.phenotyping
  • song_phenotyping.signal
  • song_phenotyping.slicing
  • song_phenotyping.tools.filerecords
  • song_phenotyping.tools.label_handler
  • song_phenotyping.tools.pipeline_paths
  • song_phenotyping.tools.project_config
  • song_phenotyping.tools.run_config
  • song_phenotyping.tools.spectrogram_configs

© Copyright 2024, Annie Taylor.
