aisfx

Representation Learning for the Automatic Indexing of Sound Effects Libraries (ISMIR 2022): Deep audio embeddings pre-trained on UCS & Non-UCS-compliant datasets.


Keywords
audio, deep, embeddings, learning, machine, pytorch, representation, sound, effects, library, universal, category, system, deep-learning, embedding-models, machine-learning, music-information-retrieval, representation-learning, sound-effects-library, universal-category-system
License
MIT
Install
pip install aisfx==0.1.2
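
To confirm that the pinned release is the one installed in the active environment, a minimal check using only the Python standard library might look like the following. The helper name `installed_version` and the constant `EXPECTED` are illustrative and not part of aisfx. The sketch deliberately avoids calling any aisfx API.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Version pinned in the install command above.
EXPECTED = "0.1.2"


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("aisfx")
    if found is None:
        print("aisfx is not installed in this environment")
    elif found != EXPECTED:
        print(f"aisfx {found} is installed, expected {EXPECTED}")
    else:
        print(f"aisfx {found} is installed")
```

Reading the version from the distribution metadata avoids importing the package itself, so the check does not load the pre-trained model or its dependencies.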

Documentation

aiSFX



This work was inspired by the creation of the Universal Category System (UCS), an industry-proposed public domain initiative started by Tim Nielsen, Justin Drury, Kai Paquin, and others. First launched in the fall of 2020, UCS offers a standardized framework for sound effects library metadata, designed by and for sound designers and editors.

How To Use

Please refer to this package's documentation for installation instructions and tutorials on how to extract embeddings.

Visualizations of UCS Classes

The visualizations show the coarse-level "Category" UCS classes in Pro Sound Effects (PSE), Soundly (SDLY), and UCS Mixed (UMIX).

Cite This Work

If you use this work, please cite the paper below [1], which was accepted at the 23rd International Society for Music Information Retrieval Conference (ISMIR) in Bengaluru, India (December 4-8, 2022).

[1] Representation Learning for the Automatic Indexing of Sound Effects Libraries

  @inproceedings{ismir_aisfx,
    title={Representation Learning for the Automatic Indexing of Sound Effects Libraries},
    author={Ma, Alison Bernice and Lerch, Alexander},
    booktitle={Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR)},
    year={2022},
    pages={866--875}
  }

Acknowledgements

We would like to thank those who provided the data required to conduct this research, as well as those who took the time to share their insights and software licenses for sound search, query, and retrieval tools.

Universal Category System (UCS) • Alex Lane • All You Can Eat Audio • Articulated Sounds • Audio Shade • aXLSound • Big Sound Bank • BaseHead • Bonson • BOOM Library • Frick & Traa • Hzandbits • InspectorJ • Kai Paquin • KEDR Audio • Krotos Audio • Nikola Simikic • Penguin Grenade • Pro Sound Effects • Rick Allen Creative • Sononym • Sound Ideas • Soundly • Soundminer • Storyblocks • Tim Nielsen • Thomas Rex Beverly • ZapSplat

License: Pre-trained Model & Paper

This pre-trained model and paper [1] are made available under a Creative Commons Attribution 4.0 International License (CC BY 4.0).