The list of environmental audio datasets has moved to DCASE Datalist
Introduction
This page collects datasets, tools, and services for exploring and analyzing sound — from isolated audio clips and environmental recordings to annotation and augmentation tools. Browse the sections below for whatever fits your audio research or development needs.
Looking for environmental audio datasets specifically? See DCASE Datalist.
Curated Dataset Lists
Collections of datasets across environmental audio, bioacoustics, speech, computer vision, and video, usually maintained by domain experts and communities.
Environmental Audio
Bioacoustics
- Datasets for bioacoustics
- Bioacoustics Datasets list maintained by Justin Salamon
- Bioacoustics Datasets entry in Wikidata
Speech
- Voice datasets list maintained by Jim Schwoebel
- Speech Datasets, ISCA Special Interest Group on Robust Speech Recognition
Computer Vision
Video
- Awesome-Video-Datasets list maintained by Yunhua Zhang
Online Services
Online platforms for isolated sounds, geotagged recordings, and environmental audio — ranging from open community-driven databases to institutionally curated archives, each with its own licensing and access conditions.
Isolated Sounds
- Freesound, isolated sounds, tagged, Creative Commons licensing
- BBC Sound Effects, isolated sounds, textual descriptions, free for research purposes
- Findsounds, isolated sounds, tagged, mixed licensing
- British Library Sound Archive, isolated sounds and live recordings, available only to UK universities, restricted licensing
Geotagged Recordings
Environmental Sounds
Source-Specific Libraries
Free sound effect libraries by commercial providers:
Tools
Software tools for audio annotation, management, and augmentation, covering tasks such as labeling sound events, managing ecological recordings, and synthesizing soundscapes.
Annotation
- Label Studio, open source data labeling platform
- Audacity, audio editor with basic annotation capabilities. Use label tracks for annotations; see more info here.
- Audio Labeler App in MATLAB, audio annotation tool introduced in MATLAB R2018b.
- Audio Annotator, JavaScript web interface for annotating audio data.
- ELAN, a linguistic annotation tool for creating textual annotations for audio and video files
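Label tracks exported from Audacity are plain text files with one label per line: start time, end time, and label text, separated by tabs, with times in seconds. The sketch below shows one way to read such a file into sound event annotations. The names `Label` and `parse_audacity_labels` and the example label text are illustrative assumptions, not part of Audacity; lines beginning with a backslash (written when a label carries a frequency range) are simply skipped here.

```python
from typing import List, NamedTuple


class Label(NamedTuple):
    """One annotated region; the name is hypothetical, chosen for this sketch."""
    start: float  # onset in seconds
    end: float    # offset in seconds (equal to start for point labels)
    text: str     # label text typed in the label track


def parse_audacity_labels(content: str) -> List[Label]:
    """Parse the text of an exported Audacity label track."""
    labels = []
    for line in content.splitlines():
        # Skip blank lines and backslash-prefixed frequency-range lines.
        if not line.strip() or line.startswith("\\"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # not a label line; ignore rather than fail
        text = parts[2] if len(parts) > 2 else ""
        labels.append(Label(float(parts[0]), float(parts[1]), text))
    return labels


# Made-up example content in the tab-separated export layout.
example = "0.500000\t1.250000\tdog_bark\n2.000000\t2.000000\tdoor_slam\n"
events = parse_audacity_labels(example)
for event in events:
    print(f"{event.text}: {event.start:.2f}s to {event.end:.2f}s")
```

To read a real export, pass the file contents instead of the example string, for instance `parse_audacity_labels(open("labels.txt", encoding="utf-8").read())`, where `labels.txt` is a placeholder file name.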
Audio Management
- Panako, acoustic fingerprinting system that can also be used to synchronize audio streams
- Pumilio, a Web-Based Management System for Ecological Recordings
Audio Augmentation
- Scaper, soundscape synthesis and augmentation tool
- muda, annotation-aware musical data augmentation, partly applicable to environmental audio (pitch shifting, time stretching). Documentation
- librosa, see time stretching and pitch shifting effects.
- TSM toolbox, MATLAB implementations of various classical time-scale modification (TSM) algorithms.
Prototypes
- Soundscape, a tool for soundscape annotation
- I-SED, an interactive sound event detector, see [Kim2017]
- BAT, BMAT Annotation Tool, see [Melendez-Catalan2017]
- audio-annotator, see [Cartwright2017]