Datasets


Environmental sounds

The list of environmental audio datasets has moved to DCASE Datalist.

Introduction

This page collects datasets, tools, and services for exploring and analyzing sound — from isolated audio clips and environmental recordings to annotation and augmentation tools. Browse the sections below for whatever fits your audio research or development needs.

Looking for environmental audio datasets specifically? See DCASE Datalist.

Curated Dataset Lists

Collections of datasets across environmental audio, bioacoustics, speech, computer vision, and video, usually maintained by domain experts and communities.

Environmental Audio

Bioacoustics

Speech

Computer Vision

Video

Online Services

Online platforms for isolated sounds, geotagged recordings, and environmental audio — ranging from open community-driven databases to institutionally curated archives, each with its own licensing and access conditions.

Isolated Sounds

Geotagged Recordings

Environmental Sounds

Source-Specific Libraries

Free sound effect libraries by commercial provider:

Tools

Software tools for audio annotation, management, and augmentation — labeling sound events, managing ecological recordings, synthesizing soundscapes.

Annotation

  • Label Studio, an open-source data labeling platform.
  • Audacity, audio software with basic annotation capabilities. Use label tracks for annotations; see the Audacity documentation on label tracks for more information.
  • Audio Labeler app in MATLAB, an audio annotation tool introduced in MATLAB R2018b.
  • Audio Annotator, a JavaScript web interface for annotating audio data.
  • ELAN, a linguistic annotation tool for creating textual annotations for audio and video files.
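
Annotations made in Audacity label tracks can be exported as a text file and read back in for analysis. The sketch below is a minimal illustration, assuming the export is tab-separated with one label per line (start time in seconds, end time in seconds, label text) and that optional frequency-range lines begin with a backslash. The function name parse_audacity_labels, the Label type, and the example contents are hypothetical and not part of any tool listed here.

```python
import io
from typing import List, NamedTuple


class Label(NamedTuple):
    start: float  # onset in seconds
    end: float    # offset in seconds (equal to start for point labels)
    text: str     # annotation text


def parse_audacity_labels(stream) -> List[Label]:
    """Parse lines of the assumed form: start<TAB>end<TAB>label."""
    labels = []
    for line in stream:
        line = line.rstrip("\r\n")
        # Skip blank lines and the assumed frequency-range lines,
        # which start with a backslash.
        if not line.strip() or line.startswith("\\"):
            continue
        parts = line.split("\t")
        start, end = float(parts[0]), float(parts[1])
        text = parts[2] if len(parts) > 2 else ""
        labels.append(Label(start, end, text))
    return labels


# Hypothetical export contents, for illustration only.
example = io.StringIO(
    "0.500000\t1.250000\tdog_bark\n"
    "3.000000\t3.000000\tdoor_slam\n"
)
events = parse_audacity_labels(example)
```

To read a real export, pass an open file handle instead of the in-memory example, for instance open("labels.txt", encoding="utf-8"), where the file name is a placeholder.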

Audio Management

  • Panako, an acoustic fingerprinting system that can also be used to synchronize audio streams.
  • Pumilio, a Web-Based Management System for Ecological Recordings

Audio Augmentation

Prototypes