I am currently the Machine Learning Lead at Resoniks and a Visiting Researcher at Tampere University, where I previously spent years as a Postdoctoral Research Fellow affiliated with the Tampere Institute for Advanced Study (Tampere IAS). My research there focused on automatic content analysis of environmental audio, specifically acoustic scene understanding and sound event detection.
My academic work sits at the intersection of machine learning, audio signal processing, and computational auditory scene analysis. I build methods that let machines identify and understand sounds in everyday environments — technology that ends up in smart infrastructure, autonomous devices, environmental monitoring, and human-machine interaction.
Research
My research interests center on acoustic scene understanding, sound event detection, and acoustic scene classification, and I've co-authored a fair number of publications on these topics in top conferences and journals. Interested in collaborating on research, engineering solutions, or open audio datasets? Have a look around this page or just drop me an email.
Key Research Areas
- Deep Learning for Audio: Applying convolutional and recurrent neural networks to time-frequency representations of audio
- Sound Event Detection (SED): Constructing robust models to detect sound events in multisource audio mixtures
- Acoustic Scene Classification (ASC): Classifying environments (e.g., street, park, office) based on their acoustic signatures
- Evaluation Frameworks: Developing evaluation metrics, protocols, and benchmark datasets for audio analysis systems
- Dataset Development: Creating annotated audio datasets and evaluation benchmarks for reproducible research, and releasing them as open datasets
- Real-time Audio Analysis: Implementing efficient systems for on-device and streaming audio recognition
Impact
- Academic Impact: A decade-plus of work on neural networks, deep learning, and audio signal processing has produced over 60 publications and 8K+ citations (h-index 35). By SciVal's count, that puts me 8th worldwide in scholarly output and 20th in field-weighted citation impact for "Neural networks; Deep learning; Audio Signal Processing" (2014–2023).
- Open Data and Community Resources: I've co-authored and released over 20 open-access audio datasets, downloaded more than 200K times and used as benchmarks across research and machine learning competitions.
- Innovation and Patents: A handful of international patents have come out of this work, covering privacy-preserving deep audio representations, location-specific sound scene synthesis, and media event suggestion systems.
- Evaluation Standards and Tools: I helped define standard evaluation metrics for sound event detection (SED) and sound event localization and detection (SELD), and helped set up evaluation protocols for acoustic scene classification (ASC). I also maintain a few open-source toolboxes that get regular use in the community:
  - sed_eval (83K pip downloads, 143 GitHub stars)
  - dcase_util (128K pip downloads, 130 GitHub stars)
  - sed_vis (120 GitHub stars)
- Education and Outreach: I co-organized a tutorial at ICASSP 2019 on acoustic scene and event detection — second most attended at the conference, with around 200 participants — and contributed chapters to the book Computational Analysis of Sound Scenes and Events.
- Service to the Research Community: I review regularly for journals and conferences including IEEE TASLP, IEEE JSTSP, ICASSP, WASPAA, EUSIPCO, DCASE, ISMIR, and AES, and have taken on organizational roles within the DCASE Challenge (task coordinator, webmaster) and the DCASE Workshop (proceedings editor, technical chair, publication chair).
Work
I'm currently Machine Learning Lead at Resoniks, where we build an AI-powered acoustic testing system for manufacturing quality control. The product combines acoustic testing with machine learning to give a real-time pass/fail on parts, catching surface and internal structural defects without radiation, chemicals, or manual operator analysis.
I keep a foot in academia as a Visiting Researcher at Tampere University. For two decades before that I worked on machine learning methods for environmental audio as part of the Audio Research Group, specializing in sound event detection, acoustic scene classification, and the datasets and evaluation frameworks needed to benchmark them.
Community Work
I'm an active contributor to the Detection and Classification of Acoustic Scenes and Events (DCASE) research community, which I helped get off the ground and grow into a forum running annual international machine learning evaluation campaigns and workshops with hundreds of participants worldwide. I've been a long-time task coordinator for the acoustic scene classification challenge, and have coordinated several sound event detection tasks too — building open datasets, designing evaluation protocols, implementing baseline systems, and maintaining the DCASE website along the way.
Education
My master's studies were on musical genre classification and related music tasks, which led into musical instrument classification in multi-source environments. From there my focus shifted to Computational Auditory Scene Analysis, particularly sound event detection in complex acoustic settings — the subject of my doctoral thesis on computational analysis of everyday acoustic environments.
Development
Alongside the research, I've spent a lot of time building tools and systems that support sound classification, event detection, and acoustic scene analysis work — much of it for the DCASE (Detection and Classification of Acoustic Scenes and Events) community.
Machine Learning for Audio
I've built several open-source tools to support machine learning in audio:
- Evaluation & Data Tools: sed_eval and dcase_util standardize evaluation and dataset handling.
- Visualization: sed_vis and js-datatable make it easier to present system outputs and annotations.
- Tutorials & Examples: hands-on code from tutorials (e.g., ICASSP 2019), plus example systems for the book Computational Analysis of Sound Scenes and Events.
- DCASE Baselines: reference systems for acoustic scene classification and sound event detection, in both Python and MATLAB.
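To give a feel for what standardized evaluation means in sound event detection, here is a minimal sketch of a segment-based F-score in plain Python. It is a simplified illustration only and does not use or reproduce the sed_eval API: the `Event` record, the function names, and the one-second default segment length are all assumptions made for this example.

```python
import math
from collections import namedtuple

# Hypothetical event record for this sketch: a label with onset/offset in seconds.
Event = namedtuple("Event", ["label", "onset", "offset"])


def active_segments(events, segment_length=1.0):
    """Return the set of (segment_index, label) pairs in which an event is active."""
    active = set()
    for event in events:
        first = int(math.floor(event.onset / segment_length))
        last = int(math.ceil(event.offset / segment_length))
        # Zero-length events still mark the segment they fall into.
        for index in range(first, max(last, first + 1)):
            active.add((index, event.label))
    return active


def segment_based_f1(reference, estimated, segment_length=1.0):
    """Micro-averaged segment-based precision, recall, and F1 (simplified)."""
    ref = active_segments(reference, segment_length)
    est = active_segments(estimated, segment_length)
    true_positives = len(ref & est)
    false_positives = len(est - ref)
    false_negatives = len(ref - est)
    precision = true_positives / (true_positives + false_positives) if est else 0.0
    recall = true_positives / (true_positives + false_negatives) if ref else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up annotations and system output, for illustration only.
    reference = [Event("car", 0.0, 2.0), Event("speech", 1.0, 3.0)]
    estimated = [Event("car", 0.0, 1.0), Event("speech", 1.0, 4.0)]
    print(segment_based_f1(reference, estimated))
```

The sketch compares reference and estimated activity on a fixed time grid, so small onset and offset deviations inside a segment are not penalized. A real toolbox also handles details this example leaves out, such as per-class results and error rates.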
Website Development
I also enjoy building websites for academic and community initiatives. The main one is the DCASE Community Website, built on the Pelican static site generator with custom plugins for managing citations, datasets, events, and personnel.
On the recreational side, there's bbStat, a basketball statistics platform for regional leagues in Finland — a custom Joomla component that serves thousands of players and fans each year with game data, player stats, and league standings.
Web Utilities & Plugins
To make academic website development less repetitive, I've built a set of Pelican plugins that generate publication lists, personnel directories, and other structured content straight from YAML and BibTeX, plus smaller ones for tables of contents, file-modification tracking, and recent-article listings. To go with them, I designed custom Bootstrap-based themes that give academic sites a clean, professional look without being a pain to maintain.
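As a rough idea of what generating a publication list from structured data involves, the sketch below renders already-parsed records (for instance, entries loaded from a YAML file) into an HTML list. It is not the code of the actual plugins and does not touch the Pelican API: the function name, the record fields (`authors`, `title`, `venue`, `year`), and the output markup are assumptions made for this example, and the sample records are placeholders rather than real publications.

```python
from html import escape


def render_publication_list(records):
    """Render publication records as an HTML list, newest first.

    Each record is a dict with hypothetical keys: authors (list of str),
    title (str), venue (str), and year (int).
    """
    # Sort by year descending, then by title for a stable order.
    ordered = sorted(records, key=lambda r: (-int(r["year"]), r["title"]))
    items = []
    for record in ordered:
        authors = ", ".join(record["authors"])
        items.append(
            "  <li>{authors} ({year}). <em>{title}</em>. {venue}.</li>".format(
                authors=escape(authors),
                year=int(record["year"]),
                title=escape(record["title"]),
                venue=escape(record["venue"]),
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    # Placeholder records, for illustration only.
    sample = [
        {"authors": ["A. Author"], "title": "Example Title One",
         "venue": "Example Workshop", "year": 2019},
        {"authors": ["B. Author", "C. Author"], "title": "Example Title Two",
         "venue": "Example Journal", "year": 2021},
    ]
    print(render_publication_list(sample))
```

Keeping the records in a data file and the markup in one rendering function means a new publication only needs a new entry in the data, not a hand-edited page.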