Footsteps Audio Dataset

50 hours of real footstep audio recordings

Summary

50 hours of real footstep audio recordings for training sound event detection, activity recognition, and acoustic biometrics models. All files are manually verified and were captured in natural indoor and outdoor environments, with detailed metadata for every recording: surface type, footwear, location, and background noise level. The dataset is larger than any publicly available footstep audio dataset and is ready for commercial use under a clean license.

Introduction

This dataset contains 50 hours of real-world footstep audio recorded in natural conditions: indoors and outdoors, across different surfaces, footwear, and noise environments. Every file was manually verified to contain clearly audible footstep sounds, with no synthetic audio, no augmentation, and no AI-generated content.

Each file ships with structured metadata describing the recording context, making this dataset directly usable for supervised learning across footstep detection, sound event classification, acoustic person identification, and walking surface recognition tasks.

Dataset Features

Scale & Quality

50 hours of footstep audio recordings
Manually verified files – every recording reviewed for clear footstep audibility
Real-world field recordings – no synthetic audio, no augmentation
Captured indoors and outdoors in natural conditions

Audio Specifications

WAV and M4A files
Sample rate: 48 kHz (majority), with 44.1 kHz and 16 kHz subsets
Mono and stereo recordings
Recorded primarily on smartphones, with additional laptop and tablet captures

Metadata for Every File

Surface type: wood/laminate, tile, carpet, concrete/asphalt, stairs, other
Footwear: barefoot, slippers, sandals, sneakers, dress shoes/boots, other
Location: indoor / outdoor
Background noise level: low / medium / high
Recording device class: smartphone / laptop / tablet
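
To make the metadata fields above concrete, here is a minimal Python sketch of how one record could be represented and validated. The allowed values are copied from the list above; the class name, field names, and the example filename are assumptions for illustration, since the actual file layout of the metadata is not described here.

```python
from dataclasses import dataclass

# Allowed values copied from the metadata list above.
SURFACES = {"wood/laminate", "tile", "carpet", "concrete/asphalt", "stairs", "other"}
FOOTWEAR = {"barefoot", "slippers", "sandals", "sneakers", "dress shoes/boots", "other"}
LOCATIONS = {"indoor", "outdoor"}
NOISE_LEVELS = {"low", "medium", "high"}
DEVICES = {"smartphone", "laptop", "tablet"}


@dataclass(frozen=True)
class FootstepRecord:
    # Hypothetical field names; the real metadata may use different keys.
    filename: str
    surface: str
    footwear: str
    location: str
    noise_level: str
    device: str

    def __post_init__(self):
        # Reject any value that is not in the documented vocabulary.
        checks = [
            ("surface", self.surface, SURFACES),
            ("footwear", self.footwear, FOOTWEAR),
            ("location", self.location, LOCATIONS),
            ("noise_level", self.noise_level, NOISE_LEVELS),
            ("device", self.device, DEVICES),
        ]
        for name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"unexpected {name}: {value!r}")


# Illustrative record; the filename is made up.
record = FootstepRecord(
    "example_0001.wav", "tile", "sneakers", "indoor", "low", "smartphone"
)
```

Validating against a closed vocabulary on load catches label typos before they turn into spurious extra classes during training.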

Use cases and applications

Footstep detection in smart home, security, and IoT systems
Sound event detection (SED) models that include footsteps as a target class
Acoustic person identification – biometric models that recognize individuals by their walking sound
Walking surface classification – distinguishing footsteps on different floor materials
Foley generation – training data for AI sound design models targeting walking sequences

Why this dataset solves real production challenges

Largest footstep audio dataset available commercially. Publicly available academic datasets top out at 14 hours (AFPID-II) or contain fewer than 1,000 footstep samples (FSD50K, ESC-50). At 50 hours of curated recordings, this dataset replaces months of in-house data collection.
Manually verified, not scraped. Every file was reviewed by an annotator to confirm footsteps are clearly audible. No YouTube extracts, no synthetic generation, no contaminated samples.
Structured metadata across four dimensions. Surface, footwear, location, and noise are encoded per file, supporting both filtered training and multi-task learning setups.
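
As a rough illustration of filtered training and multi-task targets, the sketch below selects recordings by metadata and builds a label dictionary per file. The row keys (`file`, `surface`, `footwear`, `location`, `noise`), the filenames, and both helper functions are hypothetical and are not part of the dataset's tooling.

```python
# Hypothetical metadata rows; keys and filenames are illustrative only.
rows = [
    {"file": "a.wav", "surface": "tile", "footwear": "sneakers",
     "location": "indoor", "noise": "low"},
    {"file": "b.wav", "surface": "carpet", "footwear": "barefoot",
     "location": "indoor", "noise": "high"},
    {"file": "c.wav", "surface": "concrete/asphalt", "footwear": "dress shoes/boots",
     "location": "outdoor", "noise": "medium"},
    {"file": "d.wav", "surface": "wood/laminate", "footwear": "slippers",
     "location": "indoor", "noise": "medium"},
]


def select(rows, **conditions):
    """Keep rows matching every condition.

    A condition value may be a single string or a collection of accepted strings.
    """
    def matches(row):
        for key, wanted in conditions.items():
            allowed = {wanted} if isinstance(wanted, str) else set(wanted)
            if row[key] not in allowed:
                return False
        return True

    return [row for row in rows if matches(row)]


def multitask_targets(row):
    """Return one label per task, e.g. for a model with two classification heads."""
    return {"surface": row["surface"], "footwear": row["footwear"]}


# Filtered training subset: indoor recordings with low or medium noise.
quiet_indoor = select(rows, location="indoor", noise={"low", "medium"})
targets = [multitask_targets(row) for row in quiet_indoor]
```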

Sample dataset

A sample version of this dataset is available on Kaggle and HuggingFace. Leave a request in the form below for additional samples or the full version.

Have a question?

How many footstep recordings are in this dataset?

The full dataset contains 50 hours of audio, split into files of 10 to 100 seconds each. By volume of curated footstep audio, this is the largest commercially available dataset in the category, roughly 3.5 to 5 times the size of the most cited academic alternatives (AFPILD at 10 hours, AFPID-II at 14 hours). The full version is licensed for commercial training of production ML models.
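
The figures above imply bounds on the number of files rather than an exact count. The short calculation below works them out from the stated totals only; it is arithmetic on the numbers in this answer, not a reported file count.

```python
TOTAL_HOURS = 50                 # stated dataset size
MIN_CLIP_S, MAX_CLIP_S = 10, 100  # stated file length range, in seconds

total_seconds = TOTAL_HOURS * 3600

# If every file were the maximum length we would get the fewest files,
# and if every file were the minimum length we would get the most.
min_files = total_seconds // MAX_CLIP_S
max_files = total_seconds // MIN_CLIP_S

# Size relative to the academic datasets cited above.
ratio_afpild = TOTAL_HOURS / 10   # AFPILD: 10 hours
ratio_afpid2 = TOTAL_HOURS / 14   # AFPID-II: 14 hours

print(f"{min_files} to {max_files} files")
print(f"{ratio_afpild:.1f}x AFPILD, {ratio_afpid2:.1f}x AFPID-II")
```

Under these assumptions the file count falls between 1,800 and 18,000, and the size ratio runs from about 3.6× (AFPID-II) to 5× (AFPILD).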

What metadata is provided for each recording?

Every file ships with four structured metadata fields: surface type (wood/laminate, tile, carpet, concrete/asphalt, stairs, other), footwear (barefoot, slippers, sandals, sneakers, dress shoes/boots, other), location (indoor or outdoor), and background noise level (low, medium, high). This supports filtered training, multi-task learning, and stratified evaluation splits across walking conditions.
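
A stratified evaluation split of the kind mentioned above could be sketched as follows, using only the Python standard library. The row keys, filenames, and the `stratified_split` helper are hypothetical; the idea is to hold out the same fraction of files from each combination of metadata values.

```python
import random
from collections import defaultdict


def stratified_split(rows, keys, test_fraction=0.2, seed=0):
    """Split rows so each metadata combination contributes ~test_fraction to test."""
    rng = random.Random(seed)  # seeded for a reproducible split

    # Group rows by their combination of values for the chosen keys.
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)

    train, test = [], []
    for stratum in sorted(groups):
        members = groups[stratum]
        rng.shuffle(members)
        # Note: very small strata may round down to zero test files.
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Illustrative rows: two strata of ten files each.
rows = (
    [{"file": f"tile_{i}.wav", "surface": "tile", "noise": "low"} for i in range(10)]
    + [{"file": f"carpet_{i}.wav", "surface": "carpet", "noise": "high"} for i in range(10)]
)
train, test = stratified_split(rows, keys=("surface", "noise"))
```

For person identification work, a split by speaker or walker would matter more than a split by condition, but no subject identifier is listed among the metadata fields here, so this sketch stratifies on recording conditions only.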

Can I use this dataset for footstep detection model training?

Yes. This dataset is designed specifically for supervised training of footstep detection, sound event detection, and audio classification models. The 50 hours of verified positive samples cover the most common deployment surfaces and footwear types, with realistic background noise variation.

Is this dataset suitable for acoustic person identification (footstep biometrics)?

Yes. Acoustic person identification is a recognized use case for this data. Compared to academic benchmarks like AFPILD (40 subjects) and AFPID-II (41 subjects), our dataset offers a different angle: broader surface and footwear coverage across recordings, which lets you train models robust to environmental variation.

What audio formats and sample rates are included?

Files are provided in WAV and M4A formats. The majority of recordings are 48 kHz, with smaller subsets at 44.1 kHz and 16 kHz. This variation matches real-world deployment conditions: smart home microphones, phone-recorded audio, and embedded device captures all sample at different rates, so models trained on this distribution generalize better than those trained on a single fixed rate.
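
If a training pipeline needs a single input rate, the three rates above can be related through integer up/down factors, which is what a polyphase resampler expects. The sketch below computes those factors; the 16 kHz target and the `resample_factors` helper are assumptions for illustration, not a recommendation from the dataset authors.

```python
from fractions import Fraction

TARGET_RATE = 16_000  # assumed common rate; pick whatever your model expects


def resample_factors(source_rate, target_rate=TARGET_RATE):
    """Return the smallest integer (up, down) pair with source * up / down == target."""
    ratio = Fraction(target_rate, source_rate)  # reduced to lowest terms automatically
    return ratio.numerator, ratio.denominator


# The three sample rates listed for this dataset.
for rate in (48_000, 44_100, 16_000):
    up, down = resample_factors(rate)
    print(f"{rate} Hz -> {TARGET_RATE} Hz: up={up}, down={down}")
```

The resulting pairs can be passed to a polyphase resampler such as `scipy.signal.resample_poly(x, up, down)`. Note that 44.1 kHz to 16 kHz reduces to 160/441, a far less convenient ratio than the 1/3 needed for 48 kHz.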

Can I get a sample before purchasing?

Yes. A sample subset is freely available on our Kaggle and HuggingFace pages, where you can download and explore the audio quality and metadata format directly. For an extended evaluation sample with specific surface or footwear conditions, leave a request through the form below.

Contact us

Tell us about yourself, and get access to free samples of the dataset
