Music Analysis Pipeline
This project is a music analysis pipeline I developed as part of my work on the CORPUS project. Its purpose is to take incoming music audio and turn it into rich, structured metadata that can support browsing, filtering, moderation, search, and other downstream creative or editorial workflows.
Rather than focusing on one narrow task, the pipeline performs a broad, multi-stage reading of a track. It can process folders in batch or run as an HTTP API, and it produces a unified metadata view that describes a piece of music from several angles at once.
What it does
The pipeline is designed to analyse musical audio in a way that is useful for real systems. Depending on the track, it can extract and combine:
- basic metadata such as title, artist, duration, file type, and sample rate
- musical attributes such as key, mode, tempo, and tempo changes
- stylistic descriptors such as genres, moods, instruments, and keywords
- descriptive text that summarises the overall character of a track
- vocal information, including whether vocals are present, lyric transcription, detected language, and singer-profile fields
- content safety signals for lyrics and potentially sensitive material
- non-music filtering, helping separate genuine music from speech-heavy or irrelevant uploads
- cover-song detection signals for identifying likely matches against known material
The result is a single structured representation of a track that can be consumed by other tools and interfaces.
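As a rough sketch of what such a unified per-track representation might look like, the record could be modelled as a dataclass covering the categories listed above. The field names here are illustrative, not the pipeline's actual schema:

```python
from dataclasses import dataclass, field, asdict

# Hypothetical sketch of a unified per-track metadata record.
# Field names are illustrative; the real pipeline's schema may differ.
@dataclass
class TrackAnalysis:
    # basic metadata
    title: str
    artist: str
    duration_s: float
    sample_rate: int
    # musical attributes
    key: str
    mode: str
    tempo_bpm: float
    # stylistic descriptors
    genres: list = field(default_factory=list)
    moods: list = field(default_factory=list)
    instruments: list = field(default_factory=list)
    # vocal and lyric analysis
    has_vocals: bool = False
    lyrics: str = ""
    language: str = ""
    # safety and filtering signals
    content_flags: list = field(default_factory=list)
    is_music: bool = True
    cover_match_score: float = 0.0

track = TrackAnalysis(
    title="Example Track", artist="Unknown",
    duration_s=183.2, sample_rate=44100,
    key="A", mode="minor", tempo_bpm=122.0,
    genres=["electronic"], moods=["dark"],
    has_vocals=True, language="en",
)
# asdict() flattens the record into a plain dict, ready for JSON output
record = asdict(track)
```

A single record like this is what downstream tools and interfaces would consume.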
Applications
What makes the project interesting is that all of these views are brought together into one coherent pipeline. That makes it possible to work with music collections in a far richer way than simple file metadata allows.
For example, the system can support workflows such as:
- organising and indexing large music catalogues
- powering upload-time analysis for music platforms
- generating richer search and filter options
- preparing tracks for recommendation, discovery, or visualisation layers
- supporting moderation and rights-related review workflows
- feeding downstream interfaces that need structured musical descriptions
In other words, it turns raw audio into something much more searchable, inspectable, and operationally useful.
Design goals
From the start, I approached this as a production-minded analysis system rather than just a research prototype. That meant designing for:
- breadth across many kinds of musical descriptors
- speed and batch throughput
- reliability across multiple connected analysis stages
- observability, so processing can be monitored and diagnosed
- integration, both as a folder-based pipeline and as an HTTP API
The pipeline can analyse whole directories of music files, return JSON outputs, and also expose the same capabilities through service endpoints for other tools to call programmatically.
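The folder-based mode can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: `analyse_track` here is a hypothetical stand-in for the real multi-stage analysis chain, and only emits basic file metadata so the sketch stays runnable.

```python
import json
from pathlib import Path

AUDIO_EXTS = {".wav", ".mp3", ".flac", ".ogg"}

def analyse_track(path: Path) -> dict:
    # Stand-in for the real multi-stage analysis; a real run would
    # populate musical, stylistic, vocal, and safety fields as well.
    return {
        "file": path.name,
        "file_type": path.suffix.lstrip("."),
        "size_bytes": path.stat().st_size,
    }

def analyse_directory(in_dir: Path, out_dir: Path) -> int:
    """Analyse every audio file in in_dir, writing one JSON record per track."""
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for audio in sorted(in_dir.iterdir()):
        if audio.suffix.lower() not in AUDIO_EXTS:
            continue  # skip non-audio uploads
        record = analyse_track(audio)
        (out_dir / f"{audio.stem}.json").write_text(json.dumps(record, indent=2))
        count += 1
    return count
```

The HTTP API mode would expose the same `analyse_track` step behind a service endpoint, so callers get identical JSON for a single upload as they would from a batch run.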
Practical output
What comes out is a fairly complete metadata package for each track. That package can include musical structure, descriptive tags, text summaries, vocal and lyric analysis, moderation signals, and auxiliary matching results.
Context
I developed this project through my involvement with CORPUS, where the focus is on building practical AI and music technology that can plug into larger systems and workflows.