06-16, 11:00–11:40 (Europe/London), Minories
Audio, images, and text already have well-established deep learning architectures and processing pipelines proven to yield impressive results. I will introduce data obtained by recording piano performances in MIDI format as a new and exciting area of research, one where many of the challenges encountered in text, images, and audio combine in a single modality. This session is ideal for AI enthusiasts, data scientists with a love for music, and anyone curious about the future of creative machine learning.
In this talk, I explore the fascinating intersection of music theory and data science through the lens of piano performances. Starting with a brief overview of music theory, notation, and the dynamics that characterise expressive performances, I set the stage for understanding the complexity of musical expression.
I then delve into how the MIDI format can serve as a bridge, capturing many of the nuances of piano performance in a structured data format. Utilising popular Python data-science tools (streamlit, pandas, huggingface), I demonstrate how to transform these performances into a data structure suitable for analysis and machine learning applications (a minimal sketch follows below). This section aims to equip attendees with an intuitive understanding of MIDI data, opening new avenues for research and innovation in music and beyond.
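For illustration, here is a minimal sketch of that transformation, assuming the `pretty_midi` library and a placeholder file name (neither is confirmed by the talk description):

```python
# Minimal sketch: flatten a piano MIDI recording into a pandas DataFrame
# of note events. "performance.mid" is a placeholder path.
import pandas as pd
import pretty_midi

midi = pretty_midi.PrettyMIDI("performance.mid")
piano = midi.instruments[0]  # assume a single-track piano recording

notes = pd.DataFrame(
    {
        "pitch": [n.pitch for n in piano.notes],        # MIDI note number, 0-127
        "velocity": [n.velocity for n in piano.notes],  # key-strike intensity, 0-127
        "start": [n.start for n in piano.notes],        # onset time in seconds
        "end": [n.end for n in piano.notes],            # release time in seconds
    }
).sort_values("start", ignore_index=True)

notes["duration"] = notes["end"] - notes["start"]
print(notes.head())
```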
Building on what we learn there, I'll then frame the problem numerically, following up on the music theory approach. I'll present a statistical analysis of the millisecond-level decisions about when to strike the next note, and show how professional pianists exhibit different timing distributions across different styles of the classical piano repertoire.
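As a hedged illustration of this kind of analysis (my example, not necessarily the talk's), the inter-onset intervals - the gaps between successive key strikes - can be computed directly from the `notes` DataFrame built in the sketch above:

```python
# Inter-onset intervals (IOIs): millisecond gaps between consecutive key
# strikes. `notes` is the DataFrame from the previous sketch.
import numpy as np

onsets = np.sort(notes["start"].to_numpy())
ioi_ms = np.diff(onsets) * 1000  # gaps between consecutive onsets, in ms

print(f"median IOI: {np.median(ioi_ms):.1f} ms")
# A histogram of ioi_ms exposes the timing texture of a performance:
# fast ornamental runs cluster at small IOIs, while slower lyrical
# passages spread the mass toward hundreds of milliseconds.
```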
I’ll explore the possibility of training multi-billion-parameter models on this data format. I’ll walk you through our experiments with architectures that have proven successful in other domains, most notably BERT and VQ-VAE networks. We’ll discuss the challenges of designing data processing pipelines - tokenisation, normalisation, and quantisation - reviewing techniques already described in the literature and presenting some experimental ideas from projects I’ve been a part of.
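To make the tokenisation and quantisation step concrete, here is one scheme in the spirit of those described in the literature (a sketch under my own assumptions, not necessarily the talk's pipeline): each note becomes a triple of discrete tokens, with velocity and the time shift since the previous onset quantised into fixed bins.

```python
# Hypothetical tokenisation sketch: quantise velocities and inter-note time
# shifts into a small discrete vocabulary, so a performance becomes integer
# token sequences a BERT-style model can consume. Bin counts are arbitrary
# assumptions, not values from the talk.
import numpy as np

N_VELOCITY_BINS = 32   # 0-127 velocities collapsed into 32 bins
N_TIME_BINS = 100      # time shifts clipped to MAX_SHIFT_S, ~10 ms resolution
MAX_SHIFT_S = 1.0

def tokenize(notes_df):
    """Map the note DataFrame from the first sketch to (pitch, velocity_bin, time_bin) rows."""
    starts = notes_df["start"].to_numpy()
    shifts = np.diff(starts, prepend=starts[0])  # seconds since previous onset
    time_bins = np.clip(shifts / MAX_SHIFT_S * N_TIME_BINS, 0, N_TIME_BINS - 1).astype(int)
    vel_bins = notes_df["velocity"].to_numpy() * N_VELOCITY_BINS // 128
    pitches = notes_df["pitch"].to_numpy()
    return np.stack([pitches, vel_bins, time_bins], axis=1)

tokens = tokenize(notes)
print(tokens[:5])  # each row: [pitch, velocity_bin, time_shift_bin]
```

Quantisation like this trades timing precision for a manageable vocabulary size - one of the design tensions the talk promises to discuss.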
The core motivation for this project is the belief that the doors to the mathematical nature of music can be opened with Machine Learning. The amount of available MIDI data is constantly growing, and with the rising popularity of electric pianos, it may soon reach the threshold required for successfully training (very) large deep learning networks (if it hasn't already). We are trying to teach machines to play the piano, but keep in mind: we do not aim for a statistically averaged Chopin generator; rather, we want to explore novel musical landscapes, inaccessible without a cluster of GPUs.
No previous knowledge expected
I have been in love with mathematics, physics, and music since childhood, and I started programming at the age of 15 - I have been fascinated by data science ever since. I'm also a guitar player and a performing chorister, now exploring the possible connections between music and data science.
I am currently a student of Computer Science at the Faculty of Mathematics and Information Science at Warsaw University of Technology. Since August 2023, I have been working as a Data Scientist at Piano for AI, where I combine my passions for music, mathematics, and data science. I develop software for training and evaluating large language models on musical data, among other fascinating projects.