Audio Data Takes Center Stage
There's an entire subfield of machine learning devoted to textual data (hello, natural language processing), while visual data has fueled the massive growth of computer vision and image-generation applications. Both data types have captivated our collective imagination for months with the rise of AI tools like ChatGPT, Midjourney, and Stable Diffusion.
It's sometimes easy to forget that audio data is a thriving area of innovation, too, with both researchers and industry players making major strides in the ways we understand, process, and create sound. This week, we turn to the world of audio and music to highlight projects and workflows our authors have recently explored.
- Building a music player of one's own. Aleksandra Ma's debut TDS article is a fascinating walkthrough of a fun and original endeavor: her attempt to build a music player dedicated (in part) to AI-generated lo-fi hip hop tracks. We learn a lot along the way about the challenges of working with MIDI files for model training, and at the end we all get to enjoy some cool, mellow beats.
- Could your next songwriting partner be… ChatGPT? For the past several years, Robert A. Gonsalves has been experimenting with various modalities of creative collaboration between humans and AI. The recent arrival of ChatGPT (you may have heard of it by now) has opened some new possibilities, and in his latest project Robert pushes the tool to provide him with genre-specific chord progressions—and song titles. (And yes, you can listen to the results, too!)
- The complex art of identifying spoken words. "Dealing with audio can complicate any machine learning task," says Dorien Herremans—but the effort can be well worth it, given the rapidly growing footprint of speech-recognition technology. Dorien's step-by-step tutorial invites readers to roll up their proverbial sleeves: follow along to build a neural network in PyTorch by directly feeding it audio files that are then converted into fine-tunable spectrograms.
- Not enough audio data? Augment what you've got. From expensive computational resources to copyright limitations, Max Hilsdorf recognizes the difficulty of getting an audio-data project up and running. He goes on to introduce us to data-augmentation approaches that allow us to make the most of the audio in our possession, and explains why you should add Spotify's Pedalboard library to your toolkit.
Don't tune out just yet—we have a few more excellent reads to recommend this week. They go especially well with some AI-generated lo-fi hip hop (or polka! To each their own).
- As Richmond Alake insists in his latest post, data storytelling is a skill you can (and should) cultivate. The thorough roadmap he introduces is a strong starting point for early-career practitioners.
- Another beginner-friendly guide we're thrilled to share is Hennie de Harder's primer on linear programming and the simplex algorithm.
- Louis Chan published a one-stop resource for anyone who'd like to learn about SHAP and how to use it to explain their model's outputs.
- From clean code to solid organization, Jo Stichbury encourages data practitioners to draw on software engineering principles to ensure their collaborative projects move along smoothly.
- If you haven't tinkered with synthetic data yet and would like to give it a try, Zolzaya Luvsandorj's concise tutorial proposes several approaches for generating mock tabular data.
- Combining two of his passions, data analysis and long-distance running, Barry Smyth is back with an engaging deep dive on the patterns that shape marathon preparation.
If you'd like to support the work we publish, the most direct and effective way is to become a Medium member. We hope you consider it.
Until the next Variable,
TDS Editors