Audio Un-mixing and compression

I've been playing with "music technology" for a long time. From the BEEP command on a ZX Spectrum, through to early sequencers on the Commodore Amiga ("trackers", MED and Octamed - which somehow magically turned its 4 tracks of CD-quality audio samples into eight tracks, and a dodgy copy of MusicX), synths and effects (software and hardware), sequencers on Windows PCs and Macs, and 'hardware' sequencers (well, one - but I spent a lot of time with the MPC 2000 and a ZIP drive) - I've spent countless hours playing around with all of them.

Sometimes, you realise that there are things that you just can't do with a home audio setup. Sometimes, you learn that things you assumed were done on expensive studio hardware, you later discover were just done by some kid in their bedroom, using exactly the same computer as you.

Sometimes, you learn that certain things are simply impossible.

For example; if you've got a mixed track - a recording of drums, vocals, guitar, bass etc. - then you can't really separate the instruments out and listen to each instrument in isolation.

There are a few exceptions. For example, if you've got a hard-panned stereo record (eg. lots of early 1960s recordings - like the Beatles songs that have guitars in one speaker and bass in the other), then you can just listen to one earphone to cut out certain instruments.

There's also something called the 'OOPS' or 'Out Of Phase Stereo' effect; if an instrument is exactly in the middle of a stereo recording, you can hook up a speaker to the positive terminal of each channel and - if you're lucky - the centred sound cancels itself out, letting you just hear everything else.¹

About a decade or so ago, there was a fad in the gaming industry for "rhythm games"- games like Guitar Hero and Rock Band would have "official" recordings you'd play along to on a plastic guitar, but if you missed a note then the guitar would cut out. To make that work, they would build a new 4-track mix from the original multitrack recordings - which would then invariably leak online. Which meant if you wanted to listen to, say just the drums from Love Spreads, or just the guitar - then you could.².

There are various other isolated tracks circulating online, leaked from the master recordings or bootlegs. A personal favourite of mine is Eric Clapton's isolated guitar from While My Guitar Gently Weeps³. And for years, there have been people doing really cool things with these kind of recordings - my personal favourite is Girl Talk's "All Day" which, for me, is the absolute peak of its genre.⁴

But these are all special cases, where specific recordings become available. In general, once different sounds have been mixed together, you can't just "extract" the bits you want. The analogy is like cake mix - once you've beaten the eggs and mixed in flour and sugar, you simply can't take out any of the individual ingredients from the batter.

Except… that isn't true any more. I mean - I think its still true for batter; you can't take out the eggs once you've beaten them in with the flour and butter - but for mixed audio, you can pull out specific elements and either listen to them in isolation, or listen to the rest of the music without it.

Want to make yourself a kareoke version of your favourite record to sing along to? Thats now a function in Apple Music - just turn down the vocals.

Want to make yourself a backing track of your favourite record with just the vocals, drums and bass, so you can play along and pretend to be Jimi Hendrix/Eric Clapton/John Squire? Just load the song into Logic Pro and use the Stem Splitter tu turn the original audio track into four separate tracks that you can mute, fade, EQ or whatever. Want to pull the bass and drums out of something where the whole band was recorded through the same microphone? Go for it…

Compression Artefacts

The one caveat is that lots of isolated tracks (especially vocals - or at least, most noticable in vocal tracks) have a weird, 'ghostly' feel to them that seemed characteristic of this AI-driven sound isolation where it doesn't quite work properly.

Or so I thought - what I think I've come to realise is that what is actually going on is an artefact of an effect called compression. Compression is where the volume gets 'flattened' down at the peaks, and/or boosted in the quieter parts.

There are various ways that compression can happen (radio broadcasts apply it, cheap speakers impose it…), but one of the easiest to understand is "optical compression"; there's a light hooked up to the sound source that shines brighter when the source is louder, and a light sensor hooked up to the output that turns the volume down when the light is shining. Play quietly and you'll hear the sound; play a loud note and the volume gets turned down - but only while you're playing the loud notes. As soon as the note fades, the volume comes back up again.

Another way compression can be applied is "side-chaining", where the compression takes its input from one track's volume and then applies it to a different track. If you listen to Daft Punk's One More Time and pay attention to the horn sound, you'll hear that it has a kind of 'pumping' effect where the volume seems to come down in time with the beat. What I think is going on here is a sidechain compression; the kick drum plays on the beat, and that applies the compression to the horn, making that 'pumping' effect (and also making the kick drum on the beat sound more prominent in the mix.)

So, what sounds like an "AI artefact" from these AI-isolated tracks is - I think - actually a compression artefact that is revelaed by the isolation; the track has been compressed (probably as part of compression to the whole track), and that compression is something you very rarely hear without hearing the rest of the track. So - the drums cause a volume spike, the compression brings the volume of everything down a little for that spike, and what you hear in the isolated vocals is a weird "shadow" of the drum sound. Not because the AI isn't doing a good enough job, but because its doing something that we rarely got to hear before.

(And no, no chatbots are required…)

Of course, this probably isn't good for your speakers - these days, its probably much easier to get the same effect in software. ↩
What you can hear from that pretty clearly is that there are two separate guitar tracks switching in and out of the 'guitar' track - actually, switching between the 'guitar' and 'backing' tracks. ↩
Yes, its Eric Clapton. The whole story behind it is fascinating - probably the subject of its own blog post. ↩
To the extent that I can't really listen to other mashups without comparing them, and they never stand up. Its a bit like how The Watchmen ruined all other graphic novels for me. ↩

Info

"Just" an LLM

Is ChatGPT really AI? Or is it just a chatbot?