Music reverse-engineering, or, how to pretend that SIDs are mods

Posted by HEx 2013-10-03 at 13:00

One of the goals of the Fooble project is to make music more hacker-friendly; in particular, to expose the internal workings of music whose construction was previously opaque.

The most common way for music to be distributed is as recordings. But reverse-engineering music from a recording is hard. (It's hard for humans. Getting machines to do it is even harder.) Luckily, there are many music formats intermediate in structure between a raw waveform (which is non-trivial to extract information from) and human-editable formats such as Protracker and MIDI (which already have all the information present in a usable form).

One obvious example is the Commodore 64's SID format. As a file format, SID long postdates the C64 itself. It consists of a header followed by 6502 machine code: when executed in the correct environment, this code writes to the memory-mapped I/O registers that control the SID chip itself. Can we turn this into something resembling readable pattern data? Presumably the patterns are stored in the code somehow. But many different playroutines have been written over the years, and given the activity of the C64 demoscene it seems likely that new ones will continue to be written. [1] The only reliable way to recover the information we seek is to treat the code as a black box: execute it, watch what it does, and reconstruct what we can. [2]
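To make "watch what it does" concrete, here is a minimal sketch in Python (not the project's actual code) of a memory bus that logs every write landing on a SID register. The register range $D400-$D418 is the real C64 layout; the names LoggingMemory, capture and fake_play are made up for illustration, and fake_play merely stands in for an emulated 6502 playroutine.

```python
SID_BASE = 0xD400          # start of the SID's memory-mapped registers
SID_WRITABLE = 25          # $D400-$D418 are the writable registers


class LoggingMemory:
    """64 KiB of RAM that records every write landing on a SID register."""

    def __init__(self):
        self.ram = bytearray(0x10000)
        self.writes = []   # (frame, register offset, value)
        self.frame = 0

    def write(self, addr, value):
        addr &= 0xFFFF
        value &= 0xFF
        self.ram[addr] = value
        offset = addr - SID_BASE
        if 0 <= offset < SID_WRITABLE:
            self.writes.append((self.frame, offset, value))

    def read(self, addr):
        return self.ram[addr & 0xFFFF]


def capture(play, mem, frames):
    """Call the play routine once per frame and return the logged writes."""
    for frame in range(frames):
        mem.frame = frame
        play(mem)
    return mem.writes


# Hypothetical stand-in for an emulated playroutine: on the first frame it
# sets voice 1's frequency and opens the gate with a sawtooth waveform.
def fake_play(mem):
    if mem.frame == 0:
        mem.write(0xD400, 0x67)   # frequency, low byte
        mem.write(0xD401, 0x11)   # frequency, high byte
        mem.write(0xD404, 0x21)   # control: sawtooth + gate
    mem.write(0x1000, mem.frame)  # ordinary RAM write, never logged


log = capture(fake_play, LoggingMemory(), 3)
```

The log is all the black box ever gives us: which register changed, to what, and in which frame. Everything else has to be reconstructed from that.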

So essentially we are faced with a compression problem. We have a large (indeed, potentially infinite) stream of data, of very low entropy, which we would like to turn into a smaller amount of high-entropy data (patterns, instruments, maybe samples). The problems we face are many. There is no indication where pattern boundaries should lie, or how long the tune is. But happily, the SID offers hardware ADSR volume envelopes: if a tune takes advantage of this we can at least identify where notes begin.
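As an illustration of finding note starts from the envelope gate, here is a rough sketch. It assumes we already have one 25-byte register snapshot per frame (the snapshot format and the function names are my own invention for this example). The register layout (seven registers per voice, gate in bit 0 of the control register) and the frequency formula are standard SID facts; the clock value assumes a PAL machine.

```python
import math

PAL_CLOCK_HZ = 985248        # PAL C64 system clock; NTSC machines differ
VOICE_STRIDE = 7             # each voice owns 7 consecutive registers
CONTROL = 4                  # control register offset within a voice
GATE = 0x01                  # bit 0 of the control register


def sid_freq_to_hz(value, clock=PAL_CLOCK_HZ):
    """Convert a 16-bit SID frequency register value to hertz."""
    return value * clock / (1 << 24)


def hz_to_midi(hz):
    """Nearest MIDI note number (69 = A4 = 440 Hz)."""
    return round(69 + 12 * math.log2(hz / 440.0))


def note_onsets(frames):
    """Yield (frame, voice, midi_note) for each 0 -> 1 gate transition."""
    gate_was_on = [False, False, False]
    for index, regs in enumerate(frames):
        for voice in range(3):
            base = voice * VOICE_STRIDE
            gate_on = bool(regs[base + CONTROL] & GATE)
            if gate_on and not gate_was_on[voice]:
                freq = regs[base] | (regs[base + 1] << 8)
                if freq:  # skip zero frequency, which has no pitch
                    yield index, voice, hz_to_midi(sid_freq_to_hz(freq))
            gate_was_on[voice] = gate_on


# Voice 1 plays a note for two frames, rests for one, then retriggers.
silent = bytes(25)
held = bytearray(25)
held[0], held[1], held[4] = 0x67, 0x11, 0x21   # freq $1167, sawtooth + gate
frames = [silent, bytes(held), bytes(held), silent, bytes(held)]
onsets = list(note_onsets(frames))
```

Tunes that never touch the gate bit (or that do their own volume tricks) will defeat this, which is why the hardware envelopes are only a partial answer.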

Enough waffling. Demo time!

(Enterprising readers with a copy of HVSC should have little problem figuring out how to try their favourite tunes.)

The current code lacks many niceties. No cycle counting is done. The CPU is assumed to be infinitely fast, thus events occur precisely at interrupts. Tunes are assumed to set the interrupt frequency only at initialization time: this sets the tempo for the whole tune. Only PSIDs are loaded. A trivial environment is provided that doesn't resemble the C64 much at all. (Many of these problems can be solved at a stroke by using a real C64 emulator. Happily, a JavaScript port of VICE already exists.)

Also, because modplayjs is currently completely sample-based, much of the SID chip is unemulated. In particular, no filters or ringmod. Variable pulse width is done ickily, by switching between samples. There may be envelope bugs. The problem of finding pattern boundaries has yet to be tackled.
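For the curious, here is roughly what "switching between samples" could look like. This is an illustrative sketch in Python, not modplayjs's actual code: the bank count, the sample length and all the names are assumptions. The only hard fact used is that the SID's pulse width is a 12-bit value.

```python
BANKS = 16            # hypothetical number of pre-rendered pulse samples
SAMPLE_LEN = 64       # samples per single-cycle waveform


def render_pulse(duty, length=SAMPLE_LEN):
    """One cycle of a pulse wave: high for `duty` of the cycle, then low."""
    high = round(duty * length)
    return [1.0] * high + [-1.0] * (length - high)


# Pre-render one waveform per bank, with duties spread evenly over (0, 1).
PULSE_BANK = [render_pulse((i + 0.5) / BANKS) for i in range(BANKS)]


def bank_for_pulse_width(pw):
    """Map a 12-bit SID pulse width (0-4095) to a bank index."""
    return (pw & 0x0FFF) * BANKS // 4096


# A pulse width near the middle of the range picks a near-square wave.
middle = PULSE_BANK[bank_for_pulse_width(2048)]
```

The ickiness is plain to see: a smooth pulse-width sweep becomes a staircase of abrupt waveform changes, and more banks only make the steps smaller.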

Still, there is good news. Most SIDs play recognizably. Many play reasonably well, modulo the lack of filters. And some simple SIDs have extracted pattern data that resembles what a human would produce. To the best of my knowledge, this is not an approach that anyone has tried before. And as far as I'm concerned, it's certainly a step in the right direction.


[1] It also seems likely that demosceners' playroutines will continue to tend towards the completely undecipherable. Yes, that is 373 bytes of code. (And no, it doesn't render well in modplayjs. Yet.)

[2] That's not to say we can't peek at the code. But it won't necessarily yield good results. In particular, one not-quite-black-box technique I've had only limited success with is to detect looping of the tune by checksumming the state of the emulated machine at each frame. Since the state completely determines future states, a repeated state means a guaranteed loop, and so do matching checksums, barring collisions. This can even be done in constant space.
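One way to get the constant-space property is Brent's cycle-detection algorithm, sketched below in Python. I'm not claiming this is exactly what the Fooble code does. The step and checksum callables are placeholders for "run the emulator for one frame" and "hash the machine state", and the toy machine at the bottom is invented purely to exercise the function.

```python
import zlib


def find_loop_length(step, checksum, state, max_frames=1_000_000):
    """Brent's cycle detection over per-frame checksums.

    Only one saved checksum is kept at any time. Returns the loop length
    in frames, or None if no repeat shows up within max_frames.
    """
    saved = checksum(state)     # the single remembered checksum
    power = 1                   # frames to wait before re-saving
    length = 0                  # frames elapsed since `saved` was taken
    for _ in range(max_frames):
        state = step(state)
        length += 1
        if checksum(state) == saved:
            return length
        if length == power:     # give up on this checkpoint, take a new one
            saved = checksum(state)
            power *= 2
            length = 0
    return None


# Toy "machine": a 5-frame intro (states 0-4), then a 12-frame loop (5-16).
def toy_step(n):
    return n + 1 if n < 16 else 5


def toy_checksum(n):
    return zlib.crc32(n.to_bytes(2, "little"))


loop = find_loop_length(toy_step, toy_checksum, 0)
```

This finds the length of the loop but not where it starts; locating the start means running again from the beginning. And a real tune that reads a free-running timer or otherwise never revisits an exact machine state will never match, which may go some way to explaining the limited success.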