I finally worked out a good method for getting clean, representative data for the "main" frequencies in music. It was actually pretty easy: taking the difference between the level of a particular frequency and the RMS of all frequencies in the same frame really makes the peaks jump out (with some logarithms and scaling to smooth things out). I was previously using a rolling average of each frequency band, and while it worked "under laboratory conditions," it was just unreliable.
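Here is a rough sketch of that idea, not the exact code I'm running. It assumes each frame arrives as a plain list of linear FFT magnitudes, and the names `emphasize_peaks` and `floor_db` are made up for illustration. The "logarithms and scales" part is reduced to a single dB conversion plus a clamp.

```python
import math

def emphasize_peaks(magnitudes, floor_db=0.0, eps=1e-12):
    """Return each bin's level in dB relative to the RMS of the whole frame.

    Bins at or below the frame RMS are clamped to floor_db, so only the
    bins that stand out from the rest of the frame survive.
    """
    if not magnitudes:
        return []
    # RMS of all bins in this single frame (no history, no rolling average).
    rms = math.sqrt(sum(m * m for m in magnitudes) / len(magnitudes))
    result = []
    for m in magnitudes:
        # Log-scale difference between this bin and the frame RMS.
        # eps avoids log10(0) on silent bins or silent frames.
        rel_db = 20.0 * math.log10((m + eps) / (rms + eps))
        result.append(max(rel_db, floor_db))
    return result
```

Because the reference level is recomputed per frame, a loud passage and a quiet passage are both judged against their own average, which is presumably why this behaves better than a per-band rolling average.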
One slight annoyance is that at 44.1kHz, smallish FFT window sizes (say, 1024 samples) mean that each frequency band is about 43Hz wide (44100 / 1024), which is fine for high frequencies, but is much wider than the gap between adjacent notes at low frequencies. Equally unfortunately, you get frequency data all the way out to 22kHz, when most musically interesting content stops around 4kHz.
So, I'm using an FFT window size of 4096, which narrows each band to about 10.8Hz, and taking the lowest 256 bins. That gets me fairly meaningful data out to about 2.75kHz, while spreading the lower frequencies across more bands.
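The band-width arithmetic is easy to sanity-check. This is a small sketch assuming the usual relationship for a real-input FFT (band width = sample rate / window size) and equal-temperament tuning; the function names are just for illustration.

```python
def bin_width_hz(sample_rate, window_size):
    """Width of one FFT frequency band in Hz."""
    return sample_rate / window_size

def coverage_hz(sample_rate, window_size, bins_kept):
    """Upper frequency reached when keeping only the lowest bins_kept bands."""
    return bins_kept * bin_width_hz(sample_rate, window_size)

def semitone_gap_hz(freq_hz):
    """Distance in Hz from a note to the next semitone up (equal temperament)."""
    return freq_hz * (2 ** (1 / 12) - 1)

if __name__ == "__main__":
    for window in (1024, 2048, 4096):
        print(window, bin_width_hz(44100, window), coverage_hz(44100, window, 256))
    # Compare the band width against note spacing at a low and a high pitch.
    for note_hz in (55.0, 440.0, 3520.0):
        print(note_hz, semitone_gap_hz(note_hz))
```

Comparing the two printouts shows the trade-off: the semitone gap grows with pitch, while the band width stays constant across the whole spectrum.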
The next interesting thing is that different styles of music and mixing can apparently give wildly different visual results:
Volbeat - Heaven Nor Hell
- Kick / Snare / Bass / Guitar all get kind of muddled into the same narrow low frequency bands
- Drum-only sections really jump out though
- Singer's voice shows up amazingly well - interesting harmonic ringing as well (why?)
- Harmonica creates some amazing high frequency sections that wobble back and forth :)
- Orchestras apparently cover a HUGE frequency range with meaningful music frequencies.
- There are peaks across the entire spectrum that clearly correspond with the music.
- Loud and quiet passages are also clearly visible.
- Interestingly, there is almost nothing below ~80Hz, an area clogged in
- The instruments give amazingly clear individual notes
- Occasionally her voice jumps out, but most of the time it gets lost in the other instruments
- Sometimes, though, I can see two parallel frequency bands corresponding with vocal harmony parts, which is cool :)
- There's actually not much OTHER than peaks here; it's a very clean and quiet frequency map.
- Kick / Snare / Guitar / Bass are just kind of jumbled into the low bands
- The artificial harmonics really stand out...
- But there are actually a huge number of peaks that jump out semi-randomly across the entire spectrum. They kind of correspond with the lead guitar, and maybe cymbal / hi-hat hits.
- Unfortunately, the lead guitar kind of gets lost among the noise.
It will probably take a bit of mixing experimentation to figure out what works well here. Actually, I could probably consider dropping everything below 60-80Hz from the RMS calculation... will need to investigate that...
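For the record, the low-cut idea would look something like this. It is an untested-on-real-music sketch: `frame_rms` and `cutoff_hz` are hypothetical names, and it assumes bin `i` sits at `i * bin_width_hz`.

```python
import math

def frame_rms(magnitudes, bin_width_hz, cutoff_hz=0.0):
    """RMS of one frame, ignoring bins below cutoff_hz.

    The skipped bins are only left out of the reference level; they can
    still be displayed and compared against it afterwards.
    """
    kept = [m for i, m in enumerate(magnitudes) if i * bin_width_hz >= cutoff_hz]
    if not kept:
        return 0.0
    return math.sqrt(sum(m * m for m in kept) / len(kept))
```

With a heavy kick and bass, the full-frame RMS is dominated by the bottom few bins, so excluding them should lower the reference level and let quieter parts higher up register as peaks. Whether that actually helps the lead guitar is exactly what needs investigating.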