Hi,
I have a poor-quality recording of a monophonic drum rhythm (a
recording where no two notes occur simultaneously).
I wish to convert this recording to MIDI and then use high-quality
samples to reconstruct the recording.
So far, I have written an attack-detection algorithm which establishes
where each note begins. This works fine as far as I can currently tell.
I therefore now have a MIDI track of the drum rhythm, but the loudness
of each note is the same.
I have also produced an audio file, 128 seconds long, with a single
note from my drum sample library occurring every second. The notes
rise in amplitude over the course of the recording. This is achieved
by stepping the MIDI loudness (velocity) value from 0 to 127.
I now wish to measure the power of each note in the original
recording as well as of each note in the sample recording. That way, I
can match each note in the original recording with the note in the
sample recording that has the same loudness. This in turn allows
me to establish the MIDI loudness value for each note.
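A minimal sketch of the matching step described above, in Python. It assumes that both recordings have been measured with the same level measure and brought to comparable gain, and that the calibration levels rise with velocity; `nearest_velocity` and the linear `calibration` curve are hypothetical names and data used only for illustration.

```python
import bisect

def nearest_velocity(level, calibration_levels):
    """Return the MIDI velocity whose calibration level is closest to `level`.

    calibration_levels[v] is the measured level of the sample played at
    velocity v; the list is assumed to be sorted in non-decreasing order.
    """
    i = bisect.bisect_left(calibration_levels, level)
    if i == 0:
        return 0
    if i == len(calibration_levels):
        return len(calibration_levels) - 1
    # Pick whichever neighbour is closer to the measured level.
    before, after = calibration_levels[i - 1], calibration_levels[i]
    return i if (after - level) < (level - before) else i - 1

# Hypothetical calibration curve: level grows linearly with velocity.
calibration = [v / 127 for v in range(128)]

# Levels measured in the original recording (made-up values).
velocities = [nearest_velocity(x, calibration) for x in (0.0, 0.25, 2.0)]
```

Levels below or above the calibration range are clamped to velocity 0 or 127. A real sample library is unlikely to have a linear level curve, which is why the lookup works from measured levels rather than from a formula.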
I wish to establish all this using an FFT, which I have also used for
my attack-detection algorithm. Would I be correct in saying that, for
my purposes, it would be enough to establish the power at the attack?
That is, to calculate the power at, and immediately after, the attack
point for each note, and then to match notes in each recording based
on this power value. My FFT window is 2048 samples long, with a
44100 Hz sampling rate.
Any suggestions?
Thanks in advance,
Barry.
Don Pearce - 23 Jul 2007 18:27 GMT
How have you established attack using an FFT? The signal needs to be
in the time domain for that - the frequency domain tells you nothing
about attack. The same goes for establishing the velocity. You need to
measure the first peak level after your established attack moment -
that will give you a value proportional to velocity.
So forget the FFT - everything you need for this job happens in the
time domain.
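A minimal Python sketch of the time-domain measurement suggested here, assuming the onset positions are already known as sample indices. It takes the largest absolute sample value in a short window after each onset, which is a simplification of "first peak"; the function name, the 10 ms search window and the synthetic test signal are all assumptions made for illustration.

```python
import math

def peak_after_onset(samples, onset, search_len):
    """Largest absolute sample value in samples[onset : onset + search_len]."""
    window = samples[onset:onset + search_len]
    return max((abs(s) for s in window), default=0.0)

SAMPLE_RATE = 44100

def burst(amp, n=2000):
    """Synthetic decaying 200 Hz burst standing in for a drum hit."""
    return [amp * math.exp(-i / 400) * math.sin(2 * math.pi * 200 * i / SAMPLE_RATE)
            for i in range(n)]

# Two hits, the second at half the amplitude of the first.
signal = [0.0] * 1000 + burst(0.8) + [0.0] * 1000 + burst(0.4)
onsets = [1000, 4000]

# Search roughly 10 ms (441 samples) after each onset.
peaks = [peak_after_onset(signal, t, 441) for t in onsets]
```

The ratio of the two peaks follows the ratio of the hit amplitudes, which is the property needed to rank notes by loudness. The search window has to be short enough not to reach the next note.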
d

Pearce Consulting
http://www.pearce.uk.com
bg_ie@yahoo.com - 24 Jul 2007 08:27 GMT
I don't agree with your comments. When you take the FFT of a signal
you are not losing any information, unless you only consider the
magnitude spectrum. Having said that, I am only using the magnitude
spectrum... The algorithm I use to find the attack points uses a
moving FFT window, so it has both a time and a spectral dimension.
For each FFT power spectrum calculated, I subtract from the power in
each bin the power in the same bin of the previous spectrum. I then
sum the result. This technique is called Spectral Flux Detection -
http://www.dafx.ca/proceedings/papers/p_133.pdf
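A rough Python sketch of the frame-to-frame calculation described above. Two things here are assumptions rather than details from the post: only positive bin differences are summed (a common variant of spectral flux), and no window function is applied. The small recursive FFT is there only to keep the example self-contained, and the short frame length keeps it fast; the post uses 2048-sample frames.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def power_spectrum(frame):
    """Power in bins 0..N/2 of an unwindowed frame."""
    spec = fft(frame)
    return [abs(c) ** 2 for c in spec[: len(frame) // 2 + 1]]

def spectral_flux(samples, frame_len, hop):
    """Per-frame sum of positive bin-wise power increases (assumed variant)."""
    flux, prev = [], None
    for start in range(0, len(samples) - frame_len + 1, hop):
        cur = power_spectrum(samples[start:start + frame_len])
        if prev is None:
            flux.append(0.0)  # no previous frame to compare against
        else:
            flux.append(sum(max(c - p, 0.0) for c, p in zip(cur, prev)))
        prev = cur
    return flux

# Synthetic input: 256 samples of silence, then a steady sine burst.
samples = [0.0] * 256 + [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
flux = spectral_flux(samples, frame_len=64, hop=32)
onset_frame = flux.index(max(flux))
```

With this input the flux is zero during the silence, and its largest value falls on a frame that overlaps the start of the burst. Multiplying `onset_frame` by the hop size gives an approximate onset position in samples.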
Greg Locock - 24 Jul 2007 09:33 GMT
bg_ie@yahoo.com wrote in news:1185262034.640313.150230
@r34g2000hsd.googlegroups.com:
> I don't agree with your comments. When you find the FFT of a signal
> you are not losing any information, unless you only consider the
[quoted text clipped - 6 lines]
>
> http://www.dafx.ca/proceedings/papers/p_133.pdf
Gosh, that's a jolly posh name for that approach. It is directly
equivalent to taking the difference in total V^2 between the two
frames of data. Since it works like a moving average, it is a low-pass
filtered plot of power vs time.
However... You've got something that works. That is great. I'd look at
frame length vs beat interval: if you can reliably set your frame
length to be less than one interval, then you can use the sum of the
squared bin magnitudes to represent the power of the signal.
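The link between the sum of squared bins and the time-domain power is Parseval's theorem: for an unwindowed frame of N samples, the sum of |X[k]|^2 over all N bins equals N times the sum of the squared samples. A small Python check, using a naive DFT and a made-up decaying test frame (both chosen only to keep the example short):

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT, adequate for a short demonstration frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_time(frame):
    """Mean squared sample value of the frame."""
    return sum(s * s for s in frame) / len(frame)

def power_freq(frame):
    """The same quantity computed from all N DFT bins (Parseval)."""
    n = len(frame)
    return sum(abs(c) ** 2 for c in dft(frame)) / (n * n)

# Made-up decaying oscillation standing in for one frame of a drum note.
frame = [math.exp(-t / 20) * math.sin(0.3 * t) for t in range(64)]
p_t, p_f = power_time(frame), power_freq(frame)
```

The two values agree to within floating-point error. The scaling changes if a window function is applied before the transform or if only half of the spectrum is summed, so power values should be compared only between frames processed the same way.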
But I think you'd be much better off working in the time domain and
using the height of the highest peak immediately after the start of the
note. Easy to do on a case by case basis, very hard to automate.
Cheers
Greg Locock