how perceptually relevant is phase in the harmonics of musical tones?
robert bristow-johnson - 18 Jun 2005 05:27 GMT
i just got the latest issue of the JAES and i'm a little bit disgusted. so i thought i might ask for a straw poll.
given a quasi-periodic musical tone:
        N
x(t) = SUM{ r_n(t) * cos(n*w0*t + phi_n(t)) }
       n=1
and a synthesized approximation to it:
        N
y(t) = SUM{ r_n(t) * cos(n*w0*t + theta_n(t)) }
       n=1
assuming that you are not listening to the two tones simultaneously, how perceptually relevant do you think that differences between phi_n(t) and theta_n(t) are?
"quasi-periodic" means that r_n(t) (which is >= 0) and phi_n(t) and theta_n(t) are all slowly varying functions of time. all are bandlimited to much less than w0. stated differently,
|(d/dt)r_n(t)| << w0*r_n(t), |(d/dt)phi_n(t)| << w0, and |(d/dt)theta_n(t)| << w0 .
just asking for off the cuff judgements. expert judgements (that are supported) are welcome, too.
 Signature r b-j rbj@audioimagination.com
"Imagination is more important than knowledge."
Ethan Winer - 18 Jun 2005 15:40 GMT
Robert,
> given a quasi-periodic musical tone: <
I'm not a math guy, but in my experience phase is not important. I've played with fairly extreme amounts of phase shift on various musical sources, and the only time I find it audible is while it is changing. Here's an article you may find interesting:
www.ethanwiner.com/phase.html
--Ethan
robert bristow-johnson - 18 Jun 2005 17:04 GMT
>> given a quasi-periodic musical tone: <
>
> I'm not a math guy, but in my experience phase is not important. I've played
> with fairly extreme amounts of phase shift on various musical sources, and
> the only time I find it audible is while it is changing.
that's important. changing phase is detuning the harmonic frequency from its harmonic value. that's half of the point i've been making about the importance of preserving phase when possible. the other half is that non-linear stages after the synthesis will treat "peaky" signals differently than non-peaky signals.
> Here's an article you may find interesting:
>
> http://www.ethanwiner.com/phase.html
i took a look at it.
here is an extreme case that they may not have tried, Ethan. what if you pass music (either polyphonic or a single note) through an all-pass filter where the internal delay element is, say, a second or half-second long (and the feedback factor, say, about 0.9 or 0.95). i would find that very difficult to believe that people would not hear the difference.
for the math guys, this is what that APF would look like (you need to view with a monospaced font):
                  .-------->[-p]--------->(+)------> y(t)
                  |                        ^
                  |                        |
x(t) ----->(+)----'--->[delay of T]---->---|
            ^                              |
            |                              |
            '-------------[p]<-------------'
H(s) = Y(s)/X(s) = (e^(-sT) - p)/(1 - p*e^(-sT))
this filter changes nothing but phase. if T = 0.5 second and p = 0.95, i'll bet anything people will hear its effect.
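As a rough check of the "nothing but phase" claim, here is a minimal Python sketch (standard library only) that evaluates the H(s) above at s = j*2*pi*f. The function names, the probe frequencies, and the closed-form group delay T*(1 - p^2)/(1 - 2*p*cos(w*T) + p^2), which is derived here from that H(s), are illustrative assumptions and not part of the original post.

```python
import cmath
import math

def allpass_response(freq_hz, T=0.5, p=0.95):
    """H(s) = (e^{-sT} - p) / (1 - p*e^{-sT}) evaluated at s = j*2*pi*freq_hz."""
    d = cmath.exp(-1j * 2.0 * math.pi * freq_hz * T)  # the delay term e^{-sT}
    return (d - p) / (1.0 - p * d)

def allpass_group_delay(freq_hz, T=0.5, p=0.95):
    """Group delay in seconds, -d(phase)/d(omega), in closed form (derived, assumed)."""
    theta = 2.0 * math.pi * freq_hz * T
    return T * (1.0 - p * p) / (1.0 - 2.0 * p * math.cos(theta) + p * p)

# Probe a few frequencies around A440 (arbitrary choice for illustration).
for f in (440.0, 440.25, 440.5, 441.0):
    h = allpass_response(f)
    print(f"{f:7.2f} Hz  |H| = {abs(h):.6f}  "
          f"group delay = {allpass_group_delay(f):7.3f} s")
```

With T = 0.5 and p = 0.95 the magnitude stays at 1, while the group delay swings from about 19.5 seconds at multiples of 2 Hz down to roughly 13 ms halfway between them. That frequency-dependent smearing is one way to see why this phase-only filter would be expected to be audible on music.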
Ethan Winer - 19 Jun 2005 15:00 GMT
Robert,
> non-linear stages after the synthesis will treat "peaky" signals differently than non-peaky signals. <
The key here is "nonlinear." As soon as you introduce gross distortion into the signal path, then all bets are off as to what else besides the distortion might be audible. For example, there's the Orban radio processor device (sorry, I can't recall the exact model name) that shifts phase enough to "rotate" the waveform. This reduces peaks on typical male voices, getting more average signal level to the transmitter without distortion or the need for severe compression/limiting. The phase shift itself is not audible, but distortion at the transmitter sure would be!
> here is an extreme case that they may not have tried <
And now the word "extreme" is the key. Okay, sure, if you add so much phase shift that some of the music is delayed a full second that's audible. Arny Krueger has such examples at his pcabx.com web site. But that's unrelated to anything you'll encounter in a "normal" circuit. My focus is always on the practical, and in that sense phase shift is simply not a problem. However, the *frequency response errors* caused by phase shift are another matter entirely.
> "Imagination is more important than knowledge." <
Except when assessing the audibility of phase shift! :->)
--Ethan
robert bristow-johnson - 19 Jun 2005 19:28 GMT
>> non-linear stages after the synthesis will treat "peaky" signals
>> differently than non-peaky signals. <
>
> The key here is "nonlinear." As soon as you introduce gross distortion into
> the signal path, then all bets are off
well, the issue (an ongoing disagreement between Andrew Horner and myself) is about whether or not phase is a parameter where some attempt be made to preserve it or not, _in_the_context_of_musical_synthesis_. these synthesized musical notes are coming out of amplifiers and loudspeakers. i don't know of many without some non-linearity. so the question is under those conditions, how much phase error is noticeable?
>> here is an extreme case that they may not have tried <
>
> And now the word "extreme" is the key. Okay, sure, if you add so much phase
> shift that some of the music is delayed a full second that's audible.
so at what point is it not audible? 0.1 second? 0.01? this is the kernel of the question that i was asking. to begin a paper with just a blanket statement that phase differences are inaudible so we can manipulate phase to be whatever we want and the listeners won't hear it, is wrong under some different conditions. what qualifiers are needed to make such a statement accurate?
Ethan Winer - 20 Jun 2005 19:12 GMT
Robert,
> whether or not phase is a parameter where some attempt be made to preserve it or not, _in_the_context_of_musical_synthesis_. <
In the context of designing a synthesizer - additive, I assume? - I'd think phase is even less important than for normal sound reproduction. After all, if you're creating sounds synthetically, then how is "preserve" even a factor? What's to preserve?
Since you're convinced this is important, why not just add a phase knob (or control if software) to your synthesizer, and let your users adjust it if they want to.
--Ethan
bert stoltenborg - 20 Jun 2005 23:28 GMT
correct me when i'm wrong, but a phase knob or eq on a synth sound behaves as a minimum phase device. I thought we were speaking about fe phase distortion in a x-over between drivers, which behaves different (influences transients in the x-over area), isn't it, Ethan?
bert
robert bristow-johnson - 21 Jun 2005 05:01 GMT
> correct me when i'm wrong, but a phase knob or eq on a synth sound
> behaves as a minimum phase device.
> I thought we were speaking about fe phase distortion in a x-over
> between drivers, which behaves different (influences transients in the
> x-over area), isn't it, Ethan?
i'm not Ethan, but this isn't exactly what i meant to be bringing up. i sure hope that no one here ignores phase values for different drivers in a loudspeaker or between different loudspeakers in a stereo or multichannel system. i think everybody here agrees that applying some all-pass filter to your left speaker but not to your right will change something that we can hear.
my issue is about whether we can hear the difference between two harmonic or quasi-periodic musical notes that are played one right after another and are identical except the latter has different slowly varying functions of time for the phase component of all of the harmonics than for the former. even though Ghost complained about this representation, the two tones are:
        N
x(t) = SUM{ r_n(t) * cos(n*w0*t + phi_n(t)) }
       n=1
and
        N
y(t) = SUM{ r_n(t) * cos(n*w0*t + theta_n(t)) }
       n=1
w0 is the fundamental frequency of the tone, w0 would be 2*pi*440 Hz if the note is the A just above middle C.
r_n(t) ( >= 0 ) is the time-variant magnitude envelope for the nth harmonic for both x(t) and y(t). phi_n(t) is the time-variant phase for the nth harmonic for x(t), theta_n(t) is the time-variant phase for the nth harmonic for y(t). all are slowly moving functions of time (i.e. they are bandlimited to much less than w0).
x(t) is identical to y(t) *except* that the phase functions for the harmonics, phi_n(t) and theta_n(t) are not equal to each other (but they are still both slowly moving functions of time, otherwise x(t) and y(t) would not be nearly periodic). the magnitude of each harmonic, r_n(t), is the same for x(t) and y(t).
i think what Ethan is saying (please correct me if i am misinterpreting, Ethan) is that theta_n(t) can be different than phi_n(t) and we will not hear that difference. (assuming that theta_n and phi_n are approximately equally bandlimited.)
in article ZoSdnUaS55qcmirfRVn-sQ@giganews.com, Ethan Winer at ethanw at ethanwiner dot com wrote on 06/20/2005 14:12:
>> whether or not phase is a parameter where some attempt be made to preserve
>> it or not, _in_the_context_of_musical_synthesis_. <
>
> In the context of designing a synthesizer - additive, I assume?
pretty close. it's wavetable synthesis (not to be confused with the PCM sampling that E-mu keyboards or the zillion of "wavetable" soundcard chips do) or sometimes called "group additive synthesis" in the lit. the restriction that wavetable synthesis has that additive synthesis does not is that all of the partials or overtones of wavetable synthesis must be very nearly harmonic, very nearly an integer multiple of some common fundamental whereas additive synthesis can have non-harmonic partials. wavetable synthesis might not work so good for bells and such (except if you could group the partials into two or three harmonic groups, then two or three simultaneous wavetable synths could do such a bell). for normal wavetable synthesis, the deviation of a partial's instantaneous frequency from the exact harmonic value is equal to the time derivative of the phase function (that's another reason i am unhappy that someone would publish a zillion papers all beginning with the assumption that phase just does not matter).
> - I'd think
> phase is even less important than for normal sound reproduction. After all,
> if you're creating sounds synthetically, then how is "preserve" even a
> factor? What's to preserve?
the promise of additive synthesis was that it could realistically recreate an arbitrary sound because it would accurately (to some finite degree) recreate each sinusoidal partial component of that given sound. in the case of many musical tones (after the attack transient) it is known that the partials are all _pretty_much_ harmonic (not perfectly harmonic, e.g. the piano has its higher harmonics progressively sharper and sharper). if the partials are nearly exactly harmonic, wavetable synthesis is sufficient (and computationally cheaper) and the small deviations of some partial from its exact harmonic value can be dealt with by slowly changing the phase. but if you throw the phase away to start with, it's kinda hard to do that.
> Since you're convinced this is important, why not just add a phase knob (or
> control if software) to your synthesizer, and let your users adjust it if
> they want to.
how about N phase knobs, one for each harmonic?
my issue is, we got here a synthesis method. sure the synthesizer could be used solely to generate cool, but unnatural sounds, and that's fine. anything goes w.r.t. phase. but often, a measure of merit of a synthesis method is how realistically it can generate a sound of some existing (and usually acoustic) musical instrument. a good example of such is the "physical modeling synthesis".
wavetable synthesis is cheap. much cheaper than physical modeling or additive synthesis. a little more expensive than 2-operator Chowning FM synthesis. but even if it's cheap, wavetable synthesis can generate waveforms that match the waveforms generated by additive synthesis *but* *only* for the case where all of the partials (or those with any decent amount of energy) have frequencies that are very nearly harmonic, that have frequencies that are very nearly integer multiples of some common fundamental frequency.
so for this class of musical tones or notes, we try to use wavetable synthesis to (re)create some tones, there will be some data reduction for the representation of those slowly moving envelopes and phase functions (just as there would be for additive synthesis), and we wish to minimize or at least constrain the effect of that data reduction on how (bad) the note sounds. one way to reduce the data is to simply toss it all away. if we toss away the envelopes r_n(t) (set them to zero or to some totally unrelated functions of time), that would definitely have a noticeable effect on the sound of the tone, so we wouldn't do that!
but it is a lot easier for some to do the same to the phase (with no justification offered) and i'm still wondering how free we are to do that if we're trying to persuasively emulate another musical instrument.
Ethan Winer - 21 Jun 2005 15:53 GMT
Robert,
> i think everybody here agrees that applying some all-pass filter to your left speaker but not to your right will change something that we can hear. <
Yes, of course.
> are identical except the latter has different slowly varying functions <
Sure, phase shift is audible *if it is changing* over time. But for two otherwise equal synthesized tones, I doubt you can hear a difference if the harmonics of one are shifted differently than the other.
> i think what Ethan is saying (please correct me if i am misinterpreting, Ethan) is that theta_n(t) can be different than phi_n(t) and we will not hear that difference. <
I wish I understood what that meant.
> it's wavetable synthesis (not to be confused with the PCM <
So the basic waves are in RAM/ROM, rather than generated on the fly, but are otherwise used to create new waves additively?
> the promise of additive synthesis was that it could realistically recreate an arbitrary sound <
Sure, given enough computing power.
> the small deviations of some partial from its exact harmonic value can be dealt with by slowly changing the phase. <
Okay, now I see what you're getting at. In traditional synths this is done by changing the "clock out" frequency a bit to detune that one sample on playback. So this is the equivalent of "slowly changing the phase" you mentioned. In that case it's not *static* phase shift anymore, and so could be audible. If it changes the pitch (which it will), then it's definitely audible.
--Ethan
bert stoltenborg - 21 Jun 2005 21:20 GMT
> If it changes the pitch (which it will), then it's definitely audible.
Amen :-)
robert bristow-johnson - 22 Jun 2005 06:33 GMT
>> are identical except the latter has different slowly varying functions <
> [quoted text clipped - 7 lines]
>
> I wish I understood what that meant.
it means one tone, x(t), has a possibly time-varying phase of phi_n(t) for the nth harmonic and the other, y(t), has a possibly time-varying phase of theta_n(t) for the nth harmonic. if theta_n(t) - phi_n(t) is a constant, then the phase difference between each harmonic of the two tones is constant and i think you are saying that is not audible. i might agree in most normal cases, but with that extreme all-pass filter example, we know there are some cases where a *constant* phase shift (constant in time, the phase shift is different for different frequencies) is added and the change *was* audible. i do not know how far back to pare that extreme example back so that we cannot hear its effect, but because of it i do not see how anyone can accurately claim, out of hand, that adding constant phase shifts to different frequency components is inaudible. a counter-example quickly disproves that.
however i do not dispute that there are some cases where these different waveforms sound the same. a quick example is the bandlimited square wave:
x(t) = cos(w0*t) - 1/3 cos(3*w0*t) + 1/5 cos(5*w0*t) - ...
and its "peakier" counterpart:
y(t) = cos(w0*t) + 1/3 cos(3*w0*t) + 1/5 cos(5*w0*t) + ...
in a perfectly linear sound system, it's unlikely that any could hear the difference, but with a little non-linearity, i know i can (i've tried it with MATLAB).
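A minimal sketch of that comparison, in Python rather than MATLAB and using only the standard library. The choice of 8 odd harmonics, 1024 samples per period, and a hard clip at 1.0 standing in for "a little non-linearity" are all assumptions made for illustration; the post does not say which nonlinearity was actually tried.

```python
import math

def tone(signs, num_samples=1024):
    """One period of sum_k signs[k] * cos((2k+1)*w0*t) / (2k+1)."""
    out = []
    for i in range(num_samples):
        phase = 2.0 * math.pi * i / num_samples
        out.append(sum(s * math.cos((2 * k + 1) * phase) / (2 * k + 1)
                       for k, s in enumerate(signs)))
    return out

def harmonic_magnitude(signal, n):
    """Magnitude of the n-th harmonic, from a one-period DFT."""
    m = len(signal)
    re = sum(v * math.cos(2.0 * math.pi * n * i / m) for i, v in enumerate(signal))
    im = sum(v * math.sin(2.0 * math.pi * n * i / m) for i, v in enumerate(signal))
    return 2.0 * math.hypot(re, im) / m

def hard_clip(signal, limit):
    """Assumed stand-in nonlinearity: clamp every sample to [-limit, +limit]."""
    return [max(-limit, min(limit, v)) for v in signal]

num_partials = 8  # harmonics 1, 3, ..., 15 (assumed)
x = tone([(-1) ** k for k in range(num_partials)])  # alternating signs: square-like
y = tone([1] * num_partials)                        # all positive: "peakier"

print("peak of x:", max(abs(v) for v in x), " peak of y:", max(abs(v) for v in y))
for n in (1, 3, 5):
    print(n,
          harmonic_magnitude(x, n), harmonic_magnitude(y, n),
          harmonic_magnitude(hard_clip(x, 1.0), n),
          harmonic_magnitude(hard_clip(y, 1.0), n))
```

Before clipping, the two tones have the same harmonic magnitudes (1, 1/3, 1/5, ...) and differ only in phase. With the clip at 1.0, the square-like x(t) stays below the limit and passes untouched, while the peakier y(t) is clipped and its harmonic magnitudes shift, so the two are no longer the same signal apart from phase.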
now if theta_n(t) - phi_n(t) is not constant in time, we know that the instantaneous frequency of the nth harmonic of x(t) (which is n*w0 + (d/dt)phi_n(t)) is going to be somewhat different than the instantaneous frequency of the nth harmonic of y(t) (which is n*w0 + (d/dt)theta_n(t)). and it looks like there is some agreement that
>> it's wavetable synthesis (not to be confused with the PCM <
>
> So the basic waves are in RAM/ROM, rather than generated on the fly, but are
> otherwise used to create new waves additively?
yup. just like the sine wave in a digitally-controlled oscillator (DCO) except the waveforms are not normally a single cycle of a sine (they're a collection of single cycles of some other waveshape) and there is some mechanism to change or evolve the waveform shape as time progresses after the "note on" event.
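The oscillator described here can be sketched roughly as follows, in Python with the standard library only. This is an illustration under stated assumptions, not the poster's implementation: the table size, the linear interpolation, and in particular the linear crossfade between two tables as the "mechanism to evolve the waveform" are all hypothetical choices, as are the function names.

```python
import math

def build_table(amps, phases, size=2048):
    """One cycle of sum_n amps[n-1] * cos(n*theta + phases[n-1]), summed in advance."""
    return [sum(a * math.cos((n + 1) * 2.0 * math.pi * i / size + ph)
                for n, (a, ph) in enumerate(zip(amps, phases)))
            for i in range(size)]

def render(table_a, table_b, f0_hz, sample_rate, num_samples):
    """Phase-accumulator readout with linear interpolation, crossfading A -> B."""
    size = len(table_a)
    step = f0_hz * size / sample_rate   # table positions advanced per output sample
    out, pos = [], 0.0
    for k in range(num_samples):
        i = int(pos)
        frac = pos - i
        j = (i + 1) % size
        a = table_a[i] + frac * (table_a[j] - table_a[i])
        b = table_b[i] + frac * (table_b[j] - table_b[i])
        mix = k / max(1, num_samples - 1)   # assumed evolution: 0 at note-on, 1 at end
        out.append((1.0 - mix) * a + mix * b)
        pos = (pos + step) % size
    return out

# Two single-cycle waveshapes with equal harmonic amplitudes but different phases.
amps = [1.0, 0.5, 0.25]
start = build_table(amps, [0.0, 0.0, 0.0])
end = build_table(amps, [0.0, 0.4, 0.8])
note = render(start, end, f0_hz=440.0, sample_rate=44100.0, num_samples=4410)
print(len(note), min(note), max(note))
```

The per-sample cost is a couple of table lookups regardless of how many harmonics went into each table, which is the sense in which the harmonics are added up in advance.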
>> the promise of additive synthesis was that it could realistically recreate
>> an arbitrary sound <
>
> Sure, given enough computing power.
and wavetable synthesis takes less real-time computing since the harmonics are all added up in advance. but it isn't the same as "sampling synthesis" which is about as much synthesis as it is when i hit the "play" button on my CD player.
>> the small deviations of some partial from its exact harmonic value can be
>> dealt with by slowly changing the phase. <
>
> Okay, now I see what you're getting at.
it's one point. the other was the extreme phase shift APF.
> In traditional synths this is done
> by changing the "clock out" frequency a bit to detune that one sample on
> playback.
well, you still might do that in the case of wavetable synthesis to get the synth to play different notes from a single collection of waveform tables.
> So this is the equivalent of "slowly changing the phase" you
> mentioned. In that case it's not *static* phase shift anymore, and so could
> be audible. If it changes the pitch (which it will), then it's definitely
> audible.
the idea is that some harmonics can be detuned up (from their exact harmonic frequency) while others are detuned down and still others are detuned up even farther in the same musical tone. that detuning of the harmonics might be the result of natural or mechanical phenomena and if all of the partials are "flattened" or "deadened" or whatever term you might use to indicate that the partials are strictly set to precisely their integer harmonic value (which is what happens if you set those phases to a constant such as zero), then the "live" note will sound a little dead. again, it is well known that piano "harmonics" or partials get sharper and sharper above their exact integer harmonic frequency value as you get to higher and higher harmonics. if they're not too far off, this can be accommodated with changing the phases (at different rates) in wavetable synthesis.
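The relation used in this thread, instantaneous frequency of harmonic n = n*w0 + (d/dt)phi_n(t), can be sketched numerically in Python (standard library only). The function names and the example of sharpening the 8th harmonic of A440 by 5 cents are illustrative assumptions, not measured piano data.

```python
import math

def phase_slope_for_detune(n, f0_hz, cents):
    """Phase rate d(phi_n)/dt in rad/s that shifts harmonic n by `cents`."""
    target_hz = n * f0_hz * 2.0 ** (cents / 1200.0)
    return 2.0 * math.pi * (target_hz - n * f0_hz)

def instantaneous_frequency_hz(n, f0_hz, phi, t, dt=1e-6):
    """(n*w0 + d(phi)/dt) / (2*pi), using a central-difference derivative."""
    dphi_dt = (phi(t + dt) - phi(t - dt)) / (2.0 * dt)
    return n * f0_hz + dphi_dt / (2.0 * math.pi)

f0 = 440.0
n = 8
slope = phase_slope_for_detune(n, f0, cents=5.0)  # assumed detune amount
phi = lambda t: slope * t                          # linearly ramping phase

print("partial frequency (Hz):", instantaneous_frequency_hz(n, f0, phi, t=0.25))
print("|d(phi)/dt| / w0:", abs(slope) / (2.0 * math.pi * f0))
```

A steadily ramping phase is the same thing as a fixed frequency offset, so each harmonic can be given its own ramp rate. The second printed value is the ratio that the quasi-periodic condition |(d/dt)phi_n(t)| << w0 asks to be small; setting the phase to a constant makes the slope zero and pins the partial to exactly n*w0.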
as bert says, "amen". we've beaten this to death. i don't want to write another letter to the editor at AES about this, but this author just does not seem to "get it" and i'm at a loss why.
Ethan Winer - 22 Jun 2005 16:37 GMT
Robert,
> "sampling synthesis" which is about as much synthesis as it is when i hit the "play" button on my CD player. <
ROF,L. Good point, and agreed. Next we can make fun of people who buy pre-recorded loops and think stringing them together in a DAW program makes them a "composer."
--Ethan
Ethan Winer - 21 Jun 2005 15:41 GMT
Bert,
> I thought we were speaking about fe phase distortion in a x-over between drivers <
Phase shift in a crossover can be audible, but only for frequencies output by both drivers around the crossover frequency. And then, what's audible is the change in frequency response, not the phase shift itself. But was Robert even talking about loudspeakers?
--Ethan
TheGhost - 18 Jun 2005 22:12 GMT
> i just got the latest issue of the JAES and i'm a little bit
> disgusted. so i thought i might ask for a straw poll.
> [quoted text clipped - 23 lines]
> just asking for off the cuff judgements. expert judgements (that are
> supported) are welcome, too.
Your question is ill-defined and absurd. Furthermore, scientific truth is not established/discovered by taking a poll among those who have nothing more to offer than ignorant/ininformed opinion. As for the so-called "experts," the only thing that they are able to agree upon, at best, is that they disagree on virtually every aspect of any issue.
robert bristow-johnson - 18 Jun 2005 23:12 GMT
> Your question is ill-defined and absurd.
well, Gary, you don't scare me. it was pretty well defined. i asked if:
given a quasi-periodic musical tone:
        N
x(t) = SUM{ r_n(t) * cos(n*w0*t + phi_n(t)) }
       n=1
and a synthesized approximation to it:
        N
y(t) = SUM{ r_n(t) * cos(n*w0*t + theta_n(t)) }
       n=1
assuming that you are not listening to the two tones simultaneously, how perceptually relevant do you think that differences between phi_n(t) and theta_n(t) are?
"quasi-periodic" means that r_n(t) (which is >= 0) and phi_n(t) and theta_n(t) are all slowly varying functions of time. all are bandlimited to much less than w0. stated differently,
|(d/dt)r_n(t)| << w0*r_n(t), |(d/dt)phi_n(t)| << w0, and |(d/dt)theta_n(t)| << w0 .
put another way, if theta_n(t) is some slowly varying function that has nothing to do with phi_n(t), can a significant portion of people hear the difference between x(t) and y(t) over identical equipment when not played back simultaneously?
is that well enough defined? feel free to define it more precisely or rigorously or concisely, Gary.
> Furthermore, scientific truth is not established/discovered by taking a poll
i wasn't asking for (or expecting) definitive scientific truth. that takes some real research and/or psychoacoustic experimentation. i was asking people who might have had some experience with it, what their experience might be or what they might have seen in the lit that has relevance.
> among those who have nothing more to offer than ignorant/ininformed opinion.
i'm sure these ininformed ignoramuses here just love you, Gary. you're so popular and well-respected. please forgive me for treading on your valuable and exclusive territory here.
> As for the so-called "experts," the only thing that they are able to agree
> upon, at best, is that they disagree on virtually every aspect of any issue.
i'm finding that out. that's why i ask for more opinions and ideas for what to look at.
The Ghost - 19 Jun 2005 00:10 GMT
snip...snip
>... it was pretty well defined. i asked
snip....snip...
> is that well enough defined?
No, it is not. You are asking a question involving a subjective difference limen having ill-defined and totally unbounded associated physical parameters.
> feel free to define it more precisely or rigorously or concisely,
It's your question and your interest, not mine.
snip...snip...
> i wasn't asking for (or expecting) definitive scientific truth. that
> takes some real research and/or psychoacoustic experimentation. i was
> asking people who might have had some experience with it, what their
> experience might be or what they might have seen in the lit that has
> relevance.
I doubt that anyone here or anyone anywhere can provide a definite answer to the ill-defined situation that you posed. What I don't understand is why you don't simply perform the experiment yourself, especially in view of its simplicity and the relatively inexpensive hardware and software tools that are currently available.
> i'm sure these ininformed ignoramuses here just love you, Gary.
> you're so popular and well-respected. please forgive me for treading
> on your valuable and exclusive territory here.
It's too bad that you have run out of intellectually substantive arguments and have to resort to ad-hominem attacks.....an unfortunate trait that you've obviously picked up from your buddy Bob Cain.
The Ghost - 19 Jun 2005 01:00 GMT
> i'm finding that out. that's why i ask for more opinions and ideas
> for what to look at.
Perhaps you should consider posting your question at: http://www.auditorymodels.org/
Most of the people there are involved in physiological research of the auditory system, but one never knows who in the psychophysics community may be lurking.
Alternatively, you may want to consider contacting Eckard Blumschein http://home.arcor.de/eckard.blumschein/
You and Eckard would be a good match since he and I disagree on almost every aspect of the mechanical electro-physiological functioning of the auditory system.
bert stoltenborg - 19 Jun 2005 12:37 GMT
I agree with Ethan. Experiments with digital impulse correction of speakers indicated that correcting the impulse (amplitude and phase) gives audible differences, but when the phase is corrected and the amplitude is left unchanged, no audible differences were heard. This is supported by research by Floyd Toole (e.g. in Borwick's Loudspeaker and Headphone Handbook) and research on headphones (C.A. Poldy), where stationary phase differences were hardly or not detected.
Bert