Last Update March 24, 2014
DG at Bash Bish falls 7/22/11
photo: Masumi per Rostad
The big news is the addition of a bit of organization to the top of the page. You will find a link to my YouTube channel, which has three videos on it. The first chronologically was made at a talk to the local section of the Acoustical Society of America. The talk was intended as an introduction to a documentary about the organ builder C.B. Fisk. The introduction tries to make connections between the perception of organ music and how it relates to the acoustics of the venue.
The sound in both talks comes from a recorder in the audience. The organ talk was recorded from my avatar, a fully anthropomorphic copy of my pinnae, ear canals, and eardrums. You will hear exactly what I would have heard if I were in the audience. A technical problem setting up the second talk caused this system to fail, but the reliable Zoom H2 worked just fine. I always record from the rear microphones, which sound much better than the front ones because they are angled at 120 degrees instead of 90 degrees.
The second talk goes into much more detail about the physics of sound perception and its relevance to Halls, Operas, and Classrooms. It is divided into two sections, each about 40 minutes long. The first section covers the theory with several important audio examples. The second section applies the theory to a great many current concert venues, and a classroom. This is followed by a very interesting question and answer period, where I get a chance to better explain the workings of my acoustic measure called LOC.
I will try to get the powerpoints from these talks up here soon. It is nice to be able to download them with the audio examples – although most of these are available in the previous talks below.
There are also links to a section of biographies – only one so far – and a section with links to some of my music and hall reviews. I think the hall reviews are still quite interesting. I hope you enjoy them.
The next two links are the presentations to the joint ASA-ICA meeting in Montreal and the ISRA in Toronto. The Montreal session was on whether or not it is time to re-visit the ISO3382 analysis methods. The presentation suggests that a revision to the measures for clarity is long overdue, and gives three examples of better ways to do it, and why. The sonic examples are important. They should work if clicked. "What is Clarity and how can it be measured" The Toronto presentation is better in some ways, and is here: "Optimizing loudness, clarity, and engagement in large and small spaces"
There was interest at these meetings in my binaural recordings of concert halls, and in the methods I used to construct the small probe microphones. I updated the slides in the link below on binaural hearing and headphones to include more data on the construction of the microphones. "Binaural Hearing, Ear Canals, and Headphones" I also re-discovered the presentation I gave in Munich in May of 2009 on frequency adaptation. This presentation goes into the details of how the ear localizes sound in the vertical plane, and why this localization fails when timbre is not reproduced exactly. It explains why most headphone sound is perceived inside the head, and why binaural sound can be perceived correctly without head tracking if the recording and the playback are equalized at the eardrum. "Frequency response adaptation in binaural hearing"
One of the startling aspects of my eardrum binaural recordings is the excellent signal to noise ratio. It seems impossible to get such a good result from a 1/4" microphone at the end of a 6cm tube. But if you look in the above link, you will see that the concha and ear canal combine to create a ~20dB peak in the sound pressure at the eardrum just at the frequencies to which we are most sensitive. So the low noise is no surprise at all. Evolution solved the problem for us. Noise from the microphone is audible at higher frequencies where the tube resistance becomes significant. So I apply a bit of noise reduction above 6kHz.
Here are powerpoint slides from a talk I gave at the Lutheries - Acoustique - Musique in Paris a few weeks ago. They will be the basis for a briefer talk at the ASA meeting in Kansas City October 23rd. The title is the same as the title of the talk below I gave in Germany - but I think the talk is better written, and there are new audio examples that should work if clicked. "Pitch, Timbre, Source Separation"
I received a request for an earlier and a bit more complete version of a powerpoint presentation on dummy heads, headphone reproduction, and binaural hearing. I just read the presentation over, and liked it. On 7/18/13 I made a few corrections, and added more information about how to make small probe microphones. The link is here: "Binaural Hearing, Ear Canals, and Headphones"
Yes, it is Fasching - but things in Detmold are still calm. The talk I gave yesterday is here: "Pitch, Timbre, Source Separation, and the Myths of Sound Localization" The slides are more oriented toward loudspeaker reproduction than the slides below, but some of the concepts are the same. The .mp3 of the talk is here: "The recording of the talk in Detmold"
The preprint for the ASA conference in San Diego is similar to, but I hope better than, the one for the IOA conference in Dublin. "The audibility of direct sound as a key to measuring the clarity of speech and music"
The slides are here: "The…" These slides include the audio examples, which should play after a short delay when clicked. If you want to listen to the lecture while looking at the slides you can click here: "The fifteen minute aural presentation" This can take a minute or two to download and start. You might want to right click on the link and download the file before you click on the link for the slides. Another option is to download "www.davidgriesinger.com/Acoustics_Today/talk.zip" and unzip it into an empty directory. This will download all the audio files, and you can play with them as you like.
My latest preprint presents some of the reasons that although current methods of measuring room acoustics correlate to some degree with the perceived quality of the space, they do not predict that quality with reliability. The preprint was written for the Institute of Acoustics conference in Dublin in May of 2011. The title is not very descriptive of the content. As usual, I submitted the title and a preliminary abstract long before the preprint was written, and by that time I found a better way of getting the point across. "THE RELATIONSHIP BETWEEN AUDIENCE ENGAGEMENT AND THE ABILITY TO PERCEIVE PITCH, TIMBRE, AZIMUTH AND ENVELOPMENT OF MULTIPLE SOURCES"
Acoustic quality has been difficult to define, and it is very difficult to measure something you can't define. Fundamentally the ear/brain system needs to 1. separate one or more sounds of interest from a complex and noisy sound field, and 2. identify the pitch, direction, distance, and timbre (and thus the meaning) of the information in each of the separated sound streams. Previous research into acoustic quality has mostly ignored the problem of sound stream separation - the fundamental process by which we can consciously or unconsciously select one or more of a potentially large number of people talking at the same time (the cocktail party effect) or multiple musical lines in a concert. In the absence of separation multiple talkers become babble. Music is more forgiving. Harmony and dynamics are preserved, but much of the complexity (and the ability to engage our interest) is lost. Previous acoustic research has focused on how we perceive a single sound source under various acoustic conditions. Previous research has also concentrated primarily on how sound decays in rooms - on how notes and syllables end. But sounds of interest to both humans and animals pack most of the information they contain in the onset of syllables or notes. It is as if we have been studying the tails of animals rather than their heads.
The research presented in the preprint above shows that the ability to separate simultaneous sound sources into separate neural streams is vitally dependent on the pitch of harmonically complex tones. The ear/brain system can separate complex tones one from another because the harmonics which make up these tones interfere with each other on the basilar membrane in such a way that the membrane motion is amplitude modulated at the frequency of the fundamental of the tone (and several of its low harmonics). When there are multiple sources each producing harmonics of different fundamentals, the amplitude modulations combine linearly, and can be separately detected. Reflections and reverberation randomize the phases of the upper harmonics that the ear/brain depends upon to achieve stream separation, and the amplitude modulations become noise. When reflections are too strong and come too early, separation - and the ability to detect the direction, distance and timbre of individual sources - becomes impossible. But if there is sufficient time in the brief interval before reflections and reverberation overwhelm the onset of sounds, the brain can separate one source from another, and detect direction, distance, and meaning.
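As a rough illustration of the interference described above, here is a minimal Python sketch. It is not the model from the preprint: squaring the signal stands in for whatever envelope detection the ear actually performs, and the function name, sample rate, and choice of harmonics are all assumptions made for the example. It sums a few upper harmonics of one fundamental, as they might fall within a single frequency-selective band, and finds the strongest slow modulation of the result.

```python
import numpy as np

def envelope_modulation_frequency(f0, harmonics, fs=48000, duration=0.5):
    """Sum equal-amplitude harmonics of f0 and return the frequency (Hz) of the
    strongest low-frequency modulation of the squared signal.

    Hypothetical helper for illustration only; squaring is a crude stand-in
    for envelope detection.
    """
    t = np.arange(int(fs * duration)) / fs
    signal = sum(np.cos(2 * np.pi * f0 * k * t) for k in harmonics)
    # Squaring produces difference tones between the harmonics; adjacent
    # harmonics differ by exactly f0, so the envelope is modulated at f0.
    power = signal ** 2
    spectrum = np.abs(np.fft.rfft(power - power.mean()))
    freqs = np.fft.rfftfreq(len(power), 1 / fs)
    # Search only well below the lowest harmonic, where the envelope lives.
    low = freqs < 0.5 * f0 * min(harmonics)
    return freqs[low][np.argmax(spectrum[low])]

# Harmonics 10 to 14 of a 200 Hz fundamental lie between 2000 and 2800 Hz.
print(envelope_modulation_frequency(200.0, range(10, 15)))
```

With harmonics 10 through 14 of a 200 Hz tone, the strongest modulation comes out at the 200 Hz fundamental, even though the signal contains no energy at 200 Hz itself.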
To understand acoustic quality we need to know how and to what degree the information is lost when reflections come too soon. The preprint presents a model for the mechanism by which sources are separated, and suggests a relatively simple metric which can predict from a binaural impulse response whether separation will be possible or not. For lack of a better name, I call this metric LOC - named for the ability to perceive the precise direction of an individual sound source in a reverberant field.
There have been several requests that I put the matlab code for calculating the acoustic measure called LOC on this site. The measure is intended to predict the threshold for localizing speech in a diffuse reverberant field, based on the strength of the direct sound relative to the build-up of reflections in a 100ms window. The name for the measure is a default - If anyone can come up with a better one I would appreciate it. The measure in fact predicts whether or not there is sufficient direct sound to allow the ear/brain to perform the cocktail party effect, which is vital for all kinds of perceptions, including classroom acoustics and stage acoustics.
The formula seems to work surprisingly well for a variety of acoustic situations, both for large and small halls. But there are limitations that need discussion. First, the measure assumes that the speech (or musical notes) have sufficient space between them that reverberation from a previous syllable or note of similar pitch does not cover the onset of the new note. In practice this means that even if LOC is greater than about 2dB, a sound might not be localizable, or sound "Near", if the reverberation time is too long. A lengthy discussion with Eckhard Kahle made clear that the measure will also fail - in the opposite sense - if there is a specular reflection that is sufficiently stronger than the direct sound. This can happen if the direct sound is blocked, or absorbed by audience in front of a listener. In this case the brain will detect the reflection as the direct sound, and be able to perform the cocktail party effect - but will localize the sound source to the reflector. With eyes open a listener is unlikely to notice the image shift.
With these reservations, here is the matlab code. It accepts a windows .wav file, which should be a stereo file of a binaural impulse response with the source on the left side of the head. The LOC code analyzes only the left channel. There is a truncation algorithm that attempts to find the onset of the direct sound. This may fail - so users should check to be sure the answer makes sense. The code also plots an onset diagram, showing the strength of the direct sound and the build-up of reflections. If this looks odd, it probably is. In this version of the code the box plot has a Y axis that starts at zero. The final level of a held note (the total energy in the impulse response) is given the value of 20dB, and both the direct level and the build-up of reflections are scaled to fit this value. The relative rate of nerve firings for both components can then be read off the vertical scale. So - if the direct sound (blue line) is at 12, you know the eventual D/R is -8dB. Rename the .txt file to .m for running in Matlab. "Matlab code for calculating LOC"
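The scaling of the onset diagram can be checked with a few lines of arithmetic. The Python sketch below is only an illustration of that scaling, not a port of the Matlab code: the function name is hypothetical, and the fixed 5ms window used to pick out the direct sound is an assumption standing in for the truncation algorithm. The total energy of the impulse response is assigned 20dB, and the direct level is read relative to it.

```python
import numpy as np

def onset_plot_levels(impulse_response, fs, direct_window_s=0.005):
    """Return (direct_level, d_over_r_db) for one channel of an impulse response.

    direct_level is on a scale where the total energy sits at 20 dB.
    direct_window_s is an assumed window for the direct sound; the impulse
    response is assumed to start at the direct sound's onset.
    """
    energy = np.asarray(impulse_response, dtype=float) ** 2
    n_direct = int(round(direct_window_s * fs))
    e_direct = energy[:n_direct].sum()
    e_total = energy.sum()
    e_reverb = e_total - e_direct
    # Direct sound relative to the final level of a held note (fixed at 20 dB).
    direct_level = 20.0 + 10.0 * np.log10(e_direct / e_total)
    # Direct sound relative to the reflected energy alone.
    d_over_r_db = 10.0 * np.log10(e_direct / e_reverb)
    return direct_level, d_over_r_db

# Toy impulse response: direct sound 8 dB below the total energy.
fs = 48000
ir = np.zeros(4800)
ir[0] = np.sqrt(10 ** -0.8)          # direct sound
ir[1000] = np.sqrt(1 - 10 ** -0.8)   # a single lumped "reverberant" arrival
print(onset_plot_levels(ir, fs))
```

For this toy response the direct level reads 12 on the plot scale, which is 8dB below the total. Measured against the reflected energy alone the ratio works out to about -7.3dB, so the two readings are close whenever the direct sound is a small part of the total.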
My latest work on hearing involves the development of a possible neural network that detects sound from multiple sources through phase information encoded in harmonics in the vocal formant range. These harmonics interfere with each other in frequency selective regions of the basilar membrane, creating what appears to be amplitude modulated signals at a carrier frequency of each critical band. My model decodes these modulations with a simple comb filter - a neural delay line with equally spaced taps, each sequence of taps highly selective of individual musical pitches. When an input modulation - created by the interference of harmonics from a particular sound source - enters the neural delay line, the tap sequence closest in period to the source fundamental is strongly activated, creating an independent neural stream of information from this source, and ignoring all the other sources and noise. This neural stream can then be compared to the identical pitch as seen in other critical bands to determine timbre, and between the two ears to determine azimuth (localization) of this source.
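The idea of a delay line with equally spaced taps can be sketched in a few lines of Python. This is only a toy version under stated assumptions, not the neural model itself: the function name, the number of taps, and the use of a plain sum over the taps are all choices made for the illustration. For each candidate pitch the envelope is summed at taps spaced one period apart, and the candidate whose taps line up with the modulation gives the largest output.

```python
import numpy as np

def comb_tap_response(envelope, fs, candidate_f0s, n_taps=8):
    """Mean output power of a tapped delay line for each candidate pitch.

    Hypothetical illustration: taps are spaced one candidate period apart
    and simply summed. A modulation whose period matches the tap spacing
    adds coherently; other periods largely cancel.
    """
    envelope = np.asarray(envelope, dtype=float)
    envelope = envelope - envelope.mean()  # remove the steady (DC) part
    responses = []
    for f0 in candidate_f0s:
        period = fs / f0  # tap spacing in samples
        delays = np.round(np.arange(n_taps) * period).astype(int)
        usable = len(envelope) - delays[-1]
        summed = sum(envelope[d:d + usable] for d in delays)
        responses.append(np.mean(summed ** 2))
    return np.array(responses)

# A 200 Hz amplitude modulation, as might come from harmonics of a 200 Hz tone.
fs = 48000
t = np.arange(int(0.2 * fs)) / fs
envelope = 1.0 + np.cos(2 * np.pi * 200.0 * t)
candidates = [150.0, 175.0, 200.0, 225.0, 250.0]
response = comb_tap_response(envelope, fs, candidates)
print(candidates[int(np.argmax(response))])
```

The 200Hz tap sequence responds far more strongly than its neighbours. A simple comb like this one also responds to sub-multiples of the pitch (a 100Hz tap sequence would line up with every second period), which is why the candidates here are kept within less than an octave.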
The neural network is capable of detecting the fundamental frequency of each source to high accuracy (1%) and then using that pitch to separate signals from each source into independent neural streams. These streams can then be separately analyzed for pitch, timbre, azimuth, and distance. One of the significant features of this model is the insight it brings to the problem of room acoustics. All the relevant information extracted by this mechanism is dependent on the phase relationships between upper harmonics. These phase relationships are scrambled in predictable ways by reflections and noise. In room acoustics most of the scrambling is done by reflections and reverberation. These come later than the direct sound (which contains the unscrambled harmonics). If the time delay is sufficient the brain can detect the properties of the direct sound before it is overwhelmed by early reflections and reverberation.
By studying this scrambling process through the properties of my neural model we can predict and measure the degree to which pitch, azimuth, timbre, and distance of individual sources are preserved (or lost) in a particular seat in a particular hall. The model has been used to accurately predict the distance from the sound source at which localization and engagement are lost in individual halls, using only binaural recordings of live music. This distance is often discouragingly close to the sound sources, leaving the people in most of the seats to cope with much older and less accurate mechanisms for appreciating the complexities of the music.
The network, along with the many experiences the author has had with the distance perception of "near" and "far" and its relationship to audience engagement, is contained in the three preprints written for the International Conference on Acoustics (ICA2010) in Sydney. The effects were demonstrated (to an unfortunately small audience) at the ISRA conference in Melbourne, which concluded on August 31, 2010. The three preprints are available on the following links. "Phase Coherence as a measure of Acoustic Quality, part one: The Neural Network"
I believe this model is of high importance for both the study of hearing and speech, and for the practical problem of designing better concert halls and opera houses. The model - assuming it is correct - shows that the human auditory mechanism has evolved over millions of years for the purpose of extracting the maximum amount of information from a sound field in the presence of non-vital interference of many kinds. The information most needed is the identification of the pitch, timbre, localization, and distance of a possibly life threatening source of sound. It makes sense that most of this information is encoded in the sound waves that reach the ear in the harmonics of complex tones - not in the fundamentals. Most background noise is inversely proportional to frequency in its spectrum, and thus is much stronger at low frequencies than at high frequencies. But in addition, the harmonics - being at higher frequencies - contain more information about the pitch of the fundamentals than the fundamentals themselves, and are also easier to localize, since the interaural level differences at high frequencies are much higher than at low frequencies.
The proposed mechanism is capable of separating the pitch information from multiple sources into independent neural streams, allowing a person's consciousness to choose among them at will. This is the well known "cocktail party effect" where we can choose to listen to one out of four or more simultaneous conversations. The proposed mechanism solves this fundamental problem, as well as providing the pitch acuity of a trained musician. The way it detects pitch also explains many of the unexplained properties of our perception of music and harmony.
September 2 found me in Berlin working on a LARES system in the new home of the Deutsche Staatsoper Berlin in the Schillertheater. This was a real learning experience! The Staatsoper wanted the 20+ year old LARES equipment to be re-installed in the new location, and with the help of Muller-BBM we succeeded. Barenboim was satisfied. I was aching to install our newest version of the LARES process, but the final hardware is not ready. I was able to patch in a prototype late at night, and it performed very well indeed. I could raise the reverberation time of the hall well beyond two seconds at 1000Hz with no hint of coloration. The new hardware should be available soon, and I can't wait.
On September 8th I had the honor of giving a two hour lecture at Muller-BBM GMBH in Munich. The slides for this lecture contain updated Matlab code for the LOC measure for localization. The code will open a Microsoft .wav file that contains a binaural impulse response. The file is assumed to be two-channel. The program then calculates LOC assuming that the sound source is on the left side. The code also plots a graphic of the number of nerve impulses from the direct sound and the number of nerve impulses from the reverberation. The slides are available here: "Listening to Acoustics"
The slides presented in Sydney and Melbourne are mostly the same as the ones in the following link, which were presented to audiences in Boston, and Washington DC. These slides are more extensive than the ones for Sydney, as more time was available for the presentation. A relatively complete explanation of the author's equation for the degree of localization and engagement, along with Matlab code for calculating the parameter, is included in the slides. "The Relationship Between Audience Engagement and Our Ability to Perceive the Pitch, Timbre, Azimuth and Envelopment of Multiple Sources" contains the slides from the talks given in Boston and Washington DC on the subject.
The purpose of the research into hearing described in these slides is not necessarily to discover the exact neurology of hearing, but to understand why and how so many of our most important sound perceptions are dependent on acoustics - particularly on the strength and time delay of early reflections. Early reflections are currently presumed to be beneficial in most cases, but my experience has shown that when there are excessive early reflections they strongly detract from both the psychological clarity and engagement of a sound. The slides are pretty hard-hitting. They can be read together with the slides from the previous talk below - from which they partially borrow.
Thanks to Professor Omoto and his students at Kyushu University for pointing out that the equation for LOC in "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment" was nonsense. I had inverted the order of the integrals and the log function. Many thanks to Omoto-San. I corrected the slide, and added some explanations. I also added MATLAB code for calculating it, which might help in trying to understand it.
I have recently been writing some reviews and a brief note on reverberation for the Boston Music Intelligencer - www.classical-scene.com. You might want to check out the site. As part of the effort to understand how the ear hears music I am planning several presentations at the ICA conference in Sydney, and a listening demonstration seminar in Melbourne this summer.
A sort-of recent addition to this site is a brief note on the relationship between the perception of "running liveness" and the independent perception of direct sound. My most recent experiments (with several subjects) show that envelopment - and the sense of the hall - is greatly enhanced when it is possible to precisely localize instruments. This occurs even when the direct sound is 14dB or more lower in energy than the reflected or reverberant energy. The note was written as a response to a blog entry by Richard Talaske. The link to Talaske's blog is included in the note. "Direct sound, Engagement, and Running Reverberation"
My latest research has focused on the question of audience engagement with music and drama. I was taught to appreciate this psychological phenomenon by several people - including among others Peter Lockwood (assistant conductor at the Amsterdam Opera), Michael Schonewandt in Copenhagen, and the five major drama directors in Copenhagen that participated in an experiment with a live performance several years ago. The idea was given high priority after a talk by Asbjørn Krokstad at the IoA conference in Oslo last September. Krokstad gave me the word engagement to describe the phenomenon, and spoke of its enormous importance (and current neglect) in the study and design of concert halls and operas.
His insight motivated the talk I gave in Brighton the following month, also courtesy of the IoA. But the work was far from done. If one thinks one has a vitally important perceptual phenomenon that most people are unaware of, it is essential to come up with a mathematical measure for it. I was not sure this was possible, but went to work anyway. Just before being scheduled to give another talk on this subject in Munich for the Audio Engineering Society I came up with a possible solution, a simple mathematical equation that predicts the threshold for detecting the localization (azimuth) of a sound source in a reverberant field. You plug in an impulse response - and it gives you an answer in dB. If the value is above zero, I can localize the sound. If it is below zero, I cannot. Whether this equation works for other people I do not know, but Ville Pulkki and his students may be interested in finding out. Although azimuth detection is NOT the same as engagement, I believe it is a vital first step. For me, engagement seems to correspond to a result of the azimuth detection equation returning a value above +3dB.
The talk - there is no paper yet - that describes this work (and a great deal besides) is here: "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment" The talk here is the one I presented at the Portland meeting of the ASA, and is slightly different from the one in Munich. It contains audible examples that should work when clicked. The Munich talk is here: "The importance of the direct to reverberant ratio... Munich" I found some errors in the original presentation - this is correct as of 5/14/09
There is a preprint available from the AES that contains most of the talk - but does not include the new equation. "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment"
In January I traveled to Troy, NY to talk with Ning Xiang and his students about my current research. Part of the result was a renewed interest in publishing my work with the localization mechanisms used in binaural hearing, and due to these mechanisms the importance of doing measurements at the eardrum and not at the opening of the ear canal - or even worse, with a blocked ear canal. This talk was recently revised for presentation at the Audio Engineering Society Convention in Munich. The link here is to the version as of May 9, 2009. "Frequency response adaptation in binaural hearing"
As part of this effort I went back and scanned my 1990 paper on sound reproduction with binaural technology. I found the paper to be long, wordy, and far too full of ideas to be very useful. But for the most part, I think it is very interesting and correct. Where I disagree with it now I added some comments in red. I think it makes interesting reading if you can stick with it. "Binaural Techniques for Music Reproduction" This paper says a great many correct things about binaural hearing, and agrees closely with my current work, coming to the identical conclusions - namely that blocked ear canal measurements, and partially blocked ear canal measurements do not properly capture the response of headphones. You have to measure at the eardrum - or as I have recently developed - use loudness comparisons to find the response of headphones at the eardrum. The use of a frontal sound source as a reference is also recommended, both for headphone equalization and for dummy head equalization. I will get a copy of the AES journal paper on dummy head equalization on this site in the near future.
In November I traveled to the Tonmeister Tagung in Leipzig, followed by the IOA conference on sound reproduction in Brighton. In Leipzig I gave two lectures, one on the importance of direct sound in concert halls, and another on headphones and binaural hearing. The talk on concert halls was repeated in Brighton in a longer form as the Peter Barnett memorial lecture. I was able to demonstrate many of the perceptions with 5.1 recordings in Brighton, thanks to Mark Bailey. "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment"
The concert hall paper had an important message: That current research on concert halls has tested the case where the direct sound is stronger than the sum of all the reflected energy. This domain is appropriate to recordings - but is inappropriate to halls, where the energy in the direct sound is far less than the energy in the reverberation in most of the seats. When laboratory tests are conducted using realistic levels of direct sound, very different results emerge.
We find that in this case when the brain can separately detect the direct sound in the presence of the reverberation the music or the drama is enhanced, and audience involvement is maximized. But there must be a time gap between when the direct sound arrives at a listener and the onset of the reverberation if this detection is to take place. In a large hall this gap generally exists, and clarity and involvement can occur over a range of seats. In a small hall the gap is reduced, and the average seating distance must be closer to the sound source if clarity is to be maintained.
Conclusion: Don't build shoebox halls for sizes under 1000 seats!
This paper presentation included some wonderful 5.1 demos of the sound of halls and small halls with different D/R ratios and time gaps. I will include these here - but not yet. Stay tuned.
The paper on headphones goes into detail about the mechanisms the brain uses for detecting timbre and localization. It includes some interesting data on the unreliability of blocked ear canal measurements of HRTFs. Highly recommended to anyone who wants to know why head-tracking is assumed to be necessary for binaural reproduction over headphones. An obvious conclusion: if head-tracking is necessary, you KNOW the timbre is seriously in error. "Frequency response adaptation in binaural hearing"
A Finnish student requested that I add the lecture slides for a talk I gave in October 2008 to the Finnish section of the Audio Engineering Society. The subject was "Recording the Verdi Requiem in Surround and High Definition Video." These slides can be found with this link. However I notice that I had previously put most of these slides on this site - together with some of the audio examples you can click to hear. This version can be found below in the link to the talks given to the Tonmeister conference in 2006.
Last September I gave a paper at the ICA2007 in Madrid, which presented data from a series of experiments that I had hoped would lend some clarity to the question of why some concert halls sound quite different from others, in spite of having very similar measured values of RT, EDT, C80 and the like. A related question was how can we detect the azimuth of different sound sources when the direct sound - which must carry the azimuth information - is such a small percentage of the total sound energy in most seats?
The experiments first calculated the direct/reverberant ratio for different seats in a model hall (with sanity checks from real halls) and then looked for the thresholds for azimuth detection as a function of d/r. Some very interesting results emerged. It seems that the delay time between the arrival of the direct sound and the time that the energy in the early reflections had built up to a level sufficient to mask the direct sound was absolutely critical to our ability to obtain both azimuth and distance information. In retrospect this point is obvious. Of course the brain needs time to separately perceive the direct sound and the information it contains - else this information cannot be perceived.
The necessity of this time delay has consequences. It means that as the size of a hall decreases the direct to reverberant ratio over all the seats must increase if sounds are to be correctly localized, and the clarity that comes with this localization is to be maintained. To achieve this goal, it is desirable in small halls that the audience sits on average closer to the musicians, and the volume needed to provide a longer reverberation time be obtained through a high ceiling, and not through a long, rectangular hall. Halls less than 1500 seats should probably look more like opera houses with unusually high ceilings. They should not be shoebox in shape. Fortunately, Boston has such a hall: Jordan Hall at New England Conservatory (1200 seats) is the best hall I know of worldwide for chamber music or small orchestral concerts. It has a near opera house shape with a single high balcony, and a very high ceiling. The sound is clear and precise in nearly every seat, with a wonderful sense of surrounding reverberation. Musicians love it, as does the local audience.
Alas, the paper went over like a lead balloon. It was way too dry, and it seemed no one understood how it could possibly be relevant to hall design. So I quickly re-did the paper I was to present in Seville the next week to answer the many questions I received after the one in Madrid. The powerpoints for this presentation are here: "Why do concert halls sound different – and how can we design them to sound better?" Hopefully this gets the ideas across better.
At the request of a reader I have put the sound files associated with the lecture at the RADIS conference which followed ICA2004. "The sound files for RADIS2004" These include several (pinna-less) binaural recordings from opera houses, and both music and speech convolved with impulse responses published by Beranek. The lecture slides can be found with the following link: Slides for RADS2004
David Griesinger is a physicist who works in the field of sound and music. Starting in his undergraduate years at Harvard he worked as a recording engineer, through which he learned of the tremendous importance of room acoustics in recording technique. After finishing his PhD in Physics (the Mössbauer Effect in Zinc 67) he developed one of the first digital reverberation devices. The product eventually became the Lexicon 224 reverberator. Since then David has been the principal scientist at Lexicon, and is chiefly responsible for the algorithm design that goes into their reverberation and surround sound products. He also has conducted research into the perception and measurement of the acoustical properties of concert halls and opera houses, and is the designer of the LARES reverberation enhancement system.
The purpose of this site is to share some of my publications and lectures. Most of the material on the site was written under great time pressure. The papers were intended as preprints for an oral presentation. Some of them are available as published preprints from the Audio Engineering Society. The papers should be considered drafts - they have not been peer reviewed.
Several power point presentations have also been added to the site. These presentations are often quite readable and informative. In general they are more coherent than the preprint of the same material, although they are naturally not as detailed. They can be read in conjunction with the preprint of a talk. These are available as .pdf downloads, although your browser may allow you to open them with Adobe Acrobat.
Some recent work includes:
"Perception of Concert Hall Acoustics in seats where the reflected energy is stronger than the direct energy - or… Why do Concert Halls sound different - and how can we design them to sound better?" - A poster paper presented at the AES conference in Vienna, May 7, 2007. "Two powerpoints given at the Tonmeistertagung Nov. 2006, and the AES convention October 2006" The subjects are "Distance Effect and Muddiness" and "Recording the Verdi Requiem in Surround Sound and High-Definition Video". These lectures are presented in a .zip file so the audio examples can be included. It is recommended that you unzip the package into a single directory, and then view the slides with powerpoint. The audio examples should work when clicked.
Recent work on headphone calibration [October 2008] - done partly with the students of Ville Pulkki at Helsinki University - has shown that all the types of headphones we tested have large variations in frequency response for different individuals AND that these differences make for very different perceptions of the sound quality of binaural recordings from halls. It is possible to compensate for these errors by matching headphones to listeners through noise band loudness matching. Thus before listening to the music examples in the above papers using headphones or typical computer speakers, please read the following: "The necessity of headphone equalization" This link has been substantially re-written. Simple loudness matching is not sufficient. A reference loudspeaker is required - and loudness differences must be compensated.
"Pitch Coherence as a Measure of Apparent Distance and Sound Quality in Performance Spaces" This is a preprint of a poster paper presented at the Institute of Acoustics conference in Copenhagen, May 2006. The preprint contains some audio examples that can be heard by clicking on the links.
An effort to clean house resulted in scanning a few old papers: "Reducing Distortion in Analog Tape Recorders" JAES March 1975, "The Mossbauer Effect in Zn67" Phys Rev B vol 15 #7 p 3291 April 1 1977, and "Spaciousness and Localization in Listening Rooms - How to make Coincident Recordings Sound as Spacious as Spaced Microphone Arrays" JAES v34 #4 p255-268 April, 1986. I still find the paper on spaciousness to be interesting and insightful. The observations continue to be relevant to my work, particularly on the subject of the interaction between loudspeakers and rooms at low frequencies. It took many years before I was able to explain the observed effects. For example, the apparent localization errors at high frequencies with conventional stereo loudspeakers, the subject of Dutton's work, are fully explained in "Stereo and Surround Panning in Practice". The effects of low frequency room modes on spatial properties are fully explained in “Loudspeaker and listener positions for optimal low-frequency spatial reproduction in listening rooms”.
The early paper on spaciousness and localization is much too optimistic about the possibility of increasing the spaciousness of a listening room through increasing the low frequency separation. In most rooms where the low frequency modes do not correctly overlap the low frequency separation is inaudible. Increasing it only stresses the loudspeakers. The best solution is to drastically change the loudspeaker positions, or change the room dimensions.
New in October of 2005 are the slides from a lecture on recording technique, given in Japan for the Audio Engineering Society, and in Schloss Hohenkammer for the Tonmeister conference in October.
I have also finally added "Griesinger's Coincident Microphone Primer", a paper from 1987 that describes and mathematically analyzes much of the behavior of coincident microphone arrays, including the Soundfield microphone.
New in May of 2005 is a paper on room dimensions and loudspeaker placement for the reproduction of envelopment at low frequencies, presented at the acoustical society meeting in
Progress in 5-2-5 Matrix Systems
performance of multichannel time variant reverberation enhancement systems" The proceedings of the 1995 International Symposium on Active Control of Sound and Vibration,
David Griesinger is a physicist interested in sound - the sound of music. He is particularly interested in translating subjective impressions of sounds into the physics of sound propagation, and the psychoacoustics of sound perception. He has found that although it is wonderful to discover ways to improve the quality of a reproduced sound, it is far more useful and powerful to understand exactly how the improvement was achieved.
This interest started with work as a recording engineer. Through college and graduate school I recorded concerts and made records for student organizations. The need for better microphones led to work in microphone design and construction, starting in 1964 with the construction of omnidirectional condenser microphones. In about 1985 I designed and constructed a miniature Soundfield microphone (16mm diameter), and in about 1990 made a dummy head microphone for classical recordings. Most of the work on microphones has not been described in publications, but a paper did appear in the Journal of the Audio Engineering Society on the equalization of dummy head microphones. This paper is also available in German from the Deutsche Tonmeister
der Lautsprechercompatibilität von Kunstkopfaufnahmen durch herkömliche und räumliche Entzerrung" Bericht der 15. Tonmeistertagung,
Early work in this field produced a paper in the Audio Engineering Society journal on distortion reduction in magnetic tape recorders, and a paper on image localization (as a function of frequency) from two channel sound equipment in small rooms.
The work as a recording engineer also led to an abiding interest in artificial reverberation, and this eventually resulted in the development of the Lexicon digital reverberation devices. Alas, due to problems with trade secrets this work remains unpublished.
In about 1990 I started installing reverberation units in spaces used for musical performances, in an effort to improve the acoustics for live performances. This work led eventually to the development of the LARES system for acoustic enhancement. This work is described in the paper "Improving Halls and Rooms with Multiple Time Variant Reverberation" which is on this site. Unfortunately this paper is not yet available to me with machine readable drawings, and is presented here without them. Perhaps eventually we will have the complete paper. I still consider this paper a classic - although the precise method of randomizing the reverberation devices is deliberately not described (sorry... you have to draw the line somewhere.)
LARES works wonderfully well - but I learned quite quickly that conventional acoustical measurement techniques were useless for describing its performance. The glaring mismatch between what you could easily hear in a hall and the measurements one could make resulted in a serious study into the perception of acoustics. A flurry of papers resulted - all more or less wrong.
I also did considerable work on technical methods of room measurement. At least two interesting papers resulted - see the 1992 paper "Impulse response measurements using All-Pass deconvolution" and the later paper on occupied hall measurement, "Beyond MLS - Occupied Hall Measurement With FFT Techniques". I am actually quite proud of both papers. The all pass deconvolution method is amazingly clever and efficient. You simply play this strange time-reversed signal into the room, and play the result through a simple reverberator. Instant impulses result - quite amazing. The sweep method is actually much more effective, but far less clever.
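The idea behind all-pass deconvolution can be illustrated with a small numerical sketch. This is not the code or the filter from the 1992 paper: the three Schroeder all-pass sections, their delays and gains, and the toy "room" response below are all assumptions chosen only to make the principle visible. The excitation is the time-reversed impulse response of an all-pass cascade; because an all-pass filter has unit magnitude at every frequency, running the recording back through the same cascade collapses the excitation to a (delayed) impulse and leaves the room response.

```python
import numpy as np

def allpass(x, delay, gain):
    """Schroeder all-pass section: y[n] = -g*x[n] + x[n-D] + g*y[n-D]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y[n] = -gain * x[n] + x_delayed + gain * y_delayed
    return y

def cascade(x, sections):
    """Run a signal through several all-pass sections in series."""
    for delay, gain in sections:
        x = allpass(x, delay, gain)
    return x

# Hypothetical "simple reverberator": (delay in samples, gain) per section.
SECTIONS = [(7, 0.7), (11, 0.7), (17, 0.7)]
N = 2048  # long enough for the all-pass response to decay to nothing

# Excitation = time-reversed impulse response of the all-pass cascade.
impulse = np.zeros(N)
impulse[0] = 1.0
excitation = cascade(impulse, SECTIONS)[::-1]

# Toy room: direct sound plus two reflections (made-up values).
room = np.zeros(64)
room[0], room[10], room[37] = 1.0, 0.5, -0.25

# "Play" the excitation into the room, then through the reverberator.
recorded = np.convolve(excitation, room)
deconvolved = cascade(recorded, SECTIONS)

# The room response appears after a fixed latency of N - 1 samples.
recovered = deconvolved[N - 1 : N - 1 + len(room)]
print("max error:", np.max(np.abs(recovered - room)))
```

The only cost of the method in this sketch is the fixed latency of N - 1 samples and a tiny truncation error from cutting the all-pass response to a finite length.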
Conventional measures were clearly missing the point - but for a long time, so was I. About ten years ago this work started to converge into a coherent (at least I think it is coherent) hypothesis about how we perceive the acoustics of enclosed spaces.
It turns out that acoustic perception relies on two very different phenomena. The most basic is the detection of reflected energy by the hearing system. This detection relies on fluctuations in the Interaural Time Delay (ITD) and the Interaural Intensity Difference (IID). The fluctuations are caused by interference between the direct sound from a source, and delayed reflected sound. The creation of fluctuations is a physical process - it can be easily modeled and predicted.
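A toy model shows how such fluctuations arise. The sketch below is an illustration under strong assumptions, not the model used in the papers: a narrow band noise at 500 Hz, two ears with no head shadow (so only the ITD is modeled, not the IID), a frontal direct sound, and a single lateral reflection with made-up gain and delays. The running ITD is read from the interaural phase of the complex narrow band signal.

```python
import numpy as np

rng = np.random.default_rng(0)
FS = 48_000          # sample rate in Hz
FC = 500.0           # center frequency of the noise band (assumed)
N = FS               # one second of signal

# Narrow band noise: a slowly varying complex envelope on a 500 Hz carrier.
smooth = np.ones(480) / 480  # 10 ms moving average limits the bandwidth
envelope = np.convolve(
    rng.standard_normal(N) + 1j * rng.standard_normal(N), smooth, mode="same"
)
t = np.arange(N) / FS
source = envelope * np.exp(2j * np.pi * FC * t)

def delay(signal, samples):
    """Delay a signal by a whole number of samples, keeping its length."""
    return np.concatenate(
        [np.zeros(samples, dtype=complex), signal[: len(signal) - samples]]
    )

def running_itd(left, right):
    """ITD in seconds from interaural phase (positive = right ear lags).
    Values wrap at +/- half a carrier period (1 ms at 500 Hz)."""
    return np.angle(left * np.conj(right)) / (2 * np.pi * FC)

# Direct sound only: frontal source, identical at both ears.
itd_dry = running_itd(source, source)

# Direct sound plus one lateral reflection (hypothetical values):
# 20 ms late at the left ear (960 samples), 0.5 ms later at the right.
REFLECTION_GAIN = 0.8
left = source + REFLECTION_GAIN * delay(source, 960)
right = source + REFLECTION_GAIN * delay(source, 984)
itd_wet = running_itd(left, right)

settled = slice(2000, None)  # skip the time before the reflection arrives
print("ITD spread, direct only    : %.1f us" % (1e6 * np.std(itd_dry[settled])))
print("ITD spread, with reflection: %.1f us" % (1e6 * np.std(itd_wet[settled])))
```

With the direct sound alone the ITD sits at zero. Adding the reflection makes it wander as the direct and reflected envelopes drift in and out of phase, which is the interference effect described above.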
The other piece of the puzzle takes place much later in the neural process, and is related to the process of separating incoming sound events into related streams of information, such as the syllables of speech from a single person. It turns out there is neurology for this separation process. This neurology organizes sound events into one or more foreground streams. But there is also neurology that keeps track of the loudness and the sound direction of background sound in the spaces between sound events. Our perception of the background also forms a stream - but this one is perceived as continuous, and has specific spatial properties. The neurology associated with the background stream is the primary source of our perception of musical envelopment, and so the spaces between musical notes are vital to this perception. The separation of the background stream from the foreground stream takes time. Reflected energy that arrives too soon after the end of a note is perceived as part of the note itself, and does not contribute to envelopment. It is only after 100ms or more that reflected energy really is heard as background reverberation, and understanding this time delay is vital to understanding how halls and operas are perceived with music. The whole hypothesis is best described in the July 1997 article in ACTA Acustica. The same material is contained in a somewhat longer preprint for JAES. "Spaciousness and envelopment in musical acoustics." The JAES preprint also includes a section on how the hypothesis applies to the practical improvement of halls and operas. This part has not yet appeared in Acustica.
The concept of interaural fluctuations has been used to solve a very old riddle - the riddle of how many independent bass drivers one needs in a sound system in a small room, and where these drivers should be put. To make a long story short: you need at least two low frequency drivers, and ideally they should be at either side of the listeners. This work is described in the papers on small room acoustics. The latest paper on this subject is the one presented in Vancouver in 2005: “Loudspeaker and listener positions for optimal low-frequency spatial reproduction in listening rooms” This paper is highly recommended. Others include:
Speaker placement, externalization, and envelopment in home listening rooms
General overview of spatial impression, envelopment, localization, and externalization
Much of the work described in the above paper was done using the MATLAB language. Hardcore researchers might be interested in the code that was used. This is available with NO instructions, in the following file. Please email the author if you wish to use this code. For this purpose, use the email address in the picture.
The .zip compressed Matlab code for experiments with DFT and externalization. Requires a working MATLAB C compiler to be practical.
The site also includes a recent paper on reverberation enhancement. Be sure to check out the lecture slides for this paper - they are much more interesting:
Recent experiences with electronic acoustic enhancement in concert halls and opera houses
The next item is the lecture notes for a workshop at the September 1999 Audio Engineering Convention. In this workshop I had about two hours to cover the essentials of recording technique for surround sound. It was a lot of fun - but a great deal of what was said is not in the notes. I believe the AES made a cassette recording. This might be
Lecture notes from the September 27, 1999 AES workshop
The following lecture slides were presented at the meeting of the Acoustical Society in
Recent experiences with electronic acoustical enhancement in concert halls, opera houses, and outdoor venues - the lecture slides without pictures.
The AES conference in October 2000 was fun, but the slides were prepared in more than the usual rush. Basically nothing new here, particularly in the first one. Diehard fans might get something out of the second, but the
Alas, in most cases with large forces the actual level of the "support" microphones in the final mix is larger than the level of the "main" microphone, so in practice the roles are reversed. Nothing intrinsically wrong with this confusion - but it leads to some rather bizarre recommendations, such as delaying the output of the "support" microphones so the time of arrival of the wavefront comes after the signal from the "main" microphone. The remarkable thing is that adding such a delay does not sound as strange as one might expect. But in my experience it always sounds worse than no delay at all. Once again the lecture slides may give the better picture, but you may want to look at both the preprint and the slides.
"Perceptual Modeling" was a term invented by one of our advertising agents to describe the design of the reverberation controls in the Lexicon 960. I don't think it means anything at all, which is good for marketing. But the above paper is quite a useful description of how to use reverberation to control the apparent distance of a sound source. We have been doing this with our products for years of course, and the process is well described in the Lexicon 480 manual with the "ambience" algorithm. However, outside the manuals I made no real attempt to publish the concepts, leading to some rather interesting claims by others of having discovered it all.
An interesting issue came up at this Tonmeister conference. Gunther Theile played a tape made by some of his students, where they compared the hall pick-up from four omnidirectional microphones spaced in a square array at different distances. Unfortunately I was unable to understand exactly the conditions of the experiment, but the closest set of microphones used a spacing of ~25cm. In a quick listening test in the listening room at the show, with about 50 people present, the closest spacing seemed to be preferred generally over the wider spacings.
The result seems to contradict an assumption that I make in nearly all the work I have done - that uncorrelated reverberation sounds better than correlated reverberation. The reasons for this result are unclear. The suggestion offered at the time - that the closer spacing allowed better imaging of the sides of the room - seems unlikely, among other things for the fact that side imaging does not exist for a forward facing listener. In an effort to resolve this issue - which I take to be of the highest importance - I wrote a note to Eberhard Sengpiel. The note is included here for those who think the issue is as important as I do.
The next paper, on stereo and surround panning in practice, is pretty good, I think. I wrote it because we were having difficulty preserving the apparent horizontal direction (azimuth) of sound sources in a two channel stereo image when we converted the two channel to 5 or 7 channels with the Logic 7 algorithm. This is interesting because L7 was designed assuming the standard sine/cosine pan law to be correct. We detected the left/right balance of a front sound source, and used the sine/cosine law to find the azimuth. We then adjusted the balance in the three front channels to present it with the same azimuth. Alas, this does not work. We traced the problem to the two channel sine/cosine pan law, which is seriously wrong for most musical sources. (Curiously, the three channel version - that is panning from a center speaker to either left or right - works quite well.) The reason for all this is to be found in binaural theory. Turns out in two channel panning the perceived azimuth is highly frequency dependent, with frequencies above 1000Hz sounding much wider than the sine/cosine law would predict. Suitable averaging over frequency gets the right answer. For some reason this paper has remained undeservedly obscure.
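For readers unfamiliar with it, the standard sine/cosine pan law and its inverse can be sketched in a few lines. The function names and the linear mapping from pan position onto loudspeakers at plus and minus 30 degrees are assumptions made for illustration; this is not the Logic 7 code. The sketch shows only the frequency independent law that the paper finds to be wrong for two channel material.

```python
import math

def pan_gains(pan):
    """Sine/cosine (constant power) pan law.
    pan = 0.0 is full left, 0.5 is center, 1.0 is full right."""
    theta = pan * math.pi / 2
    return math.cos(theta), math.sin(theta)

def pan_from_gains(left, right):
    """Invert the law: recover the pan position from the left/right balance."""
    return math.atan2(right, left) / (math.pi / 2)

def azimuth_deg(pan, half_width_deg=30.0):
    """Assumed linear map of pan position onto the loudspeaker base
    (speakers at +/- half_width_deg). Negative values are to the left."""
    return (2.0 * pan - 1.0) * half_width_deg

left, right = pan_gains(0.75)
print("gains      :", left, right)
print("total power:", left ** 2 + right ** 2)
print("azimuth    :", azimuth_deg(pan_from_gains(left, right)), "degrees")
```

A decoder built this way assigns one azimuth to a given left/right balance at all frequencies, which is exactly the assumption the paper calls into question.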
And now for something completely different... Being currently over 60, and having in my youth studied information theory, I have a low tolerance for claims that "high definition" recording is anything but a marketing gimmick. I keep, like the Great Randi, trying to find a way to prove it. Well, I got the idea that maybe some of the presumably positive results on the audibility of frequencies above 18000Hz were due to intermodulation distortion, which would convert energy in the ultrasonic range into sonic frequencies. So I started measuring loudspeakers for distortion of different types - and looking at the HF content of current disks. The result is the paper below, which is a HOOT! Anytime you want a good laugh, take a read.
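The mechanism is easy to demonstrate numerically. The sketch below is an assumption-laden illustration, not a measurement from the paper: two ultrasonic tones at 24 kHz and 26 kHz pass through a hypothetical weak second order nonlinearity standing in for a loudspeaker, and a difference tone appears at 2 kHz, well inside the audible range.

```python
import numpy as np

FS = 192_000                 # sample rate, high enough to avoid aliasing here
N = 19_200                   # 0.1 s of signal, so FFT bins are 10 Hz apart
t = np.arange(N) / FS

F1, F2 = 24_000.0, 26_000.0  # two ultrasonic tones (inaudible on their own)
x = np.cos(2 * np.pi * F1 * t) + np.cos(2 * np.pi * F2 * t)

A2 = 0.01                    # hypothetical second order distortion coefficient
y = x + A2 * x ** 2          # toy model of a slightly nonlinear transducer

def tone_amplitude(signal, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (must lie on a bin)."""
    spectrum = np.fft.rfft(signal)
    k = int(round(freq_hz * len(signal) / FS))
    return 2.0 * np.abs(spectrum[k]) / len(signal)

difference = F2 - F1         # 2 kHz
print("2 kHz level, linear system   :", tone_amplitude(x, difference))
print("2 kHz level, nonlinear system:", tone_amplitude(y, difference))
```

In this model the 2 kHz product has an amplitude equal to the distortion coefficient, so even 1% second order distortion puts an audible tone 40 dB below the ultrasonic input.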
we get to something REALLY new. I had been working for some time on ways of measuring hall acoustic properties from binaurally recorded speech. It turns out to be pretty simple to learn a lot about LATERAL reflections from a running IACC. But medial reflections are trivial to hear in speech and music (at least when they approach the energy of the direct sound) and the detection process (whatever it is) is very robust. I decided to submit a preprint without knowing how to solve this problem, figuring the pressure of the due date might make some progress. Sure enough, the due date came around, with no solution. Two weeks of very hard work... and I had an answer. Turns out, we detect medial reflections through their effect on the audibility of pitch! This ability (on signals from a single source with a defined pitch) is, I believe, the primary distance cue. Ultimately I believe the methods shown here will lead to a new (and quite useful) measure for sound quality of rooms. The paper for the
I am deeply honored that Leo Beranek chose me to share his lecture to the Acoustical Society, in honor of its 50th birthday, and Leo's 90th. Spurred on by the thought that I could easily make an ass of myself, I put together a pretty good lecture. Highly recommended.
Bill Martins put together a little dog and pony show about low frequency spatial reproduction in small rooms. In honor of this I made this paper on how to determine the optimal room dimensions and speaker placement for spatial reproduction at low frequencies. You can do it in an hour on the back of an envelope if you know the room dimensions. Turns out square rooms are pretty much impossible. If you have one, tear it down!
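The starting point for any such back-of-the-envelope calculation is the list of room mode frequencies. The sketch below uses only the textbook formula for a rigid walled rectangular room; it is not the procedure from the paper, and the room dimensions and the speed of sound are assumed values. It does show one thing that sets a square floor plan apart: the lowest front-to-back and side-to-side modes land on exactly the same frequency.

```python
import itertools
import math

def mode_frequencies(dims_m, max_order=2, c=343.0):
    """Mode frequencies of a rigid walled rectangular room.
    f = (c / 2) * sqrt((nx/Lx)^2 + (ny/Ly)^2 + (nz/Lz)^2)
    dims_m is (Lx, Ly, Lz) in meters; c is the speed of sound in m/s."""
    lx, ly, lz = dims_m
    modes = []
    for nx, ny, nz in itertools.product(range(max_order + 1), repeat=3):
        if (nx, ny, nz) == (0, 0, 0):
            continue
        f = (c / 2.0) * math.sqrt(
            (nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2
        )
        modes.append((f, (nx, ny, nz)))
    return sorted(modes)

# Hypothetical rooms: a square floor plan and a rectangular one.
square = mode_frequencies((5.0, 5.0, 2.5))
rectangular = mode_frequencies((6.0, 4.0, 2.5))

for name, modes in (("square", square), ("rectangular", rectangular)):
    print(name)
    for f, index in modes[:5]:
        print("  %6.1f Hz  mode %s" % (f, index))
```

Whether a given set of modes gives good spatial reproduction depends on the overlap criteria worked out in the paper, which this sketch does not attempt to reproduce.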