Last Update October 31, 2011
DG at Bash Bish Falls 7/22/11
photo: Masumi Per Rostad
10/31/11
The preprint for the ASA conference in San Diego is similar to, but I hope better than, the one for the IOA conference in Dublin: "The audibility of direct sound as a key to measuring the clarity of speech and music"
The slides are here: "The audibility of direct sound as a key to measuring the clarity of speech and music" These slides include the audio examples, which should play after a short delay when clicked.
7/25/11
My latest preprint presents some of the reasons that, although current methods of measuring room acoustics correlate to some degree with the perceived quality of a space, they do not predict that quality reliably. The preprint was written for the Institute of Acoustics conference in Dublin in May of 2011. The title is not very descriptive of the content. As usual, I submitted the title and a preliminary abstract long before the preprint was written, and by that time I had found a better way of getting the point across. "THE RELATIONSHIP BETWEEN AUDIENCE ENGAGEMENT AND THE ABILITY TO PERCEIVE PITCH, TIMBRE, AZIMUTH AND ENVELOPMENT OF MULTIPLE SOURCES"
Acoustic quality has been difficult to define, and it is very difficult to measure something you can't define. Fundamentally the ear/brain system needs to 1. separate one or more sounds of interest from a complex and noisy sound field, and 2. identify the pitch, direction, distance, and timbre (and thus the meaning) of the information in each of the separated sound streams. Previous research into acoustic quality has mostly ignored the problem of sound stream separation - the fundamental process by which we can consciously or unconsciously select one of a potentially large number of people talking at the same time (the cocktail party effect), or one of multiple musical lines in a concert. In the absence of separation multiple talkers become babble. Music is more forgiving: harmony and dynamics are preserved, but much of the complexity (and the ability to engage our interest) is lost. Previous acoustic research has focused on how we perceive a single sound source under various acoustic conditions, and has concentrated primarily on how sound decays in rooms - on how notes and syllables end. But sounds of interest to both humans and animals pack most of the information they contain into the onsets of syllables or notes. It is as if we have been studying the tails of animals rather than their heads.
The research presented in the preprint above shows that the ability to separate simultaneous sound sources into separate neural streams depends vitally on the pitch of harmonically complex tones. The ear/brain system can separate complex tones from one another because the harmonics which make up these tones interfere with each other on the basilar membrane in such a way that the membrane motion is amplitude modulated at the frequency of the fundamental of the tone (and several of its low harmonics). When there are multiple sources, each producing harmonics of a different fundamental, the amplitude modulations combine linearly, and can be separately detected. Reflections and reverberation randomize the phases of the upper harmonics that the ear/brain depends upon to achieve stream separation, and the amplitude modulations become noise. When reflections are too strong and come too early, separation - and the ability to detect the direction, distance and timbre of individual sources - becomes impossible. But if there is sufficient time in the brief interval before reflections and reverberation overwhelm the onset of sounds, the brain can separate one source from another, and detect direction, distance, and meaning.
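A simple numerical experiment makes the modulation mechanism concrete. The Matlab sketch below is my illustration for this page, not code from the preprint: it sums a few neighboring harmonics of a 200 Hz fundamental, as they might fall within a single critical band, and compares the envelope when the harmonic phases are aligned (as in the direct sound) with the envelope when the phases are randomized (as reflections do).

% Interference of coherent harmonics within one band produces an
% envelope modulated at the fundamental; randomizing the phases
% makes the modulation peaks lower and broader.
fs = 44100;
t  = (0:round(fs/10)-1)'/fs;              % 100 ms of signal
f0 = 200;                                 % fundamental frequency in Hz
x = zeros(size(t)); y = zeros(size(t));
for k = 10:14                             % harmonics near 2-3 kHz, one "band"
    x = x + cos(2*pi*k*f0*t);             % phases aligned (direct sound)
    y = y + cos(2*pi*k*f0*t + 2*pi*rand); % phases randomized (reflections)
end
envx = abs(hilbert(x));                   % deep, sharp peaks every 1/f0 = 5 ms
envy = abs(hilbert(y));                   % weaker, broader modulation
plot(t, envx, t, envy); xlabel('time (s)'); ylabel('envelope');
legend('coherent phases', 'random phases');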
To understand acoustic quality we need to know how, and to what degree, information is lost when reflections come too soon. The preprint presents a model for the mechanism by which sources are separated, and suggests a relatively simple metric which can predict from a binaural impulse response whether separation will be possible or not. For lack of a better name, I call this metric LOC - for localization, the ability to perceive the precise direction of an individual sound source in a reverberant field.
6/20/11
There have been several requests that I put the matlab code for calculating the acoustic measure called LOC on this site. The measure is intended to predict the threshold for localizing speech in a diffuse reverberant field, based on the strength of the direct sound relative to the build-up of reflections in a 100ms window. The name for the measure is a default - If anyone can come up with a better one I would appreciate it. The measure in fact predicts whether or not there is sufficient direct sound to allow the ear/brain to perform the cocktail party effect, which is vital for all kinds of perceptions, including classroom acoustics and stage acoustics.
The formula seems to work surprisingly well for a variety of acoustic situations, in both large and small halls. But there are limitations that need discussion. First, the measure assumes that the speech syllables (or musical notes) have sufficient space between them that reverberation from a previous syllable or note of similar pitch does not cover the onset of the new note. In practice this means that even if LOC is greater than about 2dB, a sound might not be localizable, or might sound "near", if the reverberation time is too long. A lengthy discussion with Eckhard Kahle made clear that the measure will also fail - in the opposite sense - if there is a specular reflection that is sufficiently stronger than the direct sound. This can happen if the direct sound is blocked, or absorbed by the audience in front of a listener. In this case the brain will detect the reflection as the direct sound, and will be able to perform the cocktail party effect - but it will localize the sound source to the reflector. With eyes open a listener is unlikely to notice the image shift.
With these reservations, here is the Matlab code. It accepts a Windows .wav file, which should be a stereo file of a binaural impulse response with the source on the left side of the head. The LOC code analyzes only the left channel. There is a truncation algorithm that attempts to find the onset of the direct sound. This may fail - so users should check to be sure the answer makes sense. The code also plots an onset diagram, showing the strength of the direct sound and the build-up of reflections. If this looks odd, it probably is. In this version of the code the plot has a Y axis that starts at zero. The final level of a held note (the total energy in the impulse response) is given the value of 20dB, and both the direct level and the build-up of reflections are scaled to fit this value. The relative rate of nerve firings for both components can then be read off the vertical scale. So - if the direct sound (blue line) is at 12, you know the eventual D/R is -8dB. Rename the .txt file to .m for running in Matlab. "Matlab code for calculating LOC"
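For readers who want the shape of the computation before downloading the file, here is a minimal sketch of a LOC-style calculation. It is my reconstruction from the description above, not the posted code: the file name, the ~5 ms direct-sound window, and the bare energy ratio are my assumptions, and the posted code additionally applies the nerve-firing scaling described above.

% Minimal sketch of a LOC-style measure: direct sound versus the
% build-up of reflections in a 100 ms window (left channel only).
[ir, fs] = audioread('binaural_ir.wav');  % hypothetical file name
x = ir(:,1);                              % source on the left: left channel
[~, n0] = max(abs(x));                    % crude onset finder - check it!
x = x(n0:end);
nd = round(0.005*fs);                     % assume ~5 ms of direct sound
nw = round(0.100*fs);                     % 100 ms analysis window
D = sum(x(1:nd).^2);                      % direct energy
R = sum(x(nd+1:nw).^2);                   % build-up of reflected energy
fprintf('LOC estimate: %.1f dB\n', 10*log10(D/R));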
My latest work on hearing involves the development of a possible neural network that detects sound from multiple sources through phase information encoded in harmonics in the vocal formant range. These harmonics interfere with each other in frequency selective regions of the basilar membrane, creating what appear to be amplitude modulated signals at the carrier frequency of each critical band. My model decodes these modulations with a simple comb filter - a neural delay line with equally spaced taps, each sequence of taps highly selective of an individual musical pitch. When an input modulation - created by the interference of harmonics from a particular sound source - enters the neural delay line, the tap sequence closest in period to the source fundamental is strongly activated, creating an independent neural stream of information from this source, and ignoring all the other sources and noise. This neural stream can then be compared to the stream for the identical pitch in other critical bands to determine timbre, and between the two ears to determine azimuth (localization) of this source.
The neural network is capable of detecting the fundamental frequency of each source to high accuracy (1%) and then using that pitch to separate signals from each source into independent neural streams. These streams can then be separately analyzed for pitch, timbre, azimuth, and distance. One of the significant features of this model is the insight it brings to the problem of room acoustics. All the relevant information extracted by this mechanism is dependent on the phase relationships between upper harmonics. These phase relationships are scrambled in predictable ways by reflections and noise. In room acoustics most of the scrambling is done by reflections and reverberation, which arrive later than the direct sound (which contains the unscrambled harmonics). If the time delay is sufficient, the brain can detect the properties of the direct sound before it is overwhelmed by early reflections and reverberation.
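A toy version of the comb-filter idea may help. In the sketch below - my own illustration, not the network itself - the critical-band envelope is modeled as a 1 + cos modulation at an assumed 220 Hz source pitch, and a bank of equally spaced tap sequences is scanned; the sequence whose spacing matches the modulation period sums coherently and wins.

% Toy comb-filter pitch detector: equally spaced tap sequences act
% as pitch-selective combs on the band envelope.
fs = 44100;
f0 = 220;                            % assumed true source pitch
t = (0:fs/2-1)'/fs;
env = 1 + cos(2*pi*f0*t);            % stand-in for the band envelope
pitches = 150:400;                   % candidate fundamentals in Hz
ntaps = 10;                          % taps along the neural delay line
score = zeros(size(pitches));
for i = 1:numel(pitches)
    P = round(fs/pitches(i));        % tap spacing for this candidate
    idx = 1 + (0:ntaps-1)*P;         % equally spaced taps
    score(i) = sum(env(idx));        % coherent sum when the period matches
end
[~, best] = max(score);
fprintf('detected pitch: %d Hz\n', pitches(best));

In the full model each winning tap sequence would pass its stream on for comparison across critical bands (timbre) and between the two ears (azimuth).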
By studying this scrambling process through the properties of my neural model we can predict and measure the degree to which the pitch, azimuth, timbre, and distance of individual sources are preserved (or lost) in a particular seat in a particular hall. The model has been used to accurately predict the distance from the sound source at which localization and engagement are lost in individual halls, using only binaural recordings of live music. This distance is often discouragingly close to the sound sources, leaving the people in most of the seats to cope with much older and less accurate mechanisms for appreciating the complexities of the music.
The network, along with the author's many experiences with the distance perceptions of "near" and "far" and their relationship to audience engagement, is described in the three preprints written for the International Conference on Acoustics (ICA2010) in Sydney. The effects were demonstrated (to an unfortunately small audience) at the ISRA conference in Melbourne, which concluded on August 31, 2010. The three preprints are available at the following links: "Phase Coherence as a measure of Acoustic Quality, part one: The Neural Network"
"Phase Coherence as a measure of Acoustic Quality, part two: Perceiving Engagement"
"Phase Coherence as a measure of Acoustic Quality, part three: Hall Design"
I believe this model is of high importance both for the study of hearing and speech, and for the practical problem of designing better concert halls and opera houses. The model - assuming it is correct - shows that the human auditory mechanism has evolved over millions of years for the purpose of extracting the maximum amount of information from a sound field in the presence of non-vital interference of many kinds. The information most needed is the identification of the pitch, timbre, localization, and distance of a possibly life threatening source of sound. It makes sense that most of this information is encoded in the harmonics of complex tones - not in the fundamentals. Most background noise has a spectrum inversely proportional to frequency, and thus is much stronger at low frequencies than at high frequencies. In addition, the harmonics - being at higher frequencies - contain more information about the pitch of the fundamentals than the fundamentals themselves, and are also easier to localize, since the interaural level differences at high frequencies are much greater than at low frequencies.
The proposed mechanism is capable of separating the pitch information from multiple sources into independent neural streams, allowing a person's consciousness to choose among them at will. This is the well-known "cocktail party effect", where we can choose to listen to one out of four or more simultaneous conversations. The proposed mechanism solves this fundamental problem, as well as providing the pitch acuity of a trained musician. The way it detects pitch also explains many of the hitherto unexplained properties of our perception of music and harmony.
September 2 found me in Berlin working on a LARES system in the new home of the Deutsche Staatsoper Berlin in the Schillertheater. This was a real learning experience! The Staatsoper wanted the 20+ year old LARES equipment to be re-installed in the new location, and with the help of Müller-BBM we succeeded. Barenboim was satisfied. I was aching to install our newest version of the LARES process, but the final hardware is not ready. I was able to patch in a prototype late at night, and it performed very well indeed. I could raise the reverberation time of the hall well beyond two seconds at 1000Hz with no hint of coloration. The new hardware should be available soon, and I can't wait.
On September 8th I had the honor of giving a two hour lecture at Müller-BBM GmbH in Munich. The slides for this lecture contain updated Matlab code for the LOC measure for localization. The code will open a Microsoft .wav file that contains a binaural impulse response. The file is assumed to be two-channel. The program then calculates LOC assuming that the sound source is on the left side. The code also plots a graphic of the number of nerve impulses from the direct sound and the number of nerve impulses from the reverberation. The slides are available here: "Listening to Acoustics"
The slides presented in Sydney and Melbourne are mostly the same as the ones in the following link, which were presented to audiences in Boston and Washington DC. These slides are more extensive than the ones for Sydney, as more time was available for the presentation. A relatively complete explanation of the author's equation for the degree of localization and engagement, along with Matlab code for calculating the parameter, is included in the slides. "The Relationship Between Audience Engagement and Our Ability to Perceive the Pitch, Timbre, Azimuth and Envelopment of Multiple Sources" contains the slides from the talks given in Boston and Washington DC.
The purpose of the research into hearing described in these slides is not necessarily to discover the exact neurology of hearing, but to understand why and how so many of our most important sound perceptions are dependent on acoustics - particularly on the strength and time delay of early reflections. Early reflections are currently presumed to be beneficial in most cases, but my experience has shown that excessive early reflections strongly detract from both the psychological clarity and the engagement of a sound. The slides are pretty hard-hitting. They can be read together with the slides from the previous talk below - from which they partially borrow.
Thanks to Professor Omoto and his students at Kyushu University for pointing out that the equation for LOC in "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment" was nonsense. I had inverted the order of the integrals and the log function. Many thanks to Omoto-San. I corrected the slide, and added some explanations. I also added MATLAB code for calculating it, which might help in trying to understand it.
I have recently been writing some reviews and a brief note on reverberation for the Boston Musical Intelligencer - www.classical-scene.com. You might want to check out the site. As part of the effort to understand how the ear hears music I am planning several presentations at the ICA conference in Sydney, and a listening demonstration seminar in Melbourne this summer.
A sort-of recent addition to this site is a brief note on the relationship between the perception of "running liveness" and the independent perception of direct sound. My most recent experiments (with several subjects) show that envelopment - and the sense of the hall - is greatly enhanced when it is possible to precisely localize instruments. This occurs even when the direct sound is 14dB or more lower in energy than the reflected or reverberant energy. The note was written as a response to a blog entry by Richard Talaske. The link to Talaske's blog is included in the note. "Direct sound, Engagement, and Running Reverberation"
My latest research has focused on the question of audience engagement with music and drama. I was taught to appreciate this psychological phenomenon by several people - including, among others, Peter Lockwood (assistant conductor at the Amsterdam Opera), Michael Schønwandt in Copenhagen, and the five major drama directors in Copenhagen who participated in an experiment with a live performance several years ago. The idea was given high priority after a talk by Asbjørn Krokstad at the IoA conference in Oslo last September. Krokstad gave me the word engagement to describe the phenomenon, and spoke of its enormous importance (and current neglect) in the study and design of concert halls and operas.
His insight motivated the talk I gave in Brighton the following month, also courtesy of the IoA. But the work was far from done. If one thinks one has found a vitally important perceptual phenomenon that most people are unaware of, it is essential to come up with a mathematical measure for it. I was not sure this was possible, but went to work anyway. Just before being scheduled to give another talk on this subject in Munich for the Audio Engineering Society I came up with a possible solution: a simple mathematical equation that predicts the threshold for detecting the localization (azimuth) of a sound source in a reverberant field. You plug in an impulse response - and it gives you an answer in dB. If the value is above zero, I can localize the sound. If it is below zero, I cannot. Whether this equation works for other people I do not know, but Ville Pulkki and his students may be interested in finding out. Although azimuth detection is NOT the same as engagement, I believe it is a vital first step. For me, engagement seems to correspond to the azimuth detection equation returning a value above +3dB.
The talk - there is no paper yet - that describes this work (and a great deal besides) is here: "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment" The talk here is the one I presented at the Portland meeting of the ASA, and is slightly different from the one in Munich. It contains audible examples that should work when clicked. The Munich talk is here: "The importance of the direct to reverberant ratio... Munich" I found some errors in the original presentation - this is correct as of 5/14/09
There is a preprint available from the AES that contains most of the talk - but does not include the new equation. "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment" In January I traveled to Troy, NY to talk with Ning Xiang and his students about my current research. Part of the result was a renewed interest in publishing my work on the localization mechanisms used in binaural hearing and, because of these mechanisms, on the importance of doing measurements at the eardrum and not at the opening of the ear canal - or, even worse, with a blocked ear canal. This talk was recently revised for presentation at the Audio Engineering Society Convention in Munich. The link here is to the version as of May 9, 2009. "Frequency response adaptation in binaural hearing"
As part of this effort I went back and scanned my 1990 paper on sound reproduction with binaural technology. I found the paper to be long, wordy, and far too full of ideas to be very useful. But for the most part, I think it is very interesting and correct. Where I disagree with it now I have added some comments in red. I think it makes interesting reading if you can stick with it. "Binaural Techniques for Music Reproduction" This paper says a great many correct things about binaural hearing, and agrees closely with my current work, coming to the identical conclusions - namely that blocked and partially blocked ear canal measurements do not properly capture the response of headphones. You have to measure at the eardrum - or, as I have recently developed, use loudness comparisons to find the response of headphones at the eardrum. The use of a frontal sound source as a reference is also recommended, both for headphone equalization and for dummy head equalization. I will put a copy of the AES journal paper on dummy head equalization on this site in the near future.
In November I traveled to the Tonmeistertagung in Leipzig, followed by the IOA conference on sound reproduction in Brighton. In Leipzig I gave two lectures, one on the importance of direct sound in concert halls, and another on headphones and binaural hearing. The talk on concert halls was repeated in Brighton in a longer form as the Peter Barnett Memorial Lecture. I was able to demonstrate many of the perceptions with 5.1 recordings in Brighton, thanks to Mark Bailey. "The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment"
The concert hall paper had an important message: current research on concert halls has tested the case where the direct sound is stronger than the sum of all the reflected energy. This domain is appropriate to recordings - but it is inappropriate to halls, where the energy in the direct sound is far less than the energy in the reverberation in most of the seats. When laboratory tests are conducted using realistic levels of direct sound, very different results emerge.
We find that in this case, when the brain can separately detect the direct sound in the presence of the reverberation, the music or the drama is enhanced, and audience involvement is maximized. But there must be a time gap between the arrival of the direct sound at a listener and the onset of the reverberation if this detection is to take place. In a large hall this gap generally exists, and clarity and involvement can occur over a range of seats. In a small hall the gap is reduced, and the average seating distance must be closer to the sound source if clarity is to be maintained.
Conclusion: Don't build shoebox halls for sizes under 1000 seats!
This paper presentation included some wonderful 5.1 demos of the sound of large halls and small halls with different D/R ratios and time gaps. I will include these here - but not yet. Stay tuned.
The paper on headphones goes into detail about the mechanisms the brain uses for detecting timbre and localization. It includes some interesting data on the unreliability of blocked ear canal measurements of HRTFs. Highly recommended to anyone who wants to know why head-tracking is assumed to be necessary for binaural reproduction over headphones. An obvious conclusion: if head-tracking is necessary, you KNOW the timbre is seriously in error. "Frequency response adaptation in binaural hearing"
A Finnish student requested that I add the lecture slides for a talk I gave in October 2008 to the Finnish section of the Audio Engineering Society. The subject was "Recording the Verdi Requiem in Surround and High Definition Video." These slides can be found with this link. However, I notice that I had previously put most of these slides on this site - together with some of the audio examples you can click to hear. This version can be found below in the link to the talks given to the Tonmeister conference in 2006.
Last September I gave a paper at ICA2007 in Madrid, which presented data from a series of experiments that I had hoped would lend some clarity to the question of why some concert halls sound quite different from others, in spite of having very similar measured values of RT, EDT, C80 and the like. A related question was how we can detect the azimuth of different sound sources when the direct sound - which must carry the azimuth information - is such a small percentage of the total sound energy in most seats.
The experiments first calculated the direct/reverberant ratio for different seats in a model hall (with sanity checks from real halls) and then looked for the thresholds for azimuth detection as a function of d/r. Some very interesting results emerged. It seems that the delay time between the arrival of the direct sound and the time that the energy in the early reflections had built up to a level sufficient to mask the direct sound was absolutely critical to our ability to obtain both azimuth and distance information. In retrospect this point is obvious. Of course the brain needs time to separately perceive the direct sound and the information it contains - else this information cannot be perceived.
The necessity of this time delay has consequences. It means that as the size of a hall decreases, the direct to reverberant ratio over all the seats must increase if sounds are to be correctly localized and the clarity that comes with this localization is to be maintained. To achieve this goal, it is desirable in small halls that the audience sit on average closer to the musicians, and that the volume needed to provide a longer reverberation time be obtained through a high ceiling, and not through a long, rectangular hall. Halls of fewer than 1500 seats should probably look more like opera houses with unusually high ceilings. They should not be shoebox in shape. Fortunately, Boston has such a hall: Jordan Hall at New England Conservatory (1200 seats) is the best hall I know of worldwide for chamber music or small orchestral concerts. It has a near opera house shape with a single high balcony, and a very high ceiling. The sound is clear and precise in nearly every seat, with a wonderful sense of surrounding reverberation. Musicians love it, as does the local audience.
Alas, the paper went over like a lead balloon. It was way too dry, and it seemed no one understood how it could possibly be relevant to hall design. So I quickly re-did the paper I was to present in Seville the next week to answer the many questions I received after the one in Madrid. The PowerPoint slides for this presentation are here: "Why do concert halls sound different – and how can we design them to sound better?" Hopefully this gets the ideas across better.
At the request of a reader I have put up the sound files associated with the lecture at the RADIS conference which followed ICA2004. "The sound files for RADIS2004" These include several (pinna-less) binaural recordings from opera houses, and both music and speech convolved with impulse responses published by Beranek. The lecture slides can be found at the following link: "Slides for RADIS2004"
David Griesinger is a physicist who works in the field of sound and music. Starting in his undergraduate years at Harvard he worked as a recording engineer, through which he learned of the tremendous importance of room acoustics in recording technique. After finishing his PhD in Physics (the Mössbauer Effect in Zinc 67) he developed one of the first digital reverberation devices. The product eventually became the Lexicon 224 reverberator. Since then David has been the principal scientist at Lexicon, and is chiefly responsible for the algorithm design that goes into their reverberation and surround sound products. He has also conducted research into the perception and measurement of the acoustical properties of concert halls and opera houses, and is the designer of the LARES reverberation enhancement system.
The purpose of this site is to share some of my publications and lectures. Most of the material on the site was written under great time pressure. The papers were intended as preprints for an aural presentation. Some of them are available as published preprints from the Audio Engineering Society. The papers should be considered drafts - they have not been peer reviewed.
Several PowerPoint presentations have also been added to the site. These presentations are often quite readable and informative. In general they are more coherent than the preprints of the same material, although they are naturally not as detailed. They can be read in conjunction with the preprint of a talk. They are available as .pdf downloads, which your browser may allow you to open with Adobe Acrobat.
Some recent work includes:
"Perception of Concert Hall Acoustics in seats where the reflected energy is stronger than the direct energy - or… Why do Concert Halls sound different - and how can we design them to sound better? - A poster paper presented at the AES conference in Vienna, May 7, 2007. "Two powerpoints given at the Tonmeistertagung Nov. 2006, and the AES convention October 2006" The subjects are "Distance Effect and Muddiness" and "Recording the Verdi Requiem in Surround Sound and High-Definition Video" These lectures are presented in a .zip file so the audio examples can be included. It is recommended you unzip the package in a single directory, and then view the slides with powerpoint. The audio examles should work when clicked.
Recent work on headphone calibration [October 2008] - done partly with the students of Ville Pulkki at Helsinki University - has shown that all the types of headphones we tested have large variations in frequency response for different individuals AND that these differences make for very different perceptions of the sound quality of binaural recordings from halls. It is possible to compensate for these errors by matching headphones to listeners through noise band loudness matching. Thus before listening to the music examples in the above papers using headphones or typical computer speakers, please read the following: "The necessity of headphone equalization" This link has been substantially re-written. Simple loudness matching is not sufficient. A reference loudspeaker is required - and loudness differences must be compensated.
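To give a flavor of what noise band loudness matching involves, here is a small Matlab sketch - my own illustration, not the procedure in the linked note: it generates equal-RMS third-octave noise bands that a listener could match in loudness between a reference loudspeaker and the headphones, the matching differences giving the headphone response at the eardrum.

% Third-octave noise bands for loudness matching (illustration only).
fs = 44100; n = fs;                        % one second per band
fc = 1000 * 2.^((-6:6)/3);                 % band centers, ~250 Hz to 4 kHz
bands = zeros(n, numel(fc));
for i = 1:numel(fc)
    edges = fc(i) * 2.^([-1 1]/6);         % third-octave band edges
    [b, a] = butter(4, edges/(fs/2));      % band-pass filter
    v = filter(b, a, randn(n, 1));
    bands(:, i) = v / sqrt(mean(v.^2));    % equal RMS in every band
end
% soundsc(bands(:, 7), fs)                 % audition the 1 kHz band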
"Pitch Coherence as a Measure of Apparent Distance and Sound Quality in Performance Spaces" This is a preprint of a poster paper presented at the Institut of Acoustics conference in Copenhagen, May 2006. The preprint contains some audio examples, that can be heard by clicking on the links.
An effort to clean house resulted in scanning a few old papers: "Reducing Distortion in Analog Tape Recorders" JAES March 1975, "The Mössbauer Effect in Zn67" Phys Rev B vol 15 #7 p 3291, April 1 1977, and "Spaciousness and Localization in Listening Rooms - How to make Coincident Recordings Sound as Spacious as Spaced Microphone Arrays" JAES v34 #4 p255-268, April 1986. I still find the paper on spaciousness to be interesting and insightful. The observations continue to be relevant to my work, particularly on the subject of the interaction between loudspeakers and rooms at low frequencies. It took many years before I was able to explain the observed effects. For example, Dutton's work on the apparent localization errors with conventional stereo loudspeakers at high frequencies is fully explained in "Stereo and Surround Panning in Practice". The effects of low frequency room modes on spatial properties are fully explained in "Loudspeaker and listener positions for optimal low-frequency spatial reproduction in listening rooms".
The early paper on spaciousness and localization is much too optimistic about the possibility of increasing the spaciousness of a listening room by increasing the low frequency separation. In most rooms, where the low frequency modes do not correctly overlap, the low frequency separation is inaudible. Increasing it only stresses the loudspeakers. The best solution is to drastically change the loudspeaker positions, or to change the room dimensions.
New in October of 2005 are the slides from a lecture on recording technique, given in Japan for the Audio Engineering Society, and in Schloss Hohenkammer for the Tonmeister conference in October. I have also finally added "Griesinger's Coincident Microphone Primer", a paper from 1987 that describes and mathematically analyzes much of the behavior of coincident microphone arrays, including the Soundfield microphone. New in May of 2005 is a paper on room dimensions and loudspeaker placement for the reproduction of envelopment at low frequencies, presented at the Acoustical Society meeting in Vancouver. There are more papers listed at the bottom of the page... don't give up here!

Bibliography (somewhat out of date as of 5/05):

"Griesinger's Coincident Microphone Primer" Oct. 1985 - available from the author
"Spaciousness and Localization in Listening Rooms and their Effects on the Recording Technique" JAES v34 #4 p255-268, April 1986
"New Perspectives on Coincident Microphone Arrays" presented at the 82nd convention of the AES, preprint 2464
"Neue Perspektiven für koinzidente und quasikoinzidente Verfahren" [New perspectives for coincident and quasi-coincident techniques] Bericht 14. Tonmeistertagung, München 1988
"Verbesserung der Lautsprecherkompatibilität von Kunstkopfaufnahmen durch herkömmliche und räumliche Entzerrung" [Improving the loudspeaker compatibility of dummy head recordings through conventional and spatial equalization] Bericht 15. Tonmeistertagung, Mainz 1988
"Practical Processors and Programs for Digital Reverberation" Proceedings of the AES 7th International Conference
"Equalization and Spatial Equalization of Dummy-Head Recordings for Loudspeaker Reproduction" JAES 37 #1/2 p20-28, 1989
"Theory and Design of a Digital Audio Processor for Home Use" ibid. p 20-29
"Binaural Techniques for Music Reproduction" Proceedings of the 8th International Conference of the AES, 1990, p197-207
"Study of Acoustical Enhancement Systems, leading to the use of time variant synthetic reverberation" ASA meeting, PA, Mar. 1990
"Improving Room Acoustics through time variant synthetic reverberation" AES convention
"Room Impression Reverberance and Warmth in Rooms and Halls" presented at the 93rd AES convention
"Measures of Spatial Impression and Reverberance based on the Physiology of Human Hearing" Proceedings of the 11th International AES Conference, May 1992, p114-145
"IALF - Binaural Measures of Spatial Impression and Running Reverberance" presented at the 92nd convention of the AES, March 1992, preprint #3292
"Analysis of Room Impulse Responses based on Perception" 5/14/93 - available from the author
"Quantifying Musical Acoustics through Audibility" Knudsen Memorial Lecture, Denver ASA meeting, Nov. 1993
"Subjective Loudness of Running Reverberation in Halls and Stages" Proceedings of the Sabine Memorial Conference, MIT, June 1994 - available from the Acoustical Society of America
"Progress in Electronically Variable Acoustics" ibid.
"Reverberation Level Matching Experiments" W. Gardner and D. Griesinger, ibid.
"Wie Laut ist mein Nachhall?" [How loud is my reverberation?] proceedings of the Tonmeister Convention
"Further Investigation into the Loudness of Running Reverberation"
"Optimum reverberant level in halls" International Conference on Acoustics
"Feedback reduction and acoustic enhancement in a cost effective digital sound processor" International Conference on Acoustics
"Design and performance of multichannel time variant reverberation enhancement systems" Proceedings of the 1995 International Symposium on Active Control of Sound and Vibration

Patents:

"Sound Reproduction" - A directionality enhancement system for converting encoded stereo signals into four output channels. #4,862,502, 7/29/1989
"Sound Reproduction" - A directionality enhancement system for converting encoded stereo signals into 6 or 7 output channels. #5,136,650, 1992
"Electroacoustic System" - A system of microphones and loudspeakers in conjunction with computer based electronics for altering and improving the acoustics of spaces. #5,109,419, 4/28/1990
"A Spatial Impression Meter" 1993

David Griesinger is a physicist interested in sound - the sound of music.
He is particularly interested in translating subjective impressions of sounds into the physics of sound propagation, and the psychoacoustics of sound perception. He has found that although it is wonderful to discover ways to improve the quality of a reproduced sound, it is far more useful and powerful to understand exactly how the improvement was achieved.

This interest started with work as a recording engineer. Through college and graduate school I recorded concerts and made records for student organizations. The need for better microphones led to work in microphone design and construction, starting in 1964 with the construction of omnidirectional condenser microphones. In about 1985 I designed and constructed a miniature Soundfield microphone (16mm diameter), and in about 1990 I made a dummy head microphone for classical recordings. Most of the work on microphones has not been described in publications, but a paper did appear in the Journal of the Audio Engineering Society on the equalization of dummy head microphones. This paper is also available in German from the Deutsche Tonmeister Verband: "Verbesserung der Lautsprecherkompatibilität von Kunstkopfaufnahmen durch herkömmliche und räumliche Entzerrung" [Improving the loudspeaker compatibility of dummy head recordings through conventional and spatial equalization] Bericht der 15. Tonmeistertagung.

Early work in this field produced a paper in the Audio Engineering Society journal on distortion reduction in magnetic tape recorders, and a paper on image localization (as a function of frequency) from two channel sound equipment in small rooms: Griesinger, D. "Spaciousness and Localization in Listening Rooms - How to make Coincident Recordings Sound as Spacious as Spaced Microphone Arrays" JAES v34 #4 p255-268, April 1986. This paper is still interesting to me, although it took more than 20 years for me to develop the knowledge and techniques to predict the results from first principles.

The work as a recording engineer also led to an abiding interest in artificial reverberation, and this eventually resulted in the development of the Lexicon digital reverberation devices. Alas, due to problems with trade secrets this work remains unpublished.

In about 1990 I started installing reverberation units in spaces used for musical performances, in an effort to improve the acoustics for live performances. This work led eventually to the development of the LARES system for acoustic enhancement. This work is described in the paper "Improving Halls and Rooms with Multiple Time Variant Reverberation" which is on this site. Unfortunately this paper is not yet available to me with machine readable drawings, and is presented here without them. Perhaps eventually we will have the complete paper. I still consider this paper a classic - although the precise method of randomizing the reverberation devices is deliberately not described (sorry... you have to draw the line somewhere.)

LARES works wonderfully well - but I learned quite quickly that conventional acoustical measurement techniques were useless for describing its performance. The glaring mismatch between what you could easily hear in a hall and the measurements one could make resulted in a serious study into the perception of acoustics. A flurry of papers resulted - all more or less wrong.

I also did considerable work on technical methods of room measurement. At least two interesting papers resulted - see the 1992 paper "Impulse response measurements using All-Pass deconvolution" and the later paper on occupied hall measurement, "Beyond MLS - Occupied Hall Measurement With FFT Techniques". I am actually quite proud of both papers. The all pass deconvolution method is amazingly clever and efficient. You simply play this strange time-reversed signal into the room, and play the result through a simple reverberator. Instant impulses result - quite amazing. The sweep method is actually much more effective, but far less clever.
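The principle is easy to demonstrate numerically. The sketch below is my illustration of the general idea, not the procedure from the paper: because an allpass filter has unit magnitude at all frequencies, its time-reversed impulse response is also its inverse filter, so playing the recorded result back through the same allpass collapses the excitation into an impulse.

% Allpass deconvolution demo: the time-reversed impulse response of
% an allpass chain, convolved with the chain itself, yields an impulse.
fs = 44100;
g = 0.7; d = [113 241 397];              % allpass delays (samples) and gain
h = [1; zeros(8191, 1)];                 % start from a unit impulse
for k = 1:3                              % chain of three Schroeder allpasses
    b = [-g; zeros(d(k)-1, 1); 1];
    a = [1; zeros(d(k)-1, 1); -g];
    h = filter(b, a, h);                 % impulse response of the chain
end
test = flipud(h);                        % time-reversed signal to play
room = randn(2000,1) .* exp(-(0:1999)'/400);  % toy room impulse response
rec = conv(room, test);                  % what the microphone records
ir = conv(rec, h);                       % "play through the reverberator"
plot(ir);                                % a clean copy of the toy room response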
Conventional measures were clearly missing the point - but for a long time, so was I. About ten years ago this work started to converge into a coherent (at least I think it is coherent) hypothesis about how we perceive the acoustics of enclosed spaces.

It turns out that acoustic perception relies on two very different phenomena. The most basic is the detection of reflected energy by the hearing system. This detection relies on fluctuations in the Interaural Time Delay (ITD) and the Interaural Intensity Difference (IID). The fluctuations are caused by interference between the direct sound from a source and delayed reflected sound. The creation of fluctuations is a physical process - it can be easily modeled and predicted. The other piece of the puzzle takes place much later in the neural process, and is related to the process of separating incoming sound events into related streams of information, such as the syllables of speech from a single person. It turns out there is neurology for this separation process. This neurology organizes sound events into one or more foreground streams. But there is also neurology that keeps track of the loudness and the sound direction of background sound in the spaces between sound events. Our perception of the background also forms a stream - but this one is perceived as continuous, and has specific spatial properties. The neurology associated with the background stream is the primary source of our perception of musical envelopment, and so the spaces between musical notes are vital to this perception. The separation of the background stream from the foreground stream takes time. Reflected energy that arrives too soon after the end of a note is perceived as part of the note itself, and does not contribute to envelopment. It is only after 100ms or more that reflected energy is really heard as background reverberation, and understanding this time delay is vital to understanding how halls and operas are perceived with music. The whole hypothesis is best described in the July 1997 article in ACTA Acustica. The same material is contained in a somewhat longer preprint for JAES: "Spaciousness and envelopment in musical acoustics." The JAES preprint also includes a section on how the hypothesis applies to the practical improvement of halls and operas. This part has not yet appeared in Acustica.
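Because the fluctuation mechanism is physical, it is easy to simulate. The sketch below is my own toy model with made-up delays and levels: a frontal direct sound is mixed with a single delayed lateral reflection, and the short-time interaural delay measured from the two ear signals wanders from frame to frame.

% A delayed lateral reflection makes the measured ITD fluctuate.
fs = 44100; n = fs;                                      % 1 s of signal
s = filter(fir1(256, [400 800]/(fs/2)), 1, randn(n,1));  % band-limited source
dr  = round(0.020*fs);                   % reflection 20 ms after the direct
itd = round(0.0005*fs);                  % ~0.5 ms interaural delay (lateral)
refl = 0.7*[zeros(dr,1); s(1:n-dr)];     % the reflection, from the left
L = s + refl;                            % left ear: direct + reflection
R = s + [zeros(itd,1); refl(1:n-itd)];   % right ear: reflection arrives later
frame = round(0.01*fs);                  % 10 ms analysis frames
for k = 1:20
    i = (k-1)*frame + (1:frame);
    [c, lags] = xcorr(L(i), R(i), 2*itd);
    [~, m] = max(c);
    fprintf('frame %2d: ITD = %5.2f ms\n', k, 1000*lags(m)/fs);
end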
The concept of interaural fluctuations has been used to solve a very old riddle - the riddle of how many independent bass drivers one needs in a sound system in a small room, and where these drivers should be put. To make a long story short: you need at least two low frequency drivers, and ideally they should be on either side of the listeners. This work is described in the papers on small room acoustics. The latest paper on this subject is the one presented in Vancouver in 2005: "Loudspeaker and listener positions for optimal low-frequency spatial reproduction in listening rooms" This paper is highly recommended. Others are listed at the bottom of this page.
Much of the work described in the above paper was done using the MATLAB language. Hardcore researchers might be interested in the code that was used. This is available, with NO instructions, in the following file. Please email the author if you wish to use this code. For this purpose, use the email address in the picture.

The site also includes a recent paper on reverberation enhancement. Be sure to check out the lecture slides for this paper - they are much more interesting.

The next item is the lecture notes for a workshop at the September 1999 Audio Engineering Society Convention. In this workshop I had about two hours to cover the essentials of recording technique for surround sound. It was a lot of fun - but a great deal of what was said is not in the notes. I believe the AES made a cassette recording. This might be worthwhile.

The following lecture slides were presented at the meeting of the Acoustical Society in

The AES conference in October 2000 was fun, but the slides were prepared in more than the usual rush. Basically nothing new here, particularly in the first one. Diehard fans might get something out of the second, but the

The Tonmeister conference in

Alas, in most cases with large forces the actual level of the "support" microphones in the final mix is larger than the level of the "main" microphone, so in practice the roles are reversed. There is nothing intrinsically wrong with this confusion - but it leads to some rather bizarre recommendations, such as delaying the output of the "support" microphones so the time of arrival of the wavefront comes after the signal from the "main" microphone. The remarkable thing is that adding such a delay does not sound as strange as one might expect. But in my experience it always sounds worse than no delay at all. Once again the lecture slides may give the better picture, but you may want to look at both the preprint and the slides.

"Perceptual Modeling" was a term invented by one of our advertising agents to describe the design of the reverberation controls in the Lexicon 960. I don't think it means anything at all, which is good for marketing. But the above paper is quite a useful description of how to use reverberation to control the apparent distance of a sound source. We have been doing this with our products for years, of course, and the process is well described in the Lexicon 480 manual with the "ambience" algorithm. However, outside the manuals I made no real attempt to publish the concepts, leading to some rather interesting claims by others of having discovered it all.

An interesting issue came up at this Tonmeister conference. Günther Theile played a tape made by some of his students, where they compared the hall pick-up from four omnidirectional microphones spaced in a square array at different distances. Unfortunately I was unable to understand exactly the conditions of the experiment, but the closest set of microphones used a spacing of ~25cm. In a quick listening test in the listening room at the show, with about 50 people present, the closest spacing seemed to be generally preferred over the wider spacings.

The result seems to contradict an assumption that I make in nearly all the work I have done - that uncorrelated reverberation sounds better than correlated reverberation. The reasons for this result are unclear. The suggestion offered at the time - that the closer spacing allowed better imaging of the sides of the room - seems unlikely, among other things because side imaging does not exist for a forward facing listener. In an effort to resolve this issue - which I take to be of the highest importance - I wrote a note to Eberhard Sengpiel. The note is included here for those who think the issue is as important as I do.

The next series of references are to slides for the ICA 2001 conference. These references are from the conference of the Audio Engineering Society in
The next paper, on stereo and surround panning in practice, is pretty good, I think. I wrote it because we were having difficulty preserving the apparent horizontal direction (azimuth) of sound sources in a two channel stereo image when we converted the two channels to 5 or 7 channels with the Logic 7 algorithm. This is interesting because L7 was designed assuming the standard sine/cosine pan law to be correct. We detected the left/right balance of a front sound source, and used the sine/cosine law to find the azimuth. We then adjusted the balance in the three front channels to present it with the same azimuth. Alas, this does not work. We traced the problem to the two channel sine/cosine pan law, which is seriously wrong for most musical sources. (Curiously, the three channel version - that is, panning from a center speaker to either left or right - works quite well.) The reason for all this is to be found in binaural theory. It turns out that in two channel panning the perceived azimuth is highly frequency dependent, with frequencies above 1000Hz sounding much wider than the sine/cosine law would predict. Suitable averaging over frequency gets the right answer. For some reason this paper has remained undeservedly obscure.
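For reference, this is the standard sine/cosine (constant power) pan law the paper argues against - a minimal sketch of the law itself, not of the Logic 7 processing:

% Sine/cosine pan law: constant total power at every pan position.
% The point above: the azimuth a listener actually perceives from
% these gains is strongly frequency dependent.
theta = linspace(0, pi/2, 5);   % full left ... center ... full right
gL = cos(theta);                % left channel gain
gR = sin(theta);                % right channel gain
disp([gL; gR]);                 % the gains at five pan positions
disp(gL.^2 + gR.^2);            % total power = 1 everywhere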
Another lecture on surround for the Tonmeisters. I think both the message and the slides get better the more I do it.

And now for something completely different... Being currently over 60, and having in my youth studied information theory, I have a low tolerance for claims that "high definition" recording is anything but a marketing gimmick. I keep, like the Great Randi, trying to find a way to prove it. Well, I got the idea that maybe some of the presumably positive results on the audibility of frequencies above 18000Hz were due to intermodulation distortion, which would convert energy in the ultrasonic range into sonic frequencies. So I started measuring loudspeakers for distortion of different types - and looking at the HF content of current disks. The result is the paper below, which is a HOOT! Anytime you want a good laugh, take a read.
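The mechanism is easy to show in a few lines. This is my illustration of the general effect, with made-up tone frequencies and distortion level: two ultrasonic tones through a weak second order nonlinearity produce a difference tone squarely in the audible band.

% Two ultrasonic tones through a quadratic nonlinearity create an
% audible difference tone (26 kHz - 24 kHz = 2 kHz).
fs = 96000;
t = (0:fs-1)'/fs;                            % one second of signal
x = sin(2*pi*24000*t) + sin(2*pi*26000*t);   % ultrasonic tones only
y = x + 0.1*x.^2;                            % weak 2nd order distortion
Y = abs(fft(y)) / numel(y);                  % spectrum, 1 Hz per bin
fprintf('level at 2 kHz: %.4f\n', Y(2001));  % nonzero: the difference tone
fprintf('level at 3 kHz: %.4f\n', Y(3001));  % ~zero, as expected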
Surround from stereo is my most complete explanation of Logic 7 and its workings. Worth checking out. Slides from the AES conference October 2003. The subject is converting stereo signals into surround.

And finally we get to something REALLY new. I had been working for some time on ways of measuring hall acoustic properties from binaurally recorded speech. It turns out to be pretty simple to learn a lot about LATERAL reflections from a running IACC. But medial reflections are trivial to hear in speech and music (at least when they approach the energy of the direct sound), and the detection process (whatever it is) is very robust. I decided to submit a preprint without knowing how to solve this problem, figuring the pressure of the due date might make some progress. Sure enough, the due date came around with no solution. Two weeks of very hard work... and I had an answer. It turns out that we detect medial reflections through their effect on the audibility of pitch! This ability (on signals from a single source with a defined pitch) is, I believe, the primary distance cue. Ultimately I believe the methods shown here will lead to a new (and quite useful) measure for the sound quality of rooms. The paper for the

Be sure to check out the method of deriving listenable sound examples from Leo Beranek's published echograms!

I am deeply honored that Leo Beranek chose me to share his lecture to the Acoustical Society, in honor of its 50th birthday, and Leo's 90th. Spurred on by the thought that I could easily make an ass of myself, I put together a pretty good lecture. Highly recommended.

Bill Martins put together a little dog and pony show about low frequency spatial reproduction in small rooms. In honor of this I made this paper on how to determine the optimal room dimensions and speaker placement for spatial reproduction at low frequencies. You can do it in an hour on the back of an envelope if you know the room dimensions. It turns out that square rooms are pretty much impossible. If you have one, tear it down!

Progress in 5-2-5 Matrix Systems
Speaker placement, externalization, and envelopment in home listening rooms
General overview of spatial impression, envelopment, localization, and externalization
The .zip compressed Matlab code for experiments with DFT and externalization. Requires a working MATLAB C compiler to be practical.
Recent experiences with electronic acoustic enhancement in concert halls and opera houses
Lecture notes from the September 27, 1999 AES workshop
Recent experiences with electronic acoustical enhancement in concert halls, opera houses, and outdoor venues - the lecture slides without pictures.