Progress in 5-2-5 Matrix Systems

David Griesinger

Lexicon

3 Oak Park

Bedford MA 01730

Abstract

A high quality 5-2-5 matrix encoder and decoder system offers the prospect of inexpensive compatible media for multichannel sound. The advantages to the consumer, music and film producers, and broadcasters, are obvious. This paper reports on a system which offers excellent 5-2-5 codec performance, while preserving or improving the balance, frontal perspective, and spaciousness of standard stereo recordings. The decoder provides two or four independent rear outputs, which are capable of complete separation from the other outputs for a single steered sound effect, and which preserve full left/right separation during music. Decorrelated signals such as music can be panned forward and back with full left/right separation. Frontal perspective and the balance between center material such as dialog and vocals and other material is preserved through careful control of the center channel level as a function of the center content of the input signal. This paper will present a mathematical description of the matrix elements of the new decoder, and discuss some of the psychoacoustic data on which it is based.

Introduction

Although initially developed for multichannel music reproduction, matrix systems have been relegated to film sound. They are capable of much more. A preliminary design for a new matrix topology has been tested by the IRT in Munich as a 5-2-5 codec, with excellent results on a wide range of broadcast material. Although there were audible differences, they were perceived as small changes in localization – and were sometimes preferred to the original. We have extensively tested the new matrix with ordinary stereo music material. In almost every case the multichannel matrix reproduction of the material is preferable to a two-channel presentation. This is a wonderful way to hear new sounds from your favorite recordings, and amazing sounds from recordings which have been remixed for 5.1 channels. A high quality 5-2-5 matrix offers a Rosetta stone for audio reproduction. A single inexpensive circuit can play both encoded and unencoded music, films, and broadcast material. The advantage to the consumer is obvious – high quality multichannel recordings available on compatible CDs, cassettes, videotapes, etc. The recordings can be played anywhere the consumer has a player, and yet on a multichannel system true multichannel audio results. Who wouldn’t want to hear multichannel broadcasts in an automobile?

Why do we need more than two loudspeakers? Research into the spatial acoustics of small rooms shows that reproduction of stereo music through two speakers is not an optimal solution, even when the listener is ideally situated. Additional loudspeakers, driven with signals that provide audible spatial components, can significantly increase the pleasantness of the sound field, and enlarge the listening area. This is particularly true when speakers are placed along the sides of the room.

The primary improvement is in the perception of spaciousness. To see how multiple speakers can help, we can look at the perceptual origin of spaciousness. Spaciousness in concert halls and in small rooms is primarily determined by the spatial diffusion of sounds that arrive at least 160ms after the ends of strong foreground sound events. It is the spatial properties of the background sounds between notes that determine how much we feel involved with and enveloped by music.

In a small room the spatial properties of low frequencies are primarily determined by the ratio of the lateral room modes to the back/front and the vertical modes. It is the interference between these two modal types that determines how spacious the sound is. The ratio of the two modal types is strongly influenced by the location and spacing of the loudspeakers in the room, and by how the recording was made. At higher frequencies the spatial properties are determined by the frequency of interest and the spacing between the front loudspeakers. At some frequencies even a relatively close spacing can produce substantial spaciousness, which is why ordinary stereo works at all.

However, at all frequencies it is primarily the REVERBERANT portion of the recorded sound which gives rise to the sensation of spaciousness. Thus it is possible to make a recording in which most of the instruments are pan-potted to positions near the center, and still have it sound spacious after stereo reverberation is added. In popular music such recordings are perhaps the rule rather than the exception. It is easy to show that two ordinary stereo loudspeakers in a small room cannot reproduce the spatial diffusion of a large hall or concert space. However, if the reverberant portion of the stereo signal can be reproduced through an array of loudspeakers at the sides of the listener(s), a far more satisfactory diffuse field can be created. Multiple speakers, if they are driven with independent decorrelated signals, can create a diffuse sound field both at low frequencies and at high frequencies.

A multichannel matrix audio system has two goals. First, the system should duplicate as well as possible the sound balances and localization created by the sound mixer and the producer. This should be true both for recordings that were mixed for multichannel surround and for normal stereo recordings. Second, both with five-channel material and with ordinary two-channel material the playback system should maximize the spatial diffusion of the background sound field. (A system which was compatible with the existing Pro-Logic standard would also be a plus.)

The first requirement, that localization and balance should be preserved, is decidedly tricky. For example, stereo recordings are routinely mixed for reproduction without a center loudspeaker. The sound mixer mixes the center channel information equally into the left and right channels of the recording. When we reproduce such a recording over a matrix system that includes a center speaker, we want exactly the same balance as the original, but we want a substantial amount of the sound power to come from the center loudspeaker. To achieve this goal we must drive the center speaker with a signal derived from the sum of the left and right input signals. However this sum contains not only the original center channel information, but the left and right stereo material as well. Reproducing the sum through the center loudspeaker must inevitably cause instruments located to the left and the right of the stereo image to move toward the center. The result is a loss both of spatial information and of the attractiveness of the mix. It is a goal of our matrix system to minimize this reduction in the width of the front image.

The goal of maintaining balance is also very important. It is not always possible to reproduce the original localization of a sound. However it is possible to preserve the loudness relationships between different sounds, and this must be done if the matrix system is to be compatible with two-channel stereo. We will explore these issues in detail in this paper.

The second requirement – that the reverberant component of the input signals should be reproduced with maximum spatial diffusion – can be achieved by rigorously maximizing the lateral decorrelation of the various outputs. This maximum lateral separation has been a design requirement of the Logic7 system from the beginning. The mathematics to achieve it has been steadily improving. This issue will also be explored in detail.

Matrix decoders in equations and graphics

In an AES paper in October of 1996 we presented the design of a matrix decoder that can be described by the elements of a two by n matrix, where n is the number of output channels. Each output can be seen as a linear combination of the two inputs, where the coefficients of the linear combination are given by the elements in the matrix. In this paper the elements are identified by a simple combination of letters. The previous paper described a five-channel and a seven-channel decoder. This paper will describe a five-channel decoder only.

It is obvious from symmetry that we need to describe the behavior of only six elements – the two center elements, the two left front elements, and the two left side elements. The right elements can be found from the left by simply switching the identity of left and right.

CL: The matrix element for the Left input channel to the Center output

CR: The matrix element for the Right input channel to the Center output

LFL: The Left input channel to the Left Front output

LFR: The Right input channel to the Left Front output

LRL: The Left input channel to the Left Side output

LRR: The Right input channel to the Left Side output

These elements are not constant. Their value varies as a two dimensional function of the apparent direction of the input sounds. All phase/amplitude decoders determine the apparent direction of the input by comparing the ratio of the amplitudes of the input signals. For example, the degree of steering in the right/left direction is determined from the ratio of the left input channel amplitude to the right input channel amplitude. In a similar way, the degree of steering in the front/back direction is determined from the ratio of the amplitudes of the sum and the difference of the input channels. We will not discuss the method for determining these steering directions in this paper, although Logic7 differs from standard decoders significantly in how this is done. These issues will be covered in another paper. We assume that the steering directions have been determined. In this paper we will represent these directions as angles – one angle for the left/right direction (lr), and one for the front/back (center/surround) direction (cs). The two steering directions are signed variables. When both lr and cs are zero the input signals are unsteered – that is, the two input channels are uncorrelated.

When the input consists of a single signal which has been directionally encoded the two steering directions have their maximum value. However under these conditions they are not independent. The advantage to representing the steering values as angles is that when there is only a single signal the absolute value of the two steering values must sum to 45 degrees. When the input includes some decorrelated material along with a strongly steered signal, the sum of the absolute values of the steering values must be LESS THAN 45 degrees.

|lr| + |cs| <= 45 degrees.

If we plot the values of the matrix elements over a two-dimensional plane formed by the steering values, the center of the plane will have the value 0,0 and the legal values for the sum of the steering values will not exceed 45. In practice, due to the behavior of the non-linear filters it is possible for the sum to exceed 45 – but we try to minimize this overrun. The mathematics presented here for the matrix elements is well behaved during overruns. However, when we graph the matrix elements we arbitrarily zero the values when the legal sum of the input variables is exceeded. This allows us to directly view the behavior of the element along the boundary trajectory – the trajectory followed by a strongly steered signal. The graphics were created by Matlab – which insists on labeling the axes incorrectly. On the Matlab axes the unsteered position is 45,45 (actually 46,46). Hopefully this will not be overly confusing.
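The plotting convention described above can be sketched as follows. This is only an illustration: the variable names and the placeholder surface Z are assumptions, not taken from the decoder software.

```matlab
% Sketch (assumed names): grid of steering angles in one-degree steps.
% Index 46 corresponds to 0 degrees, matching the Matlab axis labels.
[lr, cs] = meshgrid(-45:45, -45:45);

% Legal steering values satisfy |lr| + |cs| <= 45 degrees.
legal = (abs(lr) + abs(cs)) <= 45;

% Placeholder surface; a real plot would hold one matrix element here.
Z = ones(size(lr));

% Arbitrarily zero the element outside the legal region before plotting.
Z(~legal) = 0;

% mesh(Z) would then show the element with its boundary trajectory exposed.
```

With this masking the edge of the non-zero region is exactly the path followed by a single strongly steered signal.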

Previous designs for matrix decoders tend to consider only the response of the matrix to a strongly steered signal – that is, the behavior of the matrix elements around the boundary of our surfaces. This is a fundamental error in outlook. When you study real signals – either film or music – you find that the boundary of the surface is very seldom reached. For the most part signals wobble around the middle of the plane – slightly forward of the center. The behavior of the matrix under these conditions is of vital importance to the sound. When you compare our elements to previous elements you can see a striking increase in the complexity of the surface in the middle regions. It is this complexity which is responsible for the improvement in the sound.

Such complexity has a price. Our original (1987 – see the 1989 patent) design was simple to implement with analog components. The new elements are designed to be almost entirely described by one-dimensional lookup tables, which are trivial in a digital implementation. Designing an analog version with similar performance will not be easy.

In this paper we contrast several different versions of the matrix elements. The earliest are the elements from our 1989 patent. These elements were used in our first surround processor, and are identical to the elements of a standard surround processor in the left, center, and right channels. In our design the surround channel is treated symmetrically to the center channel. In the standard (Dolby) decoder the surround channel is treated differently, and this issue will be discussed at length later in the paper.

The elements presented here are not always correctly scaled. In general they are presented so that the unsteered value of the non-zero matrix elements for any given channel is one. In practice the elements are usually scaled so the maximum value of each element is one or less. In any case, in a final product the scaling of the elements is additionally varied in the calibration procedure. The matrix elements presented here should be assumed to be scalable by appropriate constants.

The left front matrix elements in our ’89 patent

Assume cs and lr are the steering directions in degrees in the center/surround and left/right axis respectively.

In the ’89 patent the equations for the front matrix elements are given as:

In the left front quadrant

LFL = 1-0.5*G(cs)+.41*G(lr)

LFR = -0.5*G(cs)

In the right front quadrant

LFL = 1-0.5*G(cs)

LFR = -0.5*G(cs)

In the left rear quadrant

LFL = 1-0.5*G(-cs)+.41*G(lr)

LFR = +.5*G(-cs)

In the right rear quadrant

LFL = 1-0.5*G(-cs)

The function G(x) is described in the ’89 patent, and specified in the ’91 patent. It varies from 0 to 1 as x varies from 0 to 45 degrees. When steering is in the left front quadrant (lr and cs are both positive) G(x) can be shown to be equal to 1-|r|/|l| where |r| and |l| are the right and left input amplitudes. See Figure 1 and Figure 2.

In the recent AES paper these elements were improved by adding the requirement that the loudness of unsteered material should be constant regardless of the direction of the steering. Mathematically this means that the root mean square sum of the lfl and lfr matrix elements should be a constant. It was pointed out in the paper that this goal should be relaxed in the direction of the steering – that is, when the steering is full left, the sum of the squares of the matrix elements should rise by 3dB. Figure 3 shows that the above matrix elements do not meet this requirement.

The 1996 AES paper corrected the amplitude errors in Figure 3 by replacing the function G(x) in the matrix equations with sines and cosines (see Figure 4):

For the left front quadrant

LFL = cos(cs) + .41*G(lr)

LFR = -sin(cs)

For the right front quadrant

LFL = cos(cs)

LFR = -sin(cs)

For the left rear quadrant

LFL = cos(-cs) + .41*G(lr)

LFR = sin(-cs)

For the right rear quadrant

LFL = cos(-cs)

LFR = sin(-cs)
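The effect of this substitution can be checked numerically along the lr = 0 axis in the front, where the G(lr) term vanishes. The sketch below assumes G(x) = 1 - tan(45 - x) with x in degrees, and the variable names are illustrative only.

```matlab
% Sketch: loudness of unsteered material in the left front output
% along the lr = 0 axis, for cs = 0..45 degrees (front steering).
cs = 0:45;
G  = 1 - tand(45 - cs);            % assumed form of G(x), x in degrees

% '89 patent elements (right front quadrant form, no G(lr) term)
lfl89 = 1 - 0.5*G;
lfr89 = -0.5*G;
rms89 = sqrt(lfl89.^2 + lfr89.^2);

% 1996 AES elements
lfl96 = cosd(cs);
lfr96 = -sind(cs);
rms96 = sqrt(lfl96.^2 + lfr96.^2);

% Level error in dB at full center steering (cs = 45 degrees)
err89_dB = 20*log10(rms89(end));   % about -3 dB
err96_dB = 20*log10(rms96(end));   % 0 dB to rounding error
```

In this sketch the ’89 elements fall by about 3dB at full center steering, while the sine and cosine elements hold the sum of the squares at one.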

Improvements to the left front matrix elements

In March of 1997 we made several changes to these matrix elements. We kept the basic functional dependence, but added an additional boost along the cs axis in the front, and added a cut along the cs axis in the rear. The reason for the boost was to improve the performance with stereo music that was panned forward. The purpose of the cut in the rear was to increase the separation between the front channels and the rear channels when stereo music was panned to the rear.

For the front left quadrant

LFL = (cos(cs) + 0.41*G(lr))*boost1(cs)

LFR = (-sin(cs))*boost1(cs)

For the right front quadrant

LFL = (cos(cs) )*boost1(cs)

LFR = (-sin(cs))*boost1(cs)

For the left rear quadrant

LFL = (cos(-cs) + 0.41*G(lr))/boost(-cs)

LFR = (sin(-cs))/boost(-cs)

For the right rear quadrant

LFL = (cos(-cs))/boost(-cs)

LFR = (sin(-cs))/boost(-cs)

The function G(x) is the same as the one in the ’89 patent. When expressed with angles as an input, it can be shown to be equal to G(x) = 1-tan(45-x).
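As an illustration, the two descriptions of G can be compared numerically. The sketch assumes that the left/right steering angle of a single signal between left and center is lr = 45 - atan(|r|/|l|) in degrees, which is what the two expressions imply when taken together; this paper does not state that relation directly, and the names are hypothetical.

```matlab
% Sketch (assumed helper): G as a function of a steering angle in degrees.
G = @(x) 1 - tand(45 - x);

% Input amplitudes for a signal between left and center (|r| <= |l|).
l = 1.0;
r = [0 0.25 0.5 0.75 1.0];

% Assumed relation between amplitude ratio and steering angle.
lr = 45 - atand(r ./ l);

g_angle = G(lr);            % G from the angle form
g_ratio = 1 - r ./ l;       % G from the amplitude form
max_diff = max(abs(g_angle - g_ratio));
```

Under that assumption the two forms agree to rounding error, with G running from 0 at lr = 0 to 1 at lr = 45 degrees.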

The function boost1(cs) as used in March 1997 was a linear boost of 3dB total applied over the first 22.5 degrees of steering, decreasing back to 0dB in the next 22.5 degrees. The function boost(cs) rises in the same way but then stays at 3dB. In the Matlab code below boost(cs) is given by corr(x) and boost1(cs) by corr1(x).

% calculate a boost function of +3dB at 22.5 degrees
% corr(x) goes up 3dB and stays up. corr1(x) goes up then down again
for x = 1:24 % x has values of 1 to 24
    corr(x) = 10^(3*(x-1)/(23*20)); % go up 3dB over this range
    corr1(x) = corr(x);
end
for x = 25:46 % go back down for corr1 over this range
    corr(x) = 1.41;
    corr1(x) = corr(48-x);
end

See Figure 5.

The performance of the March circuit can be improved. The first problem is in the behavior of the steering along the boundaries between left and center, and between right and center. As a strong single signal pans from the left to the center, it can be seen in figure 5 that the value of the lfl matrix element increases to a maximum half-way between left and center. This increase in value is an unintended consequence of the deliberate increase in level for the left and right main outputs as a center signal is added to stereo music.

When a stereo signal is panned forward it is desirable that the left and right front outputs should rise in level to compensate for the removal by the matrix of the correlated component from these outputs. However the method used to increase level under these conditions should only occur when the lr component of the inputs is minimal – that is when there is no net left or right steering. The method chosen to implement this increase in March of 1997 was independent of the value of lr, and resulted in an increase in level when a strong signal panned across the boundary.

The boost is only needed along the lr=0 axis. When lr is nonzero the matrix element should not be boosted. This problem can be solved by adding a term to the matrix elements instead of multiplying them. We define a new steering index, the boundary limited cs value, with the following Matlab code:

Assume both lr and cs > 0 – we are in the left front quadrant

(assume cs and lr follow the Matlab conventions of varying from 1 to 46)

% find the bounded c/s
if (cs < 24)
    bcs = cs-(lr-1);
    if (bcs < 1) % this keeps bcs from falling below 1
        bcs = 1;
    end
else
    bcs = 47-cs-(lr-1);
    if (bcs < 1)
        bcs = 1;
    end
end

If cs < 22.5 and lr = 0 (in Matlab convention, cs < 24 and lr = 1), bcs is equal to cs. However, as lr increases bcs will decrease to zero (1 in Matlab convention). If cs > 22.5, bcs also decreases as lr increases.
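The same limit can be written compactly. The following is an equivalent sketch rather than the code used in the product; the function handle name is hypothetical, and cs and lr follow the 1 to 46 convention above.

```matlab
% Sketch: bounded c/s value in the left front quadrant (indices 1..46).
% min(cs, 47 - cs) rises up to cs = 23 and then falls, which reproduces
% the two branches of the if/else form.
bounded_cs = @(cs, lr) max(1, min(cs, 47 - cs) - (lr - 1));

b1 = bounded_cs(10, 1);    % lr = 0 degrees: bcs equals cs
b2 = bounded_cs(10, 5);    % increasing lr reduces bcs
b3 = bounded_cs(10, 20);   % clamped at the lower limit of 1
b4 = bounded_cs(30, 1);    % past the midpoint: 47 - 30
```

A single expression of this kind is convenient when the whole plane is evaluated at once, since max and min also accept arrays.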

Now to find the correction function needed, we find the difference between the boosted matrix elements and the non-boosted ones, along the lr=0 axis. We call these differences cos_tbl_plus and sin_tbl_plus. (This code is written in Matlab, where variables are multivalued vectors. It takes some getting used to.)

a = 0:45; % define a vector in one degree steps. a has the values of 0 to 45 degrees
a1 = 2*pi*a/360; % convert to radians

% now define the sine and cosine tables, as well as the boost tables for the front
sin_tbl = sin(a1);
cos_tbl = cos(a1);

cos_tbl_plus = cos(a1).*corr1(a+1);
cos_tbl_plus = cos_tbl_plus-cos_tbl; % this is the one we use
cos_tbl_minus = cos(a1)./corr(a+1);

sin_tbl_plus = sin(a1).*corr1(a+1);
sin_tbl_plus = sin_tbl_plus-sin_tbl; % this is the one we use
sin_tbl_minus = sin(a1)./corr(a+1);

sin_tbl_plus and cos_tbl_plus are the difference between a plain sine and cosine, and the boosted sine and cosine. We now define

LFL = cos(cs) + .41*G(lr) + cos_tbl_plus(bcs)

LFR = -sin(cs) -sin_tbl_plus(bcs)

LFL and LFR in the front right quadrant are similar, but without the +.41*G term. These new definitions lead to the matrix element in Figure 6.

The steering in the rear quadrant is not optimal either. When the steering is toward the rear the above matrix elements are given by:

LFL = cos_tbl_minus(-cs) + .41*G(lr)

LFR = sin_tbl_minus(-cs)

These matrix elements are very nearly identical to the elements in the ’89 patent. Consider the case when a strong signal pans from left to rear. The ’89 elements were designed so that there is complete cancellation of the output from the front left output only when this signal is fully to the rear (cs = -45, lr = 0). However in a Logic 7 decoder it would be desirable that the output from the left front output should be zero when the encoded signal reaches the left rear direction (cs = -22.5 and lr = 22.5). The left front output should remain at zero as the signal pans further to full rear. The matrix elements used in March 1997 (the ones above) result in the output in the front left channel being about -9dB when a signal is panned to the left rear position. This level difference is sufficient for good performance of the matrix, but it is not as good as it could be.

This performance can be improved by altering the LFL and LFR matrix elements in the left rear quadrant. Notice that here we are concerned with how the matrix elements vary along the boundary between left and rear. The mathematical method given in the AES paper can be used to find the behavior of the elements along the boundary. Let us assume that the amplitude of the left front output should decrease with the function F(t) as t varies from 0 (left) to –22.5 degrees (left rear). The method gives the matrix elements

LFL = cos(t)*F(t) -+ sin(t)*(sqrt(1-F(t)^2))

LFR = -(sin(t)*F(t) +- cos(t)*(sqrt(1-F(t)^2)))

If we choose F(t) = cos(4*t) and choose the correct sign, these simplify to

LFL = cos(t)*cos(4*t)+sin(t)*sin(4*t)

LFR = -(sin(t)*cos(4*t)-cos(t)*sin(4*t))

See Figure 7.
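These expressions can be checked numerically. The sketch below only verifies the algebra: with F(t) = cos(4*t) the elements keep a constant sum of squares, and by the angle-difference identities they reduce to cos(3*t) and sin(3*t). The variable names are illustrative.

```matlab
% Sketch: left front elements along the left-to-left-rear boundary,
% with t running from 0 (left) to -22.5 degrees (left rear).
t = 0:-0.5:-22.5;

lfl = cosd(t).*cosd(4*t) + sind(t).*sind(4*t);
lfr = -(sind(t).*cosd(4*t) - cosd(t).*sind(4*t));

% Sum of the squares, which the design goal holds at one.
sumsq = lfl.^2 + lfr.^2;

% The same elements in closed form, from the angle-difference identities.
err_lfl = max(abs(lfl - cosd(3*t)));
err_lfr = max(abs(lfr - sind(3*t)));
```

The closed forms make it plain that the element pair simply rotates three times as fast as the steering angle along this part of the boundary.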

These elements work fine – the front left output is reduced smoothly to zero as t varies from 0 to -22.5 degrees. We want the output to remain at zero as the steering continues from -22.5 degrees to -45 degrees (full rear). Along this part of the boundary,

LFL = -sin(t)

LFR = cos(t)

Note that these matrix elements are a far cry from the matrix elements along the lr=0 boundary, where in the AES paper the values were

LFL = cos(cs)

LFR = sin(cs)

We need a method of smoothly transforming the above equations into the equations along the boundary as lr and cs approach the boundary. A linear interpolation could be used. In the processor used in Lexicon products, where multiplies are expensive, a better strategy is to define a new variable – the minimum of lr and cs:

% new - find the boundary parameter
% (x and y hold the lr and cs steering values)
bp = x;
if (bp > y)
    bp = y;
end

and a new correction function which depends on bp:

for x = 1:24
    ax = 2*pi*(46-x)/360;
    front_boundary_tbl(x) = (cos(ax)-sin(ax))/(cos(ax)+sin(ax));
end
for x = 25:46
    ax = 2*pi*(x-1)/360;
    front_boundary_tbl(x) = (cos(ax)-sin(ax))/(cos(ax)+sin(ax));
end

We then define LFL and LFR in this quadrant as:

LFL = cos(cs)/(cos(cs)+sin(cs)) - front_boundary_tbl(bp) + .41*G(lr)

LFR = sin(cs)/(cos(cs)+sin(cs)) + front_boundary_tbl(bp)

Note the correction of cos(cs)+sin(cs). When we divide cos(cs) by this factor we get the function 1-0.5*G(cs), which is the same as the Dolby matrix in this quadrant.
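The equivalence claimed here is easy to confirm. The sketch assumes G(x) = 1 - tan(45 - x) with x in degrees, as given earlier, and evaluates cs over 0 to 45 degrees; the names are illustrative.

```matlab
cs = 0:45;                              % steering magnitude in degrees
G  = 1 - tand(45 - cs);                 % assumed form of G(x)

% Cosine term divided by the correction, against 1 - 0.5*G(cs)
lhs_l = cosd(cs) ./ (cosd(cs) + sind(cs));
rhs_l = 1 - 0.5*G;

% Sine term divided by the correction, against 0.5*G(cs)
lhs_r = sind(cs) ./ (cosd(cs) + sind(cs));
rhs_r = 0.5*G;

err_l = max(abs(lhs_l - rhs_l));
err_r = max(abs(lhs_r - rhs_r));
```

Both differences are at rounding level in this sketch, so the sine term divided by the same factor is likewise 0.5*G(cs).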

In the right rear quadrant

LFL = cos(cs)/(cos(cs)+sin(cs))

LFR = sin(cs)/(cos(cs)+sin(cs))

See Figures 8 and 9

One of the major goals in the design of the Logic 7 matrix is that the loudness of unsteered material in any given output should be constant, regardless of the direction of a steered signal which is present at the same time. As explained previously, this means that the sum of the squares of the matrix elements for each output should be one, regardless of the steering direction. This requirement must be relaxed when there is strong steering in the direction of the output in question. That is, if we are looking at the left front output, the sum of the squares of the matrix elements must increase by 3dB when the steering goes full left.

We can test the success of our design by plotting the square root of the sum of the squares of the matrix elements.

See Figure 10 and Figure 11.

Rear matrix elements during front steering

The rear matrix elements in the ’89 patent are given by:

For the front left quadrant

LRL = .71 - .71*G(lr) + .41*.71*G(cs)

LRR = -.71 + .41*.71*G(cs)

For the rear left

LRL = .71*(1 - G(lr)+.41*G(-cs))

LRR = .71*(1 + .41*G(-cs))

(the right half of the plane is identical but switches lrl and lrr.)

The rear matrix elements in the Dolby Pro-Logic are

For the front left quadrant

LRL = .71*(1 - G(lr) + .41*G(cs))

LRR = -.71*( 1 + .41*G(cs))

For the rear left

LRL = .71*(1 - G(lr))

LRR = -.71

(the right half of the plane is identical but switches lrl and lrr.)

A brief digression on the surround level in Dolby Pro-Logic

The Dolby elements are similar to our ’89 elements, but without the boost dependent on cs in the rear. This difference is in fact quite important. This paper somewhat disguises the way these decoders are actually used. We derive all the matrix elements with a relatively arbitrary scaling. In most cases the elements are presented as if they had a maximum value of 1.41. In fact, for technical reasons the matrix elements are all eventually scaled so they have a maximum value of less than one. In addition, when the decoder is finally put to use, the gain of each output to the loudspeaker is adjusted. To adjust them, a signal which has been encoded from the four major directions – left, center, right, and surround – with equal sound power is played, and the gain of each output is adjusted until the sound power is equal in the listening position. In practice this means that the actual level of the matrix elements is scaled so the four outputs of the decoder are equal under conditions of full steering.

The lack of a level boost in the rear direction in the Dolby decoder means that during the calibration procedure the gain of the rear outputs will be raised by 3dB relative to the other outputs. In fact, for the Dolby decoder in practice:

LRL = 1 - G(lr)

LRR = -1

The difference is not trivial. When the front elements are scaled so they have a maximum value of one when there is full steering in one direction, we find that during UNSTEERED conditions the elements from the ’89 patent have the value 0.71, and the sum of the squares of the elements has the value of one. This is NOT true of the Dolby rear elements when calibrated. LRL has the unsteered value of one, and the sum of the squares is 2, or 3dB higher than the ’89 outputs. Note that the calibration procedure results in a matrix that does NOT correspond to a "Dolby Surround" passive matrix when the matrix is unsteered. The Dolby Surround passive matrix specifies that the rear output should have the value of .71*(Ain - Bin), and the Pro-Logic matrix does not meet this specification. A result is that when the A and B inputs are decorrelated the rear output will be 3dB stronger than the others. If there are two speakers sharing the rear output each will be adjusted 3dB softer than a single rear speaker, which will make all five speakers have approximately equal sound power when the decoder inputs are uncorrelated. When the matrix elements from the ’89 patent are used, the same calibration procedure results in 3dB less sound power from the rear when the decoder inputs are uncorrelated.
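The 3dB figure follows from the unsteered element values quoted above. A minimal arithmetic sketch, with illustrative variable names:

```matlab
% Unsteered rear elements, with the front elements scaled to a maximum of one.
lrl89 = 0.71;   lrr89 = -0.71;      % '89 patent elements
lrlDolby = 1;   lrrDolby = -1;      % Dolby elements after calibration

p89    = lrl89^2 + lrr89^2;         % about 1
pDolby = lrlDolby^2 + lrrDolby^2;   % 2

diff_dB = 10*log10(pDolby / p89);   % about 3 dB
```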

The issue of how loud the rear channels should be when the inputs are decorrelated ends up as a matter of taste. When a surround encoded recording is being played one would like to have the balance as the producer intended. However with standard stereo material one must guess. After much listening it seems best if all speakers have equal loudness for uncorrelated inputs when surround material is being played. When stereo material is being played, it may be better to have the surrounds about 3dB lower. The latest Logic7 design attempts to distinguish a surround recording from a stereo recording, and adjust itself accordingly. This circuit works surprisingly well. The reduction in the level of the rear elements during neutral steering with non-surround material is accomplished by what is referred to in this paper as the "tv matrix" correction. The terminology follows the convention in our March 1997 software. In the latest software this correction is made a user adjustable variable, called "soundstage". It can be either "front" (-6dB correction), "neutral" (-3dB correction), or "rear" (0dB correction).

However, the issue of rear loudness also affects the balance between the center (vocals and dialog) and other elements in the mix. To see the importance of this issue, consider what happens when we have an input to the decoder that consists of three components, an uncorrelated left and right component, and a separate and uncorrelated center component.

Ain = Lin + .71*Cin

Bin = Rin + .71*Cin

When Ain and Bin are played through a conventional stereo system, the sound power in the room will be proportional to Lin^2 + Rin^2 + Cin^2. If all three components have roughly equal amplitudes, the power ratio of the center component to the left plus right component will be 1:2.
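The power ratio can be illustrated with three test signals that are exactly uncorrelated over the analysis window. Sinusoids at different frequencies are used here purely for convenience; they are an assumption of this sketch, not part of the original analysis.

```matlab
n = 0:999;                              % one analysis window
Lin = sqrt(2)*sin(2*pi*3*n/1000);       % unit-power, mutually orthogonal
Rin = sqrt(2)*sin(2*pi*5*n/1000);
Cin = sqrt(2)*sin(2*pi*7*n/1000);

Ain = Lin + sqrt(0.5)*Cin;              % 0.71 is taken as 1/sqrt(2)
Bin = Rin + sqrt(0.5)*Cin;

p_total  = mean(Ain.^2) + mean(Bin.^2); % equals Lin^2 + Rin^2 + Cin^2
p_center = mean(Cin.^2);                % power of the center component
p_sides  = mean(Lin.^2) + mean(Rin.^2); % power of left plus right
ratio    = p_center / p_sides;          % center to left-plus-right ratio
```

With equal component amplitudes the total power is the sum of the three component powers, and the center accounts for one part against two for left plus right.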

We would like our decoder to reproduce sound power in the room with approximately the same power ratio as stereo, regardless of the power ratio of Cin to Lin and Rin. We can express this mathematically. Essentially the equal power ratio requirement will specify the functional form of the center matrix elements along the cs axis, if all the other matrix elements are taken as given. If we assume the Dolby matrix elements, calibrated such that the rear sound power is 3dB less than the other three outputs when the matrix is fully steered – i.e. 3dB less than the standard calibration, then the center matrix elements should have the shape shown in Figure 12. We can do the same thing for the standard calibration, and the results in Figure 13 emerge.

These two figures show something mix engineers are often aware of – namely that a mix prepared for playback on a Dolby Pro-Logic system can need more center loudness than a mix prepared for playback in stereo. Conversely, a mix prepared for stereo will lose vocal clarity when played over a Pro-Logic decoder. Ironically, this is not true of a passive Dolby Surround decoder.

Creating two independent rear outputs

The major problem with both the ’89 elements and the Dolby elements is that there is only a single rear output. The ’91 patent disclosed a method for creating two independent side outputs, and the math in that patent was incorporated in the front left quadrant in the AES paper of 1996. The goal of the elements in this quadrant was to eliminate the output of a signal steered from left to center, while maintaining some output from the left rear channel for unsteered material present at the same time. To achieve this goal we assumed that the LRL matrix element would have the following form:

For the left front quadrant

LRL = 1 - GS(lr) - 0.5*G(cs)

LRR = -0.5*G(cs) - G(lr)

As can be seen, these matrix elements are very similar to the ’89 elements, but with the addition of a G(lr) term in LRR, and a GS term in LRL. G(lr) was included to add signals from the B input channel of the decoder to the left rear output, to provide some unsteered signal power as the steered signal was being removed. We then solved for the function GS(lr), using the criterion that there should be no signal output with a fully steered signal moving from left to center. The formula for GS(lr) turned out to be equal to G^2(lr), although a more complicated representation of the formula is given in the ’91 patent. The two representations can be shown to be identical.
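The result GS(lr) = G^2(lr) can be checked along the boundary between left and center. The sketch below assumes a single steered signal with left and right amplitudes l and r, steering angles lr = 45 - atan(r/l) and cs = 45 - lr in degrees, and G(x) = 1 - tan(45 - x); these relations are inferred from the definitions earlier in the paper, and the names are hypothetical.

```matlab
G = @(x) 1 - tand(45 - x);

r  = 0:0.05:1;              % right amplitude, with left amplitude l = 1
l  = ones(size(r));
lr = 45 - atand(r ./ l);    % assumed left/right steering angle
cs = 45 - lr;               % on the boundary, |lr| + |cs| = 45

lrl = 1 - G(lr).^2 - 0.5*G(cs);     % GS(lr) taken as G(lr)^2
lrr = -0.5*G(cs) - G(lr);

out = lrl.*l + lrr.*r;      % left rear output for the steered signal
max_out = max(abs(out));
```

Under these assumptions the left rear output cancels to rounding error at every point between full left and center.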

In the AES paper these elements are corrected by being given a boost of (sin(cs)+cos(cs)) to make them closer to constant loudness for unsteered material. While completely successful in the right front quadrant, the correction is not very successful in the left front quadrant. See Figure 14.

For the right front quadrant the matrix elements are identical to the rear elements in the ’89 patent. The corrected elements as in the AES paper were used in the version of March 1997.

First consider the dip in the sum of the squares along the cs=0 axis. This dip exists because of the use of G(lr) in LRR. This choice was entirely arbitrary – although it makes implementation in analog circuitry easy. Ideally we would like to have a function GR(lr) in this equation, and choose GS(lr) and GR(lr) in such a way as to keep the sum of the squares of LRL and LRR constant along the cs=0 axis, and keep the output zero along the boundary between left and center. This can be done. We would also like to be sure the matrix elements are identical to the matrix elements in the right front quadrant along the lr=0 axis. Thus we assume

LRL = cos(cs) – GS(lr)

LRR = -sin(cs) – GR(lr)

We want the sum of the squares to be one along the cs=0 axis,

(1-GS(lr))^2 + (GR(lr))^2 = 1

and the output to be zero to a steered signal, or as t varies from zero to 45 degrees,

LRL*cos(t) + LRR*sin(t) = 0
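The two conditions above can be checked numerically. The Python sketch below is only an illustration, and it rests on one assumption that is not stated in this section: along the left-to-center boundary a fully steered signal at angle t has cs = t and lr = 45 - t. Under that assumption, writing 1 - GS = cos(phi) and GR = sin(phi) satisfies the unit-power condition automatically, and the zero-output condition reduces to cos(phi + t) = cos(t) - cos(2t). The function name solve_gs_gr is hypothetical.

```python
import math

def solve_gs_gr(lr_deg):
    """Return (GS, GR) for a steering angle lr in degrees, 0 <= lr <= 45.

    ASSUMPTION (not from the paper's text here): on the left-to-center
    boundary cs = t and lr = 45 - t for a fully steered signal at angle t.
    """
    t = math.radians(45.0 - lr_deg)
    # Unit power along cs=0: put 1 - GS = cos(phi), GR = sin(phi).
    # Zero output on the boundary then gives cos(phi + t) = cos(t) - cos(2t).
    phi = math.acos(math.cos(t) - math.cos(2.0 * t)) - t
    return 1.0 - math.cos(phi), math.sin(phi)

for lr in (0.0, 15.0, 30.0, 45.0):
    gs, gr = solve_gs_gr(lr)
    print(f"lr={lr:5.1f}  GS={gs:.4f}  GR={gr:.4f}")
```

With this parameterization both functions run from zero at lr = 0 to one at lr = 45 degrees, which is the behavior one would expect from the description of Figure 15.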

These two equations result in a messy quadratic equation for GR and GS, which is solved numerically in Figure 15. Use of GS and GR as shown results in a large improvement along the lr=0 axis, as intended. However, the peak in the sum of the squares along the boundary between left and center remains. In a practical design it is probably not very important to compensate for this error, but we can attempt to do so with the following strategy. We will divide both matrix elements by a factor, which depends on a new combined variable based on lr and cs. Call the new variable xymin. (In practice the divide can be replaced by a multiply by the inverse of the factor described below.)

% find the minimum of x or y

xymin = x;

if (xymin > y)

xymin = y;

end

if (xymin > 23)

xymin = 23;

end

Note that xymin varies from zero to 22.5 degrees. If we multiply it by four, it will vary from zero to 90 degrees, and can be used below.

In the front left quadrant

LRL = (cos(cs) – GS(lr))/(1+.29*sin(4*xymin))

LRR = (-sin(cs) – GR(lr))/(1+.29*sin(4*xymin))

In the front right quadrant,

LRL = cos(cs)

LRR = -sin(cs)
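The divisor used in the front left quadrant can be sketched in Python as follows. This is an illustration only: it assumes that x and y in the MATLAB fragment correspond to lr and cs in degrees, and it clamps at 22.5 degrees rather than at an integer table index. The name xymin_divisor is hypothetical.

```python
import math

def xymin_divisor(lr_deg, cs_deg):
    """Divisor 1 + .29*sin(4*xymin) applied to LRL and LRR.

    ASSUMPTION: xymin is the smaller of lr and cs, limited to 22.5 degrees.
    """
    xymin = min(lr_deg, cs_deg, 22.5)
    return 1.0 + 0.29 * math.sin(math.radians(4.0 * xymin))

# The divisor is 1 on either axis and largest where lr and cs are both
# 22.5 degrees, i.e. halfway along the left-to-center boundary.
print(xymin_divisor(45.0, 0.0), xymin_divisor(22.5, 22.5))
```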

In Figure 16 these matrix elements are multiplied by the "tv matrix" correction. We will call the correction for TV Matrix tvcorr(|lr|+|cs|). Tvcorr(|lr|+|cs|) is -3dB at zero, and 1 when the argument is 22.5 degrees and higher.

As explained in the previous paper, these elements are additionally multiplied by the "tv matrix" correction, which reduces the amplitude when the steering is near the middle. This factor shows up in the figure below as a valley centered on zero steering.

As of 7/31/97 the "tv matrix" correction has been modified to depend only on the absolute value of lr when cs is frontal. This will cause the surface above to remain at .71 along the lr=0 axis in the front. In this front direction the correction for TV Matrix becomes tvcorr(|lr|). Tvcorr(|lr|) is -3dB at zero, and 1 when |lr| is 22.5 degrees and higher.
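The text fixes only the two ends of tvcorr: -3dB at zero and unity gain from 22.5 degrees upward. A minimal Python sketch follows, assuming a ramp that is linear in dB between those two points. The ramp shape is an assumption, and the parameter names are hypothetical.

```python
def tvcorr(angle_deg, floor_db=-3.0, knee_deg=22.5):
    """TV Matrix gain correction as a linear amplitude factor.

    ASSUMPTION: the gain rises linearly in dB from floor_db at 0 degrees
    to 0 dB at knee_deg, and stays at 0 dB above that.
    """
    a = min(abs(angle_deg), knee_deg)
    gain_db = floor_db * (1.0 - a / knee_deg)
    return 10.0 ** (gain_db / 20.0)

# Frontal steering after 7/31/97 would use tvcorr(abs(lr)); the other
# quadrants described in the text use tvcorr(abs(lr) + abs(cs)).
print(round(tvcorr(0.0), 3), tvcorr(22.5), tvcorr(40.0))
```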

 

The rear matrix elements during rear steering

The rear matrix elements given in the ’91 patent were not appropriate to a 5 channel decoder, and were modified heuristically in our CP-3 product. The AES paper presented a mathematical method to derive these elements along the boundary of the left rear quadrant. The method worked along the boundary, but resulted in discontinuities along the lr=0 axis, and along the cs=0 axis. In March of 1997 these discontinuities were repaired (mostly) by additional corrections to the matrix elements, which preserved their behavior along the steering boundaries.

For the new elements described here these errors have been corrected, first by using an interpolation along the cs=0 boundary for LRL, where the value is made to match the value of GS(lr) when cs is zero, and smoothly rises to the value given by the previous math as cs increases negatively toward the rear. In the newest software LRR interpolates along the cs=0 axis to GR(lr).

Left side/rear outputs during rear steering from Right to Right Rear

Let’s consider first the Left Rear Left and Left Rear Right matrix elements when the steering is neutral or anywhere between full right and right rear. That is, lr can vary from 0 to –45 degrees, and cs can vary from 0 to –22.5 degrees.

Under these conditions the steered component of the input should be removed from the left outputs - there should be no output from the rear left channel when the steering is toward the right or right rear.

The matrix elements given in the ’91 patent achieve this goal. They are essentially the same as the rear matrix elements in the 4 channel decoder, with the addition of the sin(cs)+cos(cs) correction for the unsteered loudness. When this is done the matrix elements are simple. We will define two new functions, which are simply equal to the sine and cosine of cs over this range.

LRL = cos(-cs) = sri(-cs)

LRR = sin(-cs) = sric(-cs)

To complete LRL and LRR over the range of cs = 0 to -22.5 we must add a gain reduction for the "TV Matrix" mode. Once again in the "TV Matrix" mode we desire 3dB less output when steering is neutral, but rising to full value when the steering is more than 22.5 degrees to the rear. Performance is improved by making this reduction sensitive to the sum of |lr| and |cs|. This is achieved in the current design by reducing both the RRR and RRL elements by 3dB when the sum of |lr| and |cs| is zero, and raising them back to their original values as the sum goes to 22.5 degrees. Once again the slope of this gain change is relatively arbitrary, as long as both RRR and RRL are altered in the same way. We will call the correction for TV Matrix tvcorr(|lr|+|cs|). Tvcorr(|lr|+|cs|) is -3dB at zero, and 1 when the argument is 22.5 degrees and higher.

LRL = cos(-cs)*tvcorr(|lr|+|cs|) = sri(-cs)*tvcorr(|lr|+|cs|)

LRR = sin(-cs)*tvcorr(|lr|+|cs|) = sric(-cs)*tvcorr(|lr|+|cs|)

Notice we have defined a new function sric(x), which is equal to sin(x) over the range of 0 to 22.5 degrees, and sri(x), which is equal to cos(x). We will use these functions again in defining the Left Rear matrix elements during Left steering.

Left side/rear outputs during rear steering from Right Rear to Rear

Now consider the same matrix elements as cs moves past -22.5 degrees toward full rear (-45 degrees). As we said in the AES paper and the two patent applications, LRL should rise to one or more over this range, and LRR should decrease to zero. Simple functions fulfill this:

(remember cs is negative)

LRL = (cos(45+cs) + rboost(-cs)) = (sri(-cs) + rboost(-cs))

LRR = sin(45+cs) = sric(-cs)

The Left Rear matrix elements during right steering are now complete.

The Left Rear elements during steering from left to left rear

The behavior of the Left Rear Left and Left Rear Right elements is much more complex. The Left Rear Left element must quickly rise from zero to near maximum as lr decreases from 45 to 22.5 or to zero. The matrix elements given in the AES paper perform this, but as we showed earlier, there are problems with continuity at the cs = 0 boundary.

For the March 1997 release a solution was found which uses functions of one variable and several conditionals. In the AES paper the problem at the cs = 0 boundary arises because on the forward side of the boundary the LRL matrix element is given by GS(lr). On the rear side the function given by the AES paper has the same end points, but is different in-between.

The mathematical method in the AES paper provides the following equations for the Left Rear matrix elements over the range 22.5 < lr < 45: (remember that t = 45-lr)

LRL = cos(45-lr)*sin(4*(45-lr))-sin(45-lr)*cos(4*(45-lr)) = sra(lr)

LRR = -(sin(45-lr)*sin(4*(45-lr))+cos(45-lr)*cos(4*(45-lr))) = -srac(lr)

If cs <= 22.5, lr can still vary from 0 to 45. For the range 0 < lr < 22.5 the AES paper defines LRL and LRR as follows (see Figure 6 in the AES paper):

LRL = cos(lr) = sra(lr)

LRR = -sin(lr) = -srac(lr)

The two functions sra(x) and srac(x) are thus defined for 0 < lr < 45.
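The two pieces of sra and srac can be written compactly, since by the angle-difference identities the expressions for 22.5 < lr < 45 are sin(3t) and cos(3t) with t = 45 - lr. The Python sketch below is an illustration of the definitions above and shows that the pieces meet at lr = 22.5 degrees. Assigning the break point itself to the lower piece is an arbitrary choice here.

```python
import math

def sra(lr_deg):
    """sra(lr): cos(lr) up to 22.5 degrees, then sin(3*(45 - lr))."""
    if lr_deg <= 22.5:
        return math.cos(math.radians(lr_deg))
    t = math.radians(45.0 - lr_deg)
    # cos(t)*sin(4t) - sin(t)*cos(4t) = sin(4t - t)
    return math.sin(3.0 * t)

def srac(lr_deg):
    """srac(lr): sin(lr) up to 22.5 degrees, then cos(3*(45 - lr))."""
    if lr_deg <= 22.5:
        return math.sin(math.radians(lr_deg))
    t = math.radians(45.0 - lr_deg)
    # sin(t)*sin(4t) + cos(t)*cos(4t) = cos(4t - t)
    return math.cos(3.0 * t)

for lr in (0.0, 22.5, 30.0, 45.0):
    print(lr, round(sra(lr), 4), round(srac(lr), 4))
```

Note that sra(lr)^2 + srac(lr)^2 = 1 over the whole range, so this pair on its own carries constant power.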

March 1997

In March 1997 the following technique is used to fix the discontinuity across the cs=0 boundary. In the AES paper near cs=0 LRL and LRR are both functions of a single variable. To fix the lack of continuity along the cs=0 boundary we add a function of a composite of lr and cs. The new variable is lr_bounded, the bounded difference between lr and cs. The definition of this variable is sufficiently complicated that I will present it in MATLAB.

lr_bounded = lr - cs; % find the difference

if (lr_bounded < 0) % keep the difference only if lr > cs

lr_bounded = 0;

end

if (45-abs(cs) < lr_bounded) % use the smaller of the two values

lr_bounded = 45-abs(cs);

end

We define a new function which is equal to the difference between sra(x) and (1 - GSL(x)) when cs=0. This is rear_active_correct(lr_bounded).

For 0 < x < 45

rear_active_correct(x) = sra(x) – (1 - GSL(x))

LRL = (sri(cs) + sra(lr) - rear_active_correct(lr_bounded) -1 ) * tvcorr(|lr|+|cs|)

The important point about this method is that it works when lr < 22.5, but it does not work when lr is larger. A better technique, which is used in the newest versions, is the interpolation technique, which is used for LRR.

The March 1997 version uses an interpolation technique to find LRR. Here there are two discontinuities. Along the cs=0 boundary LRR in the rear must match the LRR for the forward direction, which shows LRR = -G(lr) along the cs=0 boundary.

The choice used in March 1997 - although somewhat computationally intensive - is to employ an interpolation based on the value of cs over the range of 0 to 15 degrees. In other words, when cs is zero we employ G(lr) to find LRR. As cs increases to 15 degrees we interpolate to the value of srac(lr).

There is also the possibility of a discontinuity along the lr=0 axis. We can solve this by adding a term to LRR, which is found by using cs_bounded. The term is simply sric(cs_bounded). This term will ensure continuity across the lr=0 axis.

First define cs_bounded

cs_bounded = lr - cs;

if (cs_bounded <1) % this limits the minimum value

cs_bounded = 0;

end

if (45-abs(lr) < cs_bounded) % use the smaller of the two values

cs_bounded = 45-abs(lr);

end

for cs = 0 to 15

LRR = (-(srac(lr) + (G(lr)-srac(lr))*(15-cs)/15) + sric(cs_bounded))*tvcorr(|lr|+|cs|);

for cs = 15 to 22.5

LRR = (-srac(lr) + sric(cs_bounded))*tvcorr(|lr|+|cs|);
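The interpolation described above, from the value required at the cs=0 boundary to the rear value over the first 15 degrees of rear steering, is an ordinary linear blend. A minimal Python sketch follows. The helper name blend_over_cs and its arguments are hypothetical, and the two end values are passed in rather than computed, since G(lr) is defined elsewhere in the paper.

```python
def blend_over_cs(value_at_cs0, value_rear, cs_deg, span_deg=15.0):
    """Linear blend in cs between a boundary value and a rear value.

    Returns value_at_cs0 when cs = 0 and value_rear once |cs| >= span_deg.
    """
    w = max(0.0, min(1.0, (span_deg - abs(cs_deg)) / span_deg))
    return value_rear + (value_at_cs0 - value_rear) * w

# Example with made-up end values: 0.3 on the cs=0 boundary, 0.8 in the rear.
for cs in (0.0, 7.5, 15.0, 22.5):
    print(cs, blend_over_cs(0.3, 0.8, cs))
```

Because the weight reaches zero exactly at 15 degrees, the blended value joins the unblended expression used from 15 to 22.5 degrees without a step.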

LRL as implemented in the Logic 7 as of 8/97

In the new system LRL is computed with interpolation, just as LRR is:

for cs = 0 to 15

LRL = ((sra(lr) + (GS(lr)-sra(lr))*(15-cs)/15) + sri(-cs))*tvcorr(|lr|+|cs|);

for cs = 15 to 22.5

LRL = (sra(lr) + sri(-cs))*tvcorr(|lr|+|cs|);

Rear outputs during steering from Left Rear to Full Rear

As the steering goes from left rear to full rear the elements follow the ones given in the AES paper, with the addition of the corrections for rear loudness.

For cs > 22.5, lr < 22.5

LRL = (sra(lr) + sri(cs) + rboost(cs))

LRR = -srac(lr) + sric(cs_bounded)

 

This completes the LRL and LRR matrix elements during left steering. The values for right steering can be found by swapping left and right in the definitions.

Center matrix elements

The center matrix elements in v1.11 differ substantially from the center elements in the July 1996 patent application. The ’89 patent and Pro-Logic have the following matrix elements.

For front steering

CL = 1 + .41*G(cs) - G(lr)

CR = 1 + .41*G(cs)

For rear steering

CL = 1 - G(lr)

CR = 1

Since the matrix elements have symmetry about the left/right axis, the values of CL and CR for right steering can be found by swapping CL and CR. See Figure 17.

In the July 1996 application these elements are replaced by sines and cosines.

For front steering

CL = cos(45-lr)*sin(2*(45-lr))-sin(45-lr)*cos(2*(45-lr)) + .41*G(cs)

CR = sin(45-lr)*sin(2*(45-lr))+cos(45-lr)*cos(2*(45-lr))+ .41*G(cs)

These equations were never implemented. The March 1997 version is based on the steering in the ’89 patent, but with a different scaling, and a different function of cs. We found that it was important to reduce the unsteered level of the center output, and a value 4.5dB less than the Pro-Logic level was chosen. The boost function (.41*G(cs)) was changed to increase the value of the matrix elements back to the Pro-Logic value as cs increases toward center. The boost function in March of 1997 was chosen relatively arbitrarily.

In March 1997 the boost function of cs starts at zero as before, and rises with cs in such a way that CL and CR increase 4.5dB as cs goes from zero to 22.5 degrees. The increase is a constant number of dB for each dB of increase in cs. The function then changes slope, such that in the next 20 degrees the matrix elements rise another 3dB, and then hold constant. Thus when the steering is "half front" (8dB or 23 degrees) the new matrix elements are equal to the neutral values of the old matrix elements. As the steering continues to move forward, the new and the old matrix elements become equal.

The output of the center channel is thus 4.5dB less than the old output when steering is neutral, but rises to the old value when the steering is fully to the center. See Figure 18.

The solution for the center used in March 1997 is not optimal. Considerable experience with the decoder in practice has shown that the center portion of popular music recordings, and the dialog in some films, can tend to get lost when you switch between stereo (two channel) reproduction, and reproduction through the matrix. In addition, as the center channel changes in level a listener who is not equidistant from the front speakers can notice the apparent position of a center voice moving. This problem was extensively analyzed in developing the new matrix presented here. As we will see later, there is also a problem when a signal pans from left to center or from right to center along the boundary. The value above gives too low an output from the center speaker when the pan is half way between.

Center channel in the new design

The center channel output must be derived from the A and B inputs to the decoder. While it is possible to remove a strongly steered signal from the center channel output using matrix techniques, any time the steering is frontal but not biased either left or right, the center channel must reproduce the sum of the A and B inputs with some gain factor, either a boost or a cut. In other words it is not possible to remove uncorrelated left and right material from the center channel. Our only option is to regulate the loudness of the center speaker. How loud should it be?

This question depends on the behavior of the left and right main outputs. The matrix values presented above for LFL and LFR are designed to remove the center component of the input signals as the steering moves forward. We can show that if the input signal has been encoded forward with some kind of cross mixer, such as a stereo width control, the matrix elements given above (the ’89 elements, the 1996 AES paper elements, the March 1997 elements, and the ones presented earlier in this paper) all completely restore the original separation.

However, if the input to the decoder consists of uncorrelated left and right channels to which an unrelated center channel has been added

Ain = Lin + .71*Cin

Bin = Rin + .71*Cin

then as the level of Cin increases relative to Lin and Rin the C component of the L and R front outputs of the decoder is not completely eliminated unless Cin is large compared to Lin and Rin. In general there is a bit of Cin left in the L and R front outputs. What does a listener hear?

There are two ways of calculating what a listener hears. If a listener is EXACTLY equidistant from the Left, Right, and Center speakers they will hear the sum of the sound pressures from each speaker. This is equivalent to summing the three front outputs. Under these conditions it is easy to show that ANY reduction of the center component of the left and right speakers will result in a net loss of sound pressure from the center component, regardless of the amplitude of the center speaker. This is because the center speaker is always derived from the sum of the A and B inputs, and as its amplitude is raised the amplitude of the Lin and Rin signals must rise along with the amplitude of the Cin signal.

However if the listener is not equidistant from each speaker, the listener is much more likely to hear the sum of the sound power from each speaker, which is equivalent to the sum of the squares of the three front outputs. In fact, extensive listening has shown that the sum of the powers of ALL the speakers is actually what is important, so we must consider the sum of the squares of all the outputs of the decoder, including the rear outputs.

If we want to design the matrix so the ratios of the amplitudes of Lin, Rin, and Cin are preserved when switching between stereo reproduction and matrix reproduction, the sound power of the Cin component from the center output must rise in exact proportion to the reduction in its sound power from the left and right outputs, and its reduction in the rear outputs. An additional complication is that the left and right front outputs have the level boost of up to 3dB described above. This will cause the center to need to be somewhat louder to keep the ratios constant. We can write this requirement as a set of equations for the sound power. These equations can be solved for the gain function we need for the center speaker.

We previously gave graphs showing the energy relations for a Dolby Pro-Logic decoder under various conditions. The Pro-Logic decoder is not optimal. We can do the same for our new decoder. See Figure 19.

As can be seen, the needed rise in the level of the center channel is quite steep – the rise is many dB of amplitude per dB of steering value. This steep change in amplitude is audible in practice. Although the relative balance of the center channel information in a popular recording is well preserved, if one is standing close to the center speaker the sudden changes in level can be annoying. A larger problem is the very high value of center channel loudness. A consequence of this loudness is that the localization of all sources shifts strongly to the center. We tested this curve and found that the center loudspeaker dominated the front sound stage, and left-right separation was minimal.

There is a better solution. The center attenuation shown in Figure 19 is derived assuming the matrix elements previously given for LFL and LFR. What if we used different elements? Specifically, do we need to be aggressive about removing the center component from the left and right front outputs?

Listening tests show that the March 1997 elements are needlessly aggressive about removing the center component. Acoustically there is no need that they should do so. The energy removed from them must be given to the center loudspeaker. If we don’t remove this energy it comes from the left and right front speakers, and the sound power in the room is the same. The trick is to put just enough energy into the center speaker to fix the image there, while minimizing the effect on width.

We can find the optimal center attenuation by trial and error. We can then solve for the decrease in center level needed in the left and right front outputs to keep the power of the Cin component in the soundfield in the room constant.

Let’s assume the center channel is reduced in level by 4.5 dB below the level in our ’89 decoder, or –7.5dB total attenuation. –7.5dB equals 0.42. The matrix elements for the center can be multiplied by this factor, and a new center boost function (GC) can be defined.

For front steering

CL = .42 - .42*G(lr) + GC(cs)

CR = .42 + GC(cs)

For rear steering

CL = .42 - .42*G(lr)

CR = .42

Several functions were tried for GC(cs). The one given below may not be ideal, but seems good enough. It is specified in terms of the angle cs in degrees, and was obtained by some trial and error.

In MATLAB:

center_max = .65;

center_rate = .75;

center_max2 = 1;

center_rate2 = .3;

center_rate3 = .1;

if (cs < 12)

gc(cs+1) = .42*10^(db*center_rate/(20));

tmp = gc(cs+1);

elseif (cs < 30)

gc(cs+1) = tmp*10^((cs-11)*center_rate3/(20));

if (gc(cs+1) > center_max)

gc(cs+1) = center_max;

end

else

gc(cs+1) = center_max*10^((cs-29)*center_rate2/(20));

if (gc(cs+1) > center_max2)

gc(cs+1) = center_max2;

end

end

 

This function is plotted in Figure 20.
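For readers who want to tabulate the function, the following Python sketch mirrors the MATLAB fragment above for integer cs from 0 to 45 degrees. The fragment does not define db, so the sketch makes a labeled assumption: db is taken to be the steering value in dB that corresponds to the angle cs, computed as 20*log10(tan(45 + cs)). This is consistent with the statement that 8dB corresponds to about 23 degrees, but it is an assumption and not something the fragment states. The function names are hypothetical.

```python
import math

CENTER_MAX, CENTER_RATE = 0.65, 0.75
CENTER_MAX2, CENTER_RATE2, CENTER_RATE3 = 1.0, 0.3, 0.1

def steering_db(cs_deg):
    # ASSUMPTION: dB steering value for angle cs; not defined in the fragment.
    return 20.0 * math.log10(math.tan(math.radians(45.0 + cs_deg)))

def gc_table():
    """Center gain for cs = 0..45 degrees, following the three branches above."""
    gc = []
    tmp = 0.42
    for cs in range(46):
        if cs < 12:
            val = 0.42 * 10.0 ** (steering_db(cs) * CENTER_RATE / 20.0)
            tmp = val  # remember the last value of the first segment
        elif cs < 30:
            val = min(tmp * 10.0 ** ((cs - 11) * CENTER_RATE3 / 20.0), CENTER_MAX)
        else:
            val = min(CENTER_MAX * 10.0 ** ((cs - 29) * CENTER_RATE2 / 20.0), CENTER_MAX2)
        gc.append(val)
    return gc

table = gc_table()
print([round(table[cs], 3) for cs in (0, 11, 29, 45)])
```

Under this assumption the table starts at 0.42, rises without any downward step, holds at center_max in the upper part of the middle segment, and reaches center_max2 before full center steering.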

We can solve for the needed function for LFR if we assume functions for LFL, LRL, and LRR. We want to solve for the rate that the Cin component in the left and right outputs should decrease, and then design matrix elements, which provide this rate of decrease. These matrix elements should also provide some boost of the Lin and Rin components, and should have the current shape at the left to center boundary, as well as the right to center boundary.

We assume

LFL = GP(cs)

LFR = GF(cs)

CL = .42 - .42*G(lr) + GC(cs)

CR = .42 + GC(cs)

Power from the front left and right

PLR = (GP^2+GF^2)*(Lin^2+Rin^2) + (GP-GF)^2*Cin^2

Power from the center

PC = GC^2*(Lin^2+Rin^2) + 2*GC^2*Cin^2
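The two power expressions above follow from the rule that, for uncorrelated sources, the power of an output is the sum of the squared coefficients times the source powers. The Python sketch below is an illustrative check and makes two assumptions: the left front output is taken as GP*Ain - GF*Bin (the sign convention implied by the (GP-GF)^2 term), and the center mix-in gain .71 is treated as exactly sqrt(0.5). Function names are hypothetical.

```python
import math

def output_power(coefs, powers):
    """Power of a mix of uncorrelated sources: sum of coef^2 * source power."""
    return sum(c * c * p for c, p in zip(coefs, powers))

def front_and_center_power(gp, gf, gc, lin2, rin2, cin2):
    k = math.sqrt(0.5)  # ASSUMPTION: .71 in Ain = Lin + .71*Cin is sqrt(0.5)
    powers = (lin2, rin2, cin2)
    # Coefficients on (Lin, Rin, Cin) for each output.
    left = (gp, -gf, k * (gp - gf))     # GP*Ain - GF*Bin (assumed sign)
    right = (-gf, gp, k * (gp - gf))    # GP*Bin - GF*Ain
    center = (gc, gc, 2.0 * k * gc)     # GC*(Ain + Bin)
    plr = output_power(left, powers) + output_power(right, powers)
    pc = output_power(center, powers)
    return plr, pc

plr, pc = front_and_center_power(0.9, 0.2, 0.5, 1.0, 2.0, 3.0)
print(plr, pc)
```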

Power from the rears depends on the matrix elements we use. We will assume the rear channels are attenuated by three dB during forward steering, and that LRL is cos(cs) and LRR is sin(cs). From a single speaker

PREAR = (.71*(cos(cs)*(Lin+.71*Cin) – sin(cs)*(Rin + .71*Cin)))^2

If we assume Lin^2 ~= Rin^2,

For two speakers,

PREAR = .5*Cin^2*(cos(cs)-sin(cs))^2 + Lin^2

The total power from all the speakers is PLR + PC + PREAR

PT = (GP^2 + GF^2 + GC^2)*(Lin^2 + Rin^2) + ((GP – GF)^2 + 2*GC^2)*Cin^2 + PREAR

The ratio of Cin power to Lin and Rin power is: (assume Lin^2 = Rin^2)

RATIO = (((gp(cs)-gf(cs))^2 + 2*(gc(cs)^2) + .5*(cos(cs)-sin(cs))^2) * Cin^2) / ((2*(gp(cs)^2 + gc(cs)^2 + gf(cs)^2) + 1) * Lin^2)

RATIO = (Cin^2/Lin^2) * ((gp(cs)-gf(cs))^2 + 2*(gc(cs)^2) + .5*(cos(cs)-sin(cs))^2) / (2*(gp(cs)^2 + gc(cs)^2 + gf(cs)^2) + 1)

For normal stereo, GC=0, GP=1, and GF=0. The center to LR power ratio is then

RATIO = (Cin^2/Lin^2)*0.5

If this ratio is to be constant regardless of the value of Cin^2/Lin^2 for our active matrix,

(gp(cs)-gf(cs))^2 + 2*(gc(cs)^2) + .5*(cos(cs)-sin(cs))^2 = gp(cs)^2 + gc(cs)^2 + gf(cs)^2 + .5

 

The equation above can be solved numerically. If we assume the GC above and GP = LFL as before, we can see the result in Figure 21.
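If the constant-ratio condition is taken exactly as written, it can also be rearranged by hand: expanding (gp-gf)^2 and using (cos(cs)-sin(cs))^2 = 1 - sin(2*cs), the gp^2, gf^2 and .5 terms cancel, leaving 2*gp*gf = gc^2 - .5*sin(2*cs). The Python sketch below is an illustrative check of that rearrangement, not the procedure used to produce Figure 21. The function names are hypothetical and gp must be nonzero.

```python
import math

def gf_from_power_balance(gp, gc, cs_deg):
    """gf that satisfies the constant-ratio condition for given gp and gc."""
    cs = math.radians(cs_deg)
    return (gc * gc - 0.5 * math.sin(2.0 * cs)) / (2.0 * gp)

def balance_residual(gp, gf, gc, cs_deg):
    """Left side minus right side of the condition; zero when it holds."""
    cs = math.radians(cs_deg)
    lhs = (gp - gf) ** 2 + 2.0 * gc ** 2 + 0.5 * (math.cos(cs) - math.sin(cs)) ** 2
    rhs = gp ** 2 + gc ** 2 + gf ** 2 + 0.5
    return lhs - rhs

# Example with illustrative (not measured) values of gp and gc.
gf = gf_from_power_balance(1.0, 0.42, 10.0)
print(gf, balance_residual(1.0, gf, 0.42, 10.0))
```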

GF gives the shape of the LFR matrix element along the lr=0 axis, as cs increases from zero to center. We need a method of blending this behavior to that of the previous LFR element, which must be preserved along the boundary between left and center, as well as from right to center. A method of doing this when cs <= 22.5 degrees is to define a difference function between GF and sin(cs). We then limit this function in various ways.

gf_diff = sin(cs) - gf(cs);

for cs = 0:45;

if (gf_diff(cs) > sin(cs))

gf_diff(cs) = sin(cs);

end

if (gf_diff(cs) < 0)

gf_diff(cs) = 0;

end

end

% find the bounded c/s

if (y < 24)

bcs = y-(x-1);

if (bcs<1) % this limits the minimum value

bcs = 1;

end

else

bcs = 47-y-(x-1);

if (bcs <1) %> 46)

bcs = 1; %46;

end

end

The LFR element can now be written:

% this neat trick does an interpolation to the boundary

% the cost, of course, is a divide!!!

if (y < 23) % this is the easy way for half the region

lfr3d(47-x,47-y) = -sin_tbl(y)+gf_diff(bcs);

else

tmp = ((47-y-x)/(47-y))*gf_diff(y);

lfr3d(47-x,47-y) = -sin_tbl(y)+tmp;

end

Note that the sign of gf_diff is positive in the equation above. Thus gf_diff cancels the value of sin(cs), reducing the value of the element to zero along the first part of the lr=0 axis. See Figure 21.

Panning error in the center output

Figure 23 shows the new center left matrix element, using the new value of GC(cs). As it turns out, the new center function, if we write it this way:

CL = .42 - .42*G(lr) + GC(cs)

CR = .42 + GC(cs)

works well along the lr=0 axis, but causes a panning error along the boundary between left and center, and between right and center. The values in the AES paper of 1996 give a smooth function of cos(2*cs) along the left boundary, which creates smooth panning between left and center. We would like our new center function to have similar behavior along this boundary.

We can make a correction to the matrix element which will do the job by adding an additional function of xymin:

center_fix_tbl = .8*(corr1-1);

CL = .42 - .42*G(lr) + GC(cs) + center_fix_tbl(xymin)

CR = .42 + GC(cs) + center_fix_tbl(xymin)

See Figure 22.

Technical details of the encoder

There are two major goals of the Logic 7 encoder. First, it should be able to encode a 5.1 channel tape in a way that allows the encoded version to be decoded by a Logic 7 decoder with minimal loss. Second, the encoded output should be stereo compatible - that is, it should sound as close as possible to a manual two channel mix of the same material. One factor in this stereo compatibility should be that the output of the encoder, when played on a standard stereo system, should give identical perceived loudness for each sound source in an original 5 channel mix. The apparent position of the sound source in stereo should also be as close as possible to the apparent position in the 5 channel original.

In discussions with the IRT in Munich it became apparent that the goal of stereo compatibility of the stereo signal as described above cannot be met by a single adjustment of the encoder. A five channel recording where all channels have equal foreground importance must be encoded as described above. This encoding requires that surround channels be mixed into the output of the encoder in such a way that the energy is preserved. That is, the total energy at the output of the encoder should be the same, regardless of which input is being driven. This will include most film sources and 5 channel music sources where instruments have been assigned to all 5 loudspeakers. Although such music sources are not common at the present time, it is the author’s opinion that they will become common in the future. Music recordings where the foreground instruments are placed in the front three channels, with primarily reverberation in the rear channels, require a different encoding.

After a series of tests (at the IRT and elsewhere) it was determined that music recordings of this type were successfully encoded in a stereo compatible form when the surround channels were mixed with 3dB less power than the other channels. This -3dB level has been adopted as a standard for surround encoding in Europe, but the standard specifies that other surround levels can be used for special purposes. As we will see later, the new encoder contains active circuits which detect strong signals in the surround channels. When such signals are occasionally present, the encoder uses full surround level. If the surround inputs are consistently –6dB or less compared to the front channels, the surround gain is gradually lowered 3dB, to correspond to the European standard.

During tests with the Institute for Broadcast Technique (IRT) in Munich I found that a particular tape encoded incorrectly with the encoder described in the AES paper. A new architecture was developed to solve the problem with this tape. Although the particular tape was only marginally improved, the new encoder is clearly superior in its performance on a wide variety of difficult material. The original encoder was developed first as a passive encoder. It performed reasonably well with a variety of input signals. The new encoder will also work in a passive mode, but is primarily intended to work as an active encoder. The active circuitry corrects several small errors inherent in the design. However even without the active correction the performance is better than the previous encoder.

With extensive listening several small problems with the first encoder were discovered. Many (but not all) of these problems have been addressed in the new encoder. For example, when stereo signals are applied to both the front and the rear terminals of the encoder at the same time, the resulting encoder output is biased too far to the front. The new encoder compensates for this effect by increasing the rear bias slightly. Likewise, we have found that when a film is encoded with substantial surround content, there is a net rearward bias which can tend to reduce the signal power of dialog in the center channel. This can be important in a film, where dialog intelligibility is of paramount importance. The new encoder compensates for this effect by raising the center channel input to the encoder slightly under these conditions.

Explanation of the design:

The new encoder handles the left, center, and right signals identically to the previous design and identically to the Dolby encoder, provided the center attenuation function fcn is equal to 0.71, or -3dB.

The surround channels look more complicated than they are. The functions fc() and fs() direct the surround channels either to a path with a 90 degree phase shift relative to the front channels, or to a path with no phase shift. In the basic operation of the encoder fc is one, and fs is zero - that is, only the path which uses the 90 degree phase shifts is active.

The value crx is typically 0.38. It controls the amount of negative cross feed for each surround channel. As in the previous encoder, when there is only an input to one of the surround channels the A and B outputs have an amplitude ratio of -.38/.91, which results in a steering angle of 22.5 degrees to the rear. As usual, the total power in the two output channels is essentially unity - that is, the sum of the squares of .91 and .38 is approximately one.

While the output of this encoder is relatively simple when only one channel is driven, it becomes problematic when both surround inputs are driven at the same time. If we drive the LS and the RS input with the same signal (a common occurrence in film), all the signals at the summing nodes are in phase, so the total level in each output channel is .38 + .91 or 1.29. This output is too strong by the factor of 1.29, or 2.2 dB. Active circuitry is included in the encoder to reduce the value of the function fc by up to 2.2 dB when the two surround channels are similar in level and phase.

Another error occurs when the two surround channels are similar in level and out of phase. In this case the two attenuation factors subtract, so the A and B outputs have equal amplitude and phase, and a level of .91-.38, or .53. This signal will decode as a center direction signal. This error is severe. The previous encoder design produced an unsteered signal under these conditions, which is reasonable. It is not reasonable that signals applied to the rear input terminals should result in a center oriented signal. Thus active circuitry is supplied which increases the value of fs when the two rear channels are similar in level and antiphase. The result of mixing both the real path and the phase shifted path for the rear channels is a 90 degree phase difference between the output channels A and B. This results in an unsteered signal, which is what we want.
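The arithmetic behind the three surround cases discussed above (one surround input driven, both driven in phase, both driven out of phase) can be reproduced with a few lines of Python. This is an illustration only. It takes the rear steering angle for a single driven input as atan(.38/.91), which is an assumption about how the angle is measured, and the function name is hypothetical.

```python
import math

def surround_cases(direct=0.91, crx=0.38):
    """Amplitudes at the A and B outputs for the three surround cases."""
    # One surround input driven: outputs are direct and -crx.
    single_angle = math.degrees(math.atan(crx / direct))  # assumed angle measure
    single_power = direct ** 2 + crx ** 2
    # LS and RS driven with the same signal: the two terms add.
    in_phase = direct + crx
    in_phase_db = 20.0 * math.log10(in_phase)
    # LS and RS equal in level but out of phase: the two terms subtract.
    anti_phase = direct - crx
    return single_angle, single_power, in_phase, in_phase_db, anti_phase

angle, power, in_phase, in_phase_db, anti_phase = surround_cases()
print(angle, power, in_phase, in_phase_db, anti_phase)
```

The in-phase excess of roughly 2.2 dB is the amount by which the active circuitry reduces fc, and the antiphase level of .53 in both outputs is the signal that would otherwise decode toward the center.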

In discussions at the IRT Munich I discovered that there is a European standard surround encoder. This encoder simply attenuates the two surround channels by 3dB, and adds them into the front channels. Thus the left rear channel is attenuated and added to the left front channel. This encoder has many disadvantages when encoding multichannel film sound, or recordings which have specific instruments in the surround channels. Both the loudness and the direction of these instruments will be incorrectly encoded. However this encoder works rather well with classical music, where the two surround channels are primarily reverberation. The 3dB attenuation was carefully chosen through listening tests to produce a stereo compatible encoding. I decided that our encoder should include this 3dB attenuation when classical music was being encoded, and that one could detect this condition through the relative levels of the front channels and the surround channels in the encoder.

A major purpose of the function fc in the surround channels is to reduce the level of the surround channels in the output mix by up to 3dB when the surround channels are much softer than the front channels. Circuitry is provided to compare the front and rear levels. When the rear level is 3dB or more below the front level, the value of fc is reduced, up to a maximum attenuation of 3dB. The maximum attenuation is reached when the rear channels are 8dB weaker than the front channels. This active circuit appears to work well. It makes the new encoder compatible with the European standard encoder for classical music. However, instruments which are intended to be strong in the rear channels are encoded with full level.
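One possible form of this level-dependent attenuation is sketched below. The text gives only three facts: the reduction relates to the rear level being 3dB below the front, the maximum reduction is 3dB, and the maximum is reached when the rear is 8dB below the front. Reading the 3dB point as the onset of attenuation, and ramping linearly in dB between the two points, are both assumptions made for illustration. The function name `surround_attenuation_db` and its parameters are hypothetical.

```python
def surround_attenuation_db(front_db: float, rear_db: float,
                            onset_db: float = 3.0,
                            full_db: float = 8.0,
                            max_atten_db: float = 3.0) -> float:
    """Attenuation (in dB) applied to the surround mix level.

    No attenuation until the rear level is onset_db below the front level,
    full attenuation once it is full_db below, and a linear ramp in dB
    between those points (the ramp shape is an assumption).
    """
    deficit_db = front_db - rear_db
    if deficit_db <= onset_db:
        return 0.0
    if deficit_db >= full_db:
        return max_atten_db
    return max_atten_db * (deficit_db - onset_db) / (full_db - onset_db)


# Rear levels relative to a 0 dB front level.
for rear in (0.0, -3.0, -5.5, -8.0, -20.0):
    print(rear, surround_attenuation_db(0.0, rear))
```

With these assumed parameters, strong surround material passes at full level, while surround material 8dB or more below the front is attenuated by the full 3dB, matching the European standard encoder for that case.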

There is another function of the real coefficient mixing path fs for the surround channels. When a sound is moving from the left front input to the left rear input, active circuitry detects that these two inputs are similar in level and in phase. Under these conditions fc is reduced to zero and fs is increased to one. This change to real coefficients in the encoding results in a more precise decoding of this type of pan. In practice this function is probably not essential, but it seems an elegant refinement.

To summarize: Active circuits are provided to

  1. Reduce the level of the surround channels by up to 2.2dB when the two channels are similar in level and in phase
  2. Increase the real coefficient mixing path for the rear channels sufficiently to create an unsteered condition when the two rear channels are out of phase
  3. Decrease the level of the surround channels by up to 3dB when the surround level is much less than the front levels
  4. Increase the level and negative phase of the rear channels when their level is similar to the front channels.
  5. Make the surround channel mix use real coefficients when a sound source is panning from a front input to the corresponding rear input.

Future improvements to the encoder are likely to include a feature similar to feature 2 above for the front channels. In the current encoder, when the two front channels are out of phase, the encoding will cause the decoder to place the sound in the rear. We intend to detect this condition and make the resulting output unsteered.

Conclusions

This paper has presented the mathematical details of the design of a 5-2-5 matrix encoder and decoder, which is compatible with standard stereo recordings. Results of the work as shown in listening tests have been highly gratifying. While not every detail of original 5 channel recordings is preserved by the matrix, the artistic quality of the mix is well preserved. Both conventional stereo and film sound mixed for the previous standard matrix are enhanced by the new matrix.

References

  1. Patent # 4,862,502 – A four channel matrix surround decoder – David Griesinger, Inventor, 1989
  2. Patent # 5,046,098 – A four channel matrix surround decoder – Douglas Mandel, Inventor
  3. Patent # 5,109,419 – A six channel matrix surround decoder – David Griesinger, Inventor, 1992
  4. Multichannel Matrix Surround Decoders for Two-Eared Listeners – David Griesinger, AES preprint # 4402, October, 1996