In MP3 and AAC coders, the frequency resolution of the polyphase quadrature filterbank is increased using a cascaded MDCT stage. We describe that here, and give the details of the MDCT stage.

MDCT filterbanks

  • Hybrid Filter Banks: In more advanced audio coders such as MPEG “Layer-3” or MPEG “Advanced Audio Coding” (the details of which will be discussed later), the 32-band polyphase quadrature filterbank (PQF) is considered not to give adequate frequency resolution, so an additional stage of frequency division is cascaded onto the output of the PQF. This additional frequency division is accomplished using the so-called “Modified DCT” (MDCT) filterbank. (See [link].)
    This is a flowchart with general movement to the right, beginning with a single arrow pointing to the right at a large box labeled N-band Polyphase Quadrature Filterbank. From the right edge of this box are a series of arrows that each point at a series of boxes, all labeled MDCT. From each MDCT box there are four arrows of equal length and size pointing to the right, and these groups of arrows are labeled Q-bands.
    Hybrid filterbank scheme used in MPEG Layer-3 (where N = 32 and Q switches between 6 and 18) and MPEG AAC (where N = 4 and Q switches between 128 and 1024).
  • Lapped Transforms: The MDCT is a so-called “lapped transform.” At the encoder, blocks of length 2Q which overlap by Q samples are windowed and transformed, generating Q subband samples each. At the decoder, the Q subband samples are inverse-transformed into 2Q samples and windowed. The windowed output samples are overlapped with and added to the previous Q windowed outputs to form the output stream. [link] gives an intuitive view of the coding/decoding operation, while [link] and [link] give the specific coder/decoder implementations used in the MPEG schemes.
    This is a flowchart that contains two cartesian graphs, each with four peaked waves, and two boxes, with arrows in between the objects showing movement. The first graph is labeled overlapping input windows, and contains four peaks, with bases overlapping so that the beginning of each wave begins at the midpoint of the preceding wave. Below the right half of the horizontal axis are six dashed arrows that point down at a box labeled transform. To the right of this box are four dashed arrows that point to the right at a box labeled inverse transform. Above the inverse transform box are six more dashed arrows that point up at the second graph, which is visually identical to the first graph, except that it is labeled windowed and overlapped outputs.
    A lapped transform.
    This figure is a large flowchart with a general downward direction. It begins with a series of connected boxes labeled across from left to right in a pattern x(mQ - 2Q + 1), x(mQ - 2Q + 2) and so on to x(mQ). Below these boxes is a single arrow labeled with an asterisk that points down at a second row of connected rectangles with the series of labels w(0), w(1), and so on to w(2Q - 1). Below these rectangles is a single small arrow pointing down labeled with an equal sign, and a series of larger arrows pointing down at a large box labeled Cosine Matrix Transformation. The positions in which the larger arrows point at the large box are labeled in a series from j = 0 to j = 2Q - 1. To the right of the box are a series of arrows pointing to the right at the equations that read from top to bottom, i = 0, i = 1, and so on to a final equation, i = Q - 1.
    MDCT filterbank: encoder implementation.
    This figure is a large flowchart that moves generally downward. It begins with a large box labeled Cosine matrix transformation. To the left of this box are a series of arrows pointing at the box that are labeled with the equations, i = 0, i = 1, and so on to i = Q - 1. At the base of this box are the equations j = 0, j = 1, and so on in the series to j = 2Q - 1. From each of these equations in the series at the base are arrows labeled with asterisks pointing at different segments of a long rectangle containing hash marks. Inside the long rectangle is the label w(0) . . . w(2Q - 1). Below this rectangle is a single arrow pointing down, labeled with an equal sign, at two connected rectangles with the same width and same number of hash marks. Each of the connected rectangles is then divided into two segments because the middle hash mark is longer. The segments, from left to right, contain the captions u_m(0) . . . u_m(Q - 1), u_m(Q) . . . u_m(2Q - 1), u_m-1(0) . . . u_m-1(Q - 1), and u_m-1(Q) . . . u_m-1(2Q - 1). From certain points along these rectangles are arrows pointing at a row of circles containing a plus sign. Below each circle is an arrow pointing down at a final row of connected boxes, labeled u(mQ) to u(mQ + Q - 1).
    MDCT filterbank: decoder implementation.
  • Perfect Reconstruction: Based on the cancellation of time-domain aliasing components, Princen, Johnson, and Bradley show (in ICASSP 87 and TASSP 86 papers) that the MDCT achieves perfect reconstruction when the window $\{w_n\}$ is chosen so that overlapped squared copies sum to one, i.e.,
    $$1 = w_{n+Q}^2 + w_n^2 \quad \text{for } 0 \le n \le Q-1.$$
    The “sine” window
    $$w_n = \sin\left(\frac{\pi}{2Q}\, n\right) \quad \text{for } 0 \le n \le 2Q-1$$
    is one example of a window satisfying this requirement, and it turns out to be the one used in MPEG Layer-3.
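The perfect-reconstruction claim can be checked numerically. The following is a minimal sketch, not the MPEG reference implementation: the cosine-kernel phase (n + 1/2 + Q/2), the 2/Q synthesis scaling, and all function names are assumptions of this example, chosen as one common MDCT convention. That convention also needs a window that is symmetric about the block centre, so the sketch uses the half-sample-offset sine window sin(π(n + 1/2)/(2Q)), which satisfies the same overlapped-squares condition.

```python
import math


def sine_window(Q):
    # Half-sample-offset sine window of length 2Q (assumption of this sketch):
    # symmetric about the block centre, with w[n]^2 + w[n+Q]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / (2 * Q)) for n in range(2 * Q)]


def window_condition_error(w):
    # Largest deviation of w[n]^2 + w[n+Q]^2 from 1 over 0 <= n <= Q-1.
    Q = len(w) // 2
    return max(abs(w[n] ** 2 + w[n + Q] ** 2 - 1.0) for n in range(Q))


def _kernel(n, k, Q):
    # Cosine kernel shared by analysis and synthesis (one common convention).
    return math.cos(math.pi / Q * (n + 0.5 + Q / 2) * (k + 0.5))


def mdct(block, w):
    # Window a block of 2Q samples and transform it to Q coefficients.
    Q = len(block) // 2
    return [sum(w[n] * block[n] * _kernel(n, k, Q) for n in range(2 * Q))
            for k in range(Q)]


def imdct(coeffs, w):
    # Inverse-transform Q coefficients to 2Q samples and window them again.
    Q = len(coeffs)
    return [2.0 / Q * w[n] * sum(coeffs[k] * _kernel(n, k, Q) for k in range(Q))
            for n in range(2 * Q)]


def round_trip(x, Q):
    # Encode and decode x (length must be a multiple of Q) by overlap-add.
    # Q zeros are padded on each side so every sample is covered by two blocks.
    w = sine_window(Q)
    padded = [0.0] * Q + list(x) + [0.0] * Q
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * Q + 1, Q):  # hop of Q samples
        y = imdct(mdct(padded[start:start + 2 * Q], w), w)
        for n in range(2 * Q):
            out[start + n] += y[n]  # overlap-add with the previous block
    return out[Q:Q + len(x)]
```

For any input whose length is a multiple of Q, `round_trip` should return the input to within floating-point rounding, even though each block of 2Q samples is reduced to only Q coefficients and cannot be inverted on its own. The aliasing left in one block is cancelled by the neighbouring block during overlap-add.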
  • Frequency Resolution: With a window length that is only twice the number of transform outputs, we cannot expect very good frequency selectivity. It turns out, however, that this is not a problem. In MPEG Layer-3, sine-window MDCTs appear at the outputs of a 32-band PQF, where frequency selectivity is not a critical issue due to the limited frequency resolution of the human ear. In MPEG AAC, a 4-band PQF in conjunction with an optimized MDCT window function gives frequency selectivity just above that which current psychoacoustic models deem necessary (see M. Bosi et al., "ISO/IEC MPEG-2 Advanced Audio Coding," JAES, Oct. 1997).
  • Window Switching: Larger values of Q lead to increased frequency resolution but decreased time resolution. Time resolution is limited because the error due to the quantization of one MDCT output is spread out over 2QN time-domain output samples. For signals of a transient nature, choosing QN too high leads to audible “pre-echoes.” For less transient signals, on the other hand, the error spread produced by the same value of QN might not be perceptible (and the increased frequency resolution might be very beneficial). Hence, most advanced coding schemes have a provision to switch between different time/frequency resolutions depending on local signal behavior. In MPEG Layer-3, for example, Q switches between 6 and 18. This is accomplished using a sine window of length 36, a sine window of length 12, and intermediate windows which are used to switch between the long and short windows while retaining the perfect reconstruction property. [link] shows an example window sequence.
    This figure is a graph of nine peaked waves, each beginning and ending at the horizontal axis. They have equal amplitudes, but the wavelengths decrease incrementally until the fifth wave, which has the shortest wavelength, and then they increase symmetrically back to the maximum wavelengths of the first and ninth waves. In shape, the waves are not sinusoidal, most resembling a parabolic shape, except for the third and seventh waves, which begin with a wide ascension to maximum amplitude on the outside, continue with a horizontal segment at their local maxima, and then descend sharply with wavelengths comparable to the fourth and sixth waves.
    Example MDCT window sequence for MPEG Layer-3.
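As a rough worked example of the time-resolution trade-off, the sketch below evaluates the 2QN spread quoted above for the two Layer-3 settings. The 44.1 kHz sampling rate and the function name are assumptions introduced for illustration; the text does not fix a sampling rate.

```python
def error_spread(Q, N, fs_hz):
    # Number of time-domain output samples over which the quantization error
    # of one MDCT output is spread (2*Q*N, as stated in the text), and the
    # same span expressed in milliseconds at sampling rate fs_hz.
    samples = 2 * Q * N
    return samples, 1000.0 * samples / fs_hz


FS_HZ = 44100  # assumed sampling rate, for illustration only

long_samples, long_ms = error_spread(18, 32, FS_HZ)   # Layer-3, long windows
short_samples, short_ms = error_spread(6, 32, FS_HZ)  # Layer-3, short windows
```

For Layer-3 (N = 32) this gives 1152 samples with Q = 18 and 384 samples with Q = 6, or roughly 26 ms versus 8.7 ms at the assumed 44.1 kHz. Switching to the short windows around a transient therefore cuts the span over which quantization error can spread by a factor of three.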

Source:  OpenStax, An introduction to source-coding: quantization, dpcm, transform coding, and sub-band coding. OpenStax CNX. Sep 25, 2009 Download for free at http://cnx.org/content/col11121/1.2