In MP3 and AAC coders, the frequency resolution of the polyphase quadrature filterbank is increased using a cascaded MDCT stage. We describe that here, and give the details of the MDCT stage.

MDCT filterbanks

  • Hybrid Filter Banks: In more advanced audio coders such as MPEG “Layer-3” or MPEG “Advanced Audio Coding” (the details of which will be discussed later), the 32-band polyphase quadrature filterbank (PQF) is thought not to give adequate frequency resolution, and so an additional stage of frequency division is cascaded onto the output of the PQF. This additional frequency division is accomplished using the so-called “Modified DCT” (MDCT) filterbank. (See [link].)
    This is a flowchart with general movement to the right, beginning with a single arrow pointing to the right at a large box labeled N-band Polyphase Quadrature Filterbank. From the right edge of this box are a series of arrows that each point at a series of boxes, all labeled MDCT. From each MDCT box there are four arrows of equal length and size pointing to the right, and these groups of arrows are labeled Q-bands.
    Hybrid filterbank scheme used in MPEG Layer-3 (where N = 32 and Q switches between 6 and 18) and MPEG AAC (where N = 4 and Q switches between 128 and 1024).
  • Lapped Transforms: The MDCT is a so-called “lapped transform.” At the encoder, blocks of length 2Q which overlap by Q samples are windowed and transformed, generating Q subband samples each. At the decoder, the Q subband samples are inverse-transformed and windowed. The windowed output samples are overlapped with and added to the previous Q windowed outputs to form the output stream. [link] gives an intuitive view of the coding/decoding operation, while [link] and [link] detail the specific coder/decoder implementations used in the MPEG schemes.
    This is a flowchart that contains two cartesian graphs, each with four peaked waves, and two boxes, with arrows in between the objects showing movement. The first graph is labeled overlapping input windows, and contains four peaks, with bases overlapping so that the beginning of each wave begins at the midpoint of the preceding wave. Below the right half of the horizontal axis are six dashed arrows that point down at a box labeled transform. To the right of this box are four dashed arrows that point to the right at a box labeled inverse transform. Above the inverse transform box are six more dashed arrows that point up at the second graph, which is visually identical to the first graph, except that it is labeled windowed and overlapped outputs.
    A lapped transform.
    This figure is a large flowchart with a general downward direction. It begins with a series of connected boxes labeled across from left to right in a pattern x(mQ - 2Q + 1), x(mQ - 2Q + 2) and so on to x(mQ). Below these boxes is a single arrow labeled with an asterisk that points down at a second row of connected rectangles with the series of labels w(0), w(1), and so on to w(2Q - 1). Below these rectangles is a single small arrow pointing down labeled with an equal sign, and a series of larger arrows pointing down at a large box labeled Cosine Matrix Transformation. The positions in which the larger arrows point at the large box are labeled in a series from j = 0 to j = 2Q - 1. To the right of the box are a series of arrows pointing to the right at the equations that read from top to bottom, i = 0, i = 1, and so on to a final equation, i = Q - 1.
    MDCT filterbank: encoder implementation.
    This figure is a large flowchart that moves generally downward. It begins with a large box labeled Cosine matrix transformation. To the left of this box are a series of arrows pointing at the box that are labeled with the equations, i = 0, i = 1, and so on to i = Q - 1. At the base of this box are the equations j = 0, j = 1, and so on in the series to j = 2Q - 1. From each of these equations in the series at the base are arrows labeled with asterisks pointing at different segments of a long rectangle containing hash marks. Inside the long rectangle is the label w(0) . . . w(2Q - 1). Below this rectangle is a single arrow pointing down, labeled with an equal sign, at two connected rectangles with the same width and same number of hash marks. Each of the connected rectangles is then divided into two segments because the middle hash mark is longer. The segments, from left to right, contain the captions u_m(0) . . . u_m(Q - 1), u_m(Q) . . . u_m(2Q - 1), u_m-1(0) . . . u_m-1(Q-1), and u_m-1(Q) . . . u_m-1(2Q-1). From certain points along these rectangles are arrows pointing at a row of circles containing a plus sign. Below each circle is an arrow pointing down at a final row of connected boxes, labeled u(mQ) to u(mQ + Q - 1).
    MDCT filterbank: decoder implementation.
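The windowing, transform, inverse transform, and overlap-add steps described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the MPEG reference implementation: the text does not spell out the cosine matrix, so the commonly used MDCT kernel cos(π/Q · (n + 1/2 + Q/2) · (k + 1/2)) is assumed, along with a 2/Q scaling in the inverse transform and a sine window carrying a half-sample offset so that it is symmetric. All function names are illustrative.

```python
import math

def sine_window(Q):
    # Assumed window: sine with a half-sample offset, length 2Q (symmetric).
    return [math.sin(math.pi / (2 * Q) * (n + 0.5)) for n in range(2 * Q)]

def mdct(block, window):
    # Window a block of 2Q samples and map it to Q subband samples.
    Q = len(block) // 2
    n0 = 0.5 + Q / 2  # phase offset of the assumed MDCT kernel
    return [
        sum(window[n] * block[n] * math.cos(math.pi / Q * (n + n0) * (k + 0.5))
            for n in range(2 * Q))
        for k in range(Q)
    ]

def imdct(coeffs, window):
    # Map Q subband samples back to 2Q windowed (still aliased) samples.
    Q = len(coeffs)
    n0 = 0.5 + Q / 2
    return [
        window[n] * (2.0 / Q) * sum(
            coeffs[k] * math.cos(math.pi / Q * (n + n0) * (k + 0.5))
            for k in range(Q))
        for n in range(2 * Q)
    ]

def analysis(x, Q, window):
    # Encoder: blocks of length 2Q that overlap by Q samples.
    num_blocks = len(x) // Q - 1
    return [mdct(x[m * Q: m * Q + 2 * Q], window) for m in range(num_blocks)]

def synthesis(frames, Q, window):
    # Decoder: inverse-transform each frame, then overlap and add.
    out = [0.0] * ((len(frames) + 1) * Q)
    for m, coeffs in enumerate(frames):
        for n, value in enumerate(imdct(coeffs, window)):
            out[m * Q + n] += value
    return out

Q = 6
w = sine_window(Q)
x = [math.sin(0.3 * n) + 0.1 * n for n in range(8 * Q)]
y = synthesis(analysis(x, Q, w), Q, w)
max_error = max(abs(a - b) for a, b in zip(x[Q:-Q], y[Q:-Q]))
```

In this sketch the first and last Q output samples are covered by only one block, so their aliasing is not cancelled; only the interior samples are compared when computing `max_error`.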
  • Perfect Reconstruction: Based on the cancellation of time-domain aliasing components, Princen, Johnson, and Bradley show (in ICASSP 87 and TASSP 86 papers) that the MDCT achieves perfect reconstruction when the window $\{w_n\}$ is chosen so that overlapped squared copies sum to one, i.e.,
    $$1 = w_{n+Q}^2 + w_n^2 \quad \text{for } 0 \le n \le Q-1.$$
    The “sine” window
    $$w_n = \sin\left(\frac{\pi}{2Q}\, n\right) \quad \text{for } 0 \le n \le 2Q-1$$
    is one example of a window satisfying this requirement, and it turns out to be the one used in MPEG Layer-3.
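The condition above is easy to check numerically. The following is a small sketch with illustrative function names: it evaluates the largest deviation of w_n² + w_{n+Q}² from one for the sine window as written above (offset 0), and also for a variant with a half-sample offset, which is an assumption added here because that symmetric form is commonly used in practice. A rectangular window is included as a counterexample.

```python
import math

def sine_window(Q, offset=0.0):
    # Sine window of length 2Q; offset=0.0 matches the formula in the text,
    # offset=0.5 is an assumed half-sample-shifted (symmetric) variant.
    return [math.sin(math.pi / (2 * Q) * (n + offset)) for n in range(2 * Q)]

def power_complementarity_error(w):
    # Largest deviation of w[n]^2 + w[n+Q]^2 from 1 over 0 <= n <= Q-1.
    Q = len(w) // 2
    return max(abs(w[n] ** 2 + w[n + Q] ** 2 - 1.0) for n in range(Q))

err_text_window = power_complementarity_error(sine_window(18))
err_offset_window = power_complementarity_error(sine_window(18, offset=0.5))
err_rectangular = power_complementarity_error([1.0] * 36)
```

Both sine variants satisfy the condition because w_{n+Q} is a cosine of the same argument, so the squares sum to one; the rectangular window does not.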
  • Frequency Resolution: With a window length that is only twice the number of transform outputs, we cannot expect very good frequency selectivity. It turns out, however, that this is not a problem. In MPEG Layer-3, sine-window MDCTs appear at the outputs of a 32-band PQF, where frequency selectivity is not a critical issue due to the limited frequency resolution of the human ear. In MPEG AAC, a 4-band PQF in conjunction with an optimized MDCT window function gives frequency selectivity just above that which current psychoacoustic models deem necessary (see M. Bosi et al., "ISO/IEC MPEG-2 Advanced Audio Coding" in JAES Oct 1997).
  • Window Switching: Larger values of Q lead to increased frequency resolution but decreased time resolution. Time resolution matters for the following reason: the error due to the quantization of one MDCT output is spread out over 2QN time-domain output samples. For signals of a transient nature, choosing QN too high leads to audible “pre-echoes.” For less transient signals, on the other hand, the error spread caused by the same value of QN might not be perceptible (and the increased frequency resolution might be very beneficial). Hence, most advanced coding schemes have a provision to switch between different time/frequency resolutions depending on local signal behavior. In MPEG Layer-3, for example, Q switches between 6 and 18. This is accomplished using a sine window of length 36, a sine window of length 12, and intermediate windows which are used to switch between the long and short windows while retaining the perfect reconstruction property. [link] shows an example window sequence.
    This figure is a graph of nine peaked waves, each beginning and ending at the horizontal axis. They have equal amplitudes, but the wavelengths decrease incrementally until the fifth wave, which has the shortest wavelength, and then they increase symmetrically back to the maximum wavelengths of the first and ninth waves. In shape, the waves are not sinusoidal, most resembling a parabolic shape, except for the third and seventh waves, which begin with a wide ascension to maximum amplitude on the outside, continue with a horizontal segment at their local maxima, and then descend sharply with wavelengths comparable to the fourth and sixth waves.
    Example MDCT window sequence for MPEG Layer-3.
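To make the time/frequency trade-off concrete, the arithmetic for the two Layer-3 settings can be worked out as below. This is a rough sketch: the 2QN error spread is the figure quoted in the text, the nominal bandwidth per hybrid band assumes the N·Q bands split the range from 0 to half the sample rate evenly, and the 44.1 kHz sample rate is an assumption chosen only for illustration. The function name is hypothetical.

```python
def hybrid_resolution(N, Q, sample_rate_hz):
    # N PQF bands, each followed by a Q-output MDCT.
    total_bands = N * Q
    spread_samples = 2 * Q * N  # error spread of one quantized MDCT output
    return {
        "total_bands": total_bands,
        # Nominal width of one band, assuming a uniform split of 0..fs/2.
        "band_width_hz": sample_rate_hz / (2 * total_bands),
        "error_spread_samples": spread_samples,
        "error_spread_ms": 1000.0 * spread_samples / sample_rate_hz,
    }

# Assumed sample rate of 44.1 kHz; N = 32 and Q in {18, 6} are from the text.
long_blocks = hybrid_resolution(N=32, Q=18, sample_rate_hz=44100)
short_blocks = hybrid_resolution(N=32, Q=6, sample_rate_hz=44100)
```

Under these assumptions, switching from Q = 18 to Q = 6 shortens the error spread by a factor of three and widens each band by the same factor, which is the trade-off that window switching exploits around transients.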

Source:  OpenStax, An introduction to source-coding: quantization, dpcm, transform coding, and sub-band coding. OpenStax CNX. Sep 25, 2009 Download for free at http://cnx.org/content/col11121/1.2