<< Chapter < Page
  Auto-tune   Page 1 / 1
Chapter >> Page >
Implementation of auto-tune in MATLAB.

Overview of algorithm

When pitch shifting is mentioned, most people immediately associate it with frequency shifting. Frequency shifting can be easily achieved by simply modulating the input signal by a sinusoid; however, employing such a method creates a ring modulation effect, which is not the desired effect in this case. Thus, pitch shifting and frequency shifting are not the same thing. A true pitch shift can be realized by resampling the input signal. Unfortunately, this method changes the duration of the input signal, which is also not a desired effect. It turns out that a slight modification of the second method (resampling) can be used to accurately pitch shift a signal. In order to implement (crude) auto-tuning, we just need to break up the input signal into small windows and pitch shift each window by an appropriate amount. More sophisticated phase correction algorithms are required to remove the distortions that result.

Below is a schematic that summarizes our implementation.

We will now outline our MATLAB implementation of auto-tuning. The algorithm can be broken down into three major steps:

1. determining the shift ratio for a window

The input signal is first divided into windows of length 256, modulating by Hanning windows. To increase frequency resolution, the window is zero-padded so that its length is 512. The frequency spectrum of each window is then computed using a 512-point FFT. To find the dominant note in the window, the largest peak within a specified frequency range is selected. It does not matter whether we select the peak corresponding to the fundamental frequency or a harmonic since since both are expected to be out of tune by the same ratio. The frequency of the note is easily found from the index of the peak by a linear mapping: the first peak corresponds to a frequency of 0 Hz, and the last peak corresonds to the sampling frequency.

The next step is to find the frequency on the chromatic scale (440 Hz multiplied by integer powers of the twelvth root of 2) that the identified peak needs to be shifted to. To do this, we simply map the identified peak to the closest key on the piano and find the corresponding frequency of the note. The shift ratio is the frequency corresponding to the closest piano key divided by the dominant frequency in the frequency spectrum.

2. pitch shifting a window

To pitch shift a window, we must first stretch/compress the window in time and then resample the window. In order to raise the pitch, we need to expand the window since we would like to resample at a higher frequency; similarly, lowering the pitch requires shrinking the window. For clarity, we will assume for the remainder of the section that we are interesting in raising the pitch for a given window. The steps involved in lowering the pitch are analagous.

In order to expand the window, we subdivide the window into smaller overlapping frames each of length 64, with 75% overlap, modulated by Hanning windows. Thus, each frame begins 16 samples after the previous frame begins. For a window of length 256, this will result in 13 frames. The 13 frames are then spaced out and added together so that the expanded window is larger than the original window by a factor of the shift ratio determined in the previous section.

We have now managed to stretch the window in time, but in doing so we have completely destroyed the linear phase of the window. Thus, the phase must be reconstructed. This is done by taking the FFT of each frame, adding the expected linear phase offset to the FFT coefficients in each frame by looking at the phase difference between the current frame and the previous frame, and finally taking an inverse-FFT to get the corrected frame in the time domain. We used an external package to handle these phase corrections.

To complete the pitch shift, we need to resample the window at a rate higher by a factor of the shift ratio. This is achieved by a simple linear interpolation. Note that the original length of the window is preserved since we have expanded the window and resampled the window using the same ratio.

3. recombining the windows

Finally, the pitch shifted windows are combined together. Currently, there is no phase correction after recombination, and as a result, there is audible distortion in the output. Resolving the phase discrepancies for the entire signal is a rather challenging project since the phase is nonlinear. We encourage others to expand on and improve our implementation of this final stage of the algorithm by adding phase correction.

Questions & Answers

Is there any normative that regulates the use of silver nanoparticles?
Damian Reply
what king of growth are you checking .?
What fields keep nano created devices from performing or assimulating ? Magnetic fields ? Are do they assimilate ?
Stoney Reply
why we need to study biomolecules, molecular biology in nanotechnology?
Adin Reply
yes I'm doing my masters in nanotechnology, we are being studying all these domains as well..
what school?
biomolecules are e building blocks of every organics and inorganic materials.
anyone know any internet site where one can find nanotechnology papers?
Damian Reply
sciencedirect big data base
Introduction about quantum dots in nanotechnology
Praveena Reply
what does nano mean?
Anassong Reply
nano basically means 10^(-9). nanometer is a unit to measure length.
do you think it's worthwhile in the long term to study the effects and possibilities of nanotechnology on viral treatment?
Damian Reply
absolutely yes
how to know photocatalytic properties of tio2 nanoparticles...what to do now
Akash Reply
it is a goid question and i want to know the answer as well
characteristics of micro business
for teaching engĺish at school how nano technology help us
Do somebody tell me a best nano engineering book for beginners?
s. Reply
there is no specific books for beginners but there is book called principle of nanotechnology
what is fullerene does it is used to make bukky balls
Devang Reply
are you nano engineer ?
fullerene is a bucky ball aka Carbon 60 molecule. It was name by the architect Fuller. He design the geodesic dome. it resembles a soccer ball.
what is the actual application of fullerenes nowadays?
That is a great question Damian. best way to answer that question is to Google it. there are hundreds of applications for buck minister fullerenes, from medical to aerospace. you can also find plenty of research papers that will give you great detail on the potential applications of fullerenes.
what is the Synthesis, properties,and applications of carbon nano chemistry
Abhijith Reply
Mostly, they use nano carbon for electronics and for materials to be strengthened.
is Bucky paper clear?
carbon nanotubes has various application in fuel cells membrane, current research on cancer drug,and in electronics MEMS and NEMS etc
so some one know about replacing silicon atom with phosphorous in semiconductors device?
s. Reply
Yeah, it is a pain to say the least. You basically have to heat the substarte up to around 1000 degrees celcius then pass phosphene gas over top of it, which is explosive and toxic by the way, under very low pressure.
Do you know which machine is used to that process?
how to fabricate graphene ink ?
for screen printed electrodes ?
What is lattice structure?
s. Reply
of graphene you mean?
or in general
in general
Graphene has a hexagonal structure
On having this app for quite a bit time, Haven't realised there's a chat room in it.
what is biological synthesis of nanoparticles
Sanket Reply
Got questions? Join the online conversation and get instant answers!
Jobilize.com Reply

Get the best Algebra and trigonometry course in your pocket!

Source:  OpenStax, Auto-tune. OpenStax CNX. Dec 20, 2012 Download for free at http://cnx.org/content/col11474/1.1
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Auto-tune' conversation and receive update notifications?