<< Chapter < Page Chapter >> Page >

Joint time-frequency analysis

Short time fourier transform

After our system has isolated the whistle sound, it is input to a continuously running analysis algorithm based on the result of a Short-Time Fourier Transform. While the traditional Fourier transforms do not maintain a sense of time, the STFT allows for joint time-frequency analysis. The formula for the STFT is:

STFT { x [ n ] } = X ( m , w ) = n =-∞ x [ n ] w [ n m ] e jwn size 12{ ital "STFT" lbrace x \[ n \] rbrace =X \( m,w \) = Sum cSub { size 8{n"=-¥"} } cSup { size 8{¥} } {x \[ n \] w \[ n-m \]e rSup { size 8{-jwn} } } } {}

Individual sections of the signal are windowed and their FFT is taken. The result of this operation is a two dimensional function in terms of the offset from zero (“time”) and the frequency content of the signal. According to the Heisenberg Uncertainty Principle, however, we cannot have an arbitrary amount of resolution in both the time and frequency domains. For our application, we are only interested in detecting trends in the frequency with highest power, so time resolution is less important.

Spectrogram

Squaring the magnitude of the STFT results in a 3-D function is known as a spectrogram. The spectrogram of a whistle whose pitch goes from low to high looks like this.

Whistle spectrogram

This is a whistle of increasing frequency. Each slice of the spectrogram is an FFT of a certain windowed chunk of the signal. This embeds a sense of time in the spectrogram.

Notice that there is a dominant power (dark red) that shows a very clear upward trend over time. Each slice of the frequency axis at a particular time corresponds to a single chunk of the signals’FFTs. The dark red corresponds to the maximum of one of these FFTs, and as time passes (in the positive vertical direction), we see the frequency of highest power increases. Below is a graph of three of these component FFTs, illustrating how the peak frequency increases across time. Since pitch and frequency are essentially synonymous, we can determine what kind of whistle is input by looking at trends in the frequency with the dominant power.

Power vs. frequency for 3 spectrogram components

Here are 3 of the FFTs that comprise the spectrogram above. The frequency of maximum power is increasing.

With the dominant frequency for any given windowed input now known, our system then takes the discrete-time derivative of that frequency over the window. Calculus tells us that the derivative of a function at a point is positive for increasing functions and negative for decreasing functions. By continuously taking derivatives of these windows our system can track the basic shape of the spectrogram.

Analysis using the stft

Our analysis algorithm keeps a running buffer of the signs of these discrete-time derivatives. It then takes the magnitude signs inside the buffer. If the buffer encounters a number of continuous signs above or below a certain threshold, either positive or negative, it concludes that the input is a whistle with increasing or decreasing pitch, respectively. Finding the optimal threshold value is mostly trial and error; we looked at recordings of sine waves with constant frequencies and white noise in order to determine a reasonable upper bound for the area under the derivative curve.

There is a tradeoff between the size of the buffer and the quality of the analysis. Given that a whistle may last up to three seconds, it’s clear that the buffer need not contain that many samples in order to find and characterize the whistle. But on the same token, too few samples will result in an unusual number of false positives and change the song without any user interaction. And while this is not necessarily complex math, doing it quickly and continuously requires the window be as small as possible.

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, Elec 301 projects fall 2007. OpenStax CNX. Dec 22, 2007 Download for free at http://cnx.org/content/col10503/1.1
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Elec 301 projects fall 2007' conversation and receive update notifications?

Ask