This module discusses the implementation of a DirectShow filter designed to remove laugh tracks from audio streams. It is part of a series discussing the implementation of a real-time laugh track removal system. A link containing a working version of the filter is provided.

Real-time implementation for laugh track removal

Overview

To make the best use of the Laugh Track Assassinator's algorithm, we need to run it in real time on as wide a range of source material as possible. To accomplish this goal, we implemented a DirectShow filter. DirectShow is Microsoft's technology for manipulating media on the Windows platform. Nearly all media players, such as Windows Media Player, Media Player Classic, and various DVD programs, use DirectShow to render video and audio. Because it is written as a DirectShow filter, our algorithm can be used to manipulate nearly any type of media, be it a DVD, an encoded movie, or a live TV video stream.

DirectShow

All DirectShow operations are based on filters. Filters describe the translation of data from one source or type to another. DirectShow automatically finds the filters needed to play a particular media file and connects them into a filter graph. The generated graph can be visualized in Microsoft's GraphEdit program. Here is what the generated graph looks like for a source video file with the Laugh Track Assassinator filter inserted:

Filter graph

This is the filter graph generated by Microsoft DirectShow with the Laugh Track Assassinator filter already inserted.

DirectShow has inserted an AVI Splitter to split the file data into an audio stream and a video stream. The video stream is sent to the ffdshow Video Decoder filter, whose output is sent to the Video Renderer. The audio stream passes through the MP3 Decoder, an AC3Filter, and the Laugh Track Assassinator, and is finally rendered to the speakers through the DirectSound filter.

To create the DirectShow-compatible filter, we used Microsoft's Windows SDK and rewrote its audio transform filter example. (The Windows SDK can be downloaded from Microsoft.) We then coded the two main steps in our algorithm: a low pass filter and a threshold detection scheme.

Low pass filter

To find a balance between frequency resolution and speed, we chose a 1000-point finite impulse response (FIR) low pass filter. We had Matlab generate the one thousand filter weights, and then we converted them into a C++ format suitable for DirectShow. Since computing one low pass filtered sample requires the 1000 most recent input samples, we created a 1000-point circular buffer that always holds the last 1000 samples of the input.
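The circular-buffer approach described above can be sketched in C++ roughly as follows. This is a minimal illustration, not the code from the actual filter: the class name FirLowPass, its members, and the use of double samples are all assumptions, and the weights would be the 1000 values exported from Matlab rather than the short test vectors used in the usage note.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical FIR helper (illustrative names, not taken from the original filter).
class FirLowPass {
public:
    // taps: filter weights h[0..N-1], e.g. the 1000 weights generated in Matlab.
    explicit FirLowPass(std::vector<double> taps)
        : taps_(std::move(taps)), history_(taps_.size(), 0.0), head_(0) {
        assert(!taps_.empty());
    }

    // Accepts one input sample and returns one filtered sample:
    // y[n] = sum over k of h[k] * x[n-k].
    double process(double x) {
        history_[head_] = x;  // overwrite the oldest sample with the newest
        double acc = 0.0;
        std::size_t idx = head_;  // walk backwards in time from the newest sample
        for (std::size_t k = 0; k < taps_.size(); ++k) {
            acc += taps_[k] * history_[idx];
            idx = (idx == 0) ? history_.size() - 1 : idx - 1;
        }
        head_ = (head_ + 1) % history_.size();  // advance the write position
        return acc;
    }

private:
    std::vector<double> taps_;     // filter weights
    std::vector<double> history_;  // circular buffer of the last N input samples
    std::size_t head_;             // index where the next sample is written
};
```

As a quick check, feeding a single 1.0 followed by zeros into FirLowPass{{1.0, 2.0, 3.0}} returns the weights themselves (the impulse response). The buffer never shifts its contents; only the write index moves, so each call costs one multiply-accumulate per weight.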

Finite state machine

The final step in our removal algorithm requires threshold detection in both amplitude (vertical) and time (horizontal). The requirement for a time-based threshold meant we had to delay the input signal by at least the width of the horizontal threshold. In the end we decided on a 1 second delay, which accommodates the width threshold of 0.8 seconds and makes it easier to resynchronize the video signal with the audio afterward.
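One way to see why the delay must be at least as long as the time threshold is the sketch below. The text does not spell out the exact state machine, so this is an assumed design: the detector counts how long its input stays above the amplitude threshold, and once the run reaches the minimum width it silences the whole run, which is only possible because those samples are still waiting in the delay line. The class name DelayedGate, its parameters, and the choice to output silence are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical delayed gate (an assumed design, not the original implementation).
class DelayedGate {
public:
    // delay:    output delay in samples (e.g. 1 second of audio)
    // minWidth: time threshold in samples (e.g. 0.8 seconds); must fit in the delay
    // level:    amplitude threshold applied to the detector signal
    DelayedGate(std::size_t delay, std::size_t minWidth, double level)
        : audio_(delay, 0.0), muted_(delay, false),
          pos_(0), run_(0), minWidth_(minWidth), level_(level) {
        assert(delay > 0 && minWidth > 0 && minWidth <= delay);
    }

    // x: raw audio sample; env: detector value for the same instant
    // (for example, the low pass filtered signal).
    // Returns the audio sample from `delay` calls ago, or 0.0 if it was flagged.
    double process(double x, double env) {
        double out = muted_[pos_] ? 0.0 : audio_[pos_];  // emit the oldest sample
        audio_[pos_] = x;                                // store the newest sample
        muted_[pos_] = false;
        if (std::fabs(env) > level_) {
            ++run_;
            if (run_ == minWidth_) {
                // The run just became long enough: flag all of it retroactively.
                std::size_t idx = pos_;
                for (std::size_t k = 0; k < minWidth_; ++k) {
                    muted_[idx] = true;
                    idx = (idx == 0) ? audio_.size() - 1 : idx - 1;
                }
            } else if (run_ > minWidth_) {
                muted_[pos_] = true;  // run continues: keep flagging new samples
            }
        } else {
            run_ = 0;  // dropped below the threshold: start counting again
        }
        pos_ = (pos_ + 1) % audio_.size();
        return out;
    }

private:
    std::vector<double> audio_;  // circular delay line of raw samples
    std::vector<bool> muted_;    // per-sample flag, parallel to audio_
    std::size_t pos_;            // slot holding the oldest sample
    std::size_t run_;            // consecutive samples above the threshold
    std::size_t minWidth_;
    double level_;
};
```

Assuming, for illustration, a 44.1 kHz stream, the 1 second delay and 0.8 second width from the text would correspond to DelayedGate(44100, 35280, level). Bursts shorter than the width pass through untouched, because the run counter resets before it reaches the minimum.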

Source:  OpenStax, Elec 301 projects fall 2007. OpenStax CNX. Dec 22, 2007 Download for free at http://cnx.org/content/col10503/1.1