
I. Introduction

The ever-improving computational power of mobile devices has allowed many previously separate functions to be combined: the cell phone is now also a TV, a camera, a wallet full of credit cards, and more. We envision a phone camera that goes beyond simply capturing images and videos: with the help of back-end algorithms, this photography application will capture the image at the perfect moment when one or more people are jumping in the air.

II. Motivation

More often than not, when we try to capture our excitement in a photo, we choose the gesture that conveys the message most effectively: a smiling face, hands in the air, feet off the ground. Yet we stay in mid-air so briefly that a perfectly timed photo is hard to take. That is why we developed this algorithm, one that captures the perfect photo of feet-off-the-ground ballerinas, cheerleaders, and supermen. Moreover, we have not been able to find an application on either the Apple App Store or the Android app store that does a similar trick.

III. Goal

The goal of this application is to capture the whole process of jumping and landing as a series of photos and then choose the frame(s) in which a certain reference point on the subject (e.g. a foot) is at a certain coordinate (e.g. highest off the ground) in the photo. Among the frames chosen, the algorithm then compares image quality and chooses the clearest image. The algorithm will handle both a single person jumping, capturing the clear frame in which the feet are at their highest point above the ground, and multiple people jumping together, capturing the clear frame in which everyone's distance from the ground is well balanced and aligned. After the selection is made, the extra frames will be deleted to reduce RAM usage. In addition, we will integrate the algorithm into an iOS device and aim for the best performance we can achieve in terms of both memory and speed.
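The two-stage selection described here (highest reference point first, then the clearest image) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how image clarity is measured, so the variance of a Laplacian filter is swapped in as a common sharpness score. The names `laplacian_variance`, `pick_frame`, and `tolerance` are hypothetical, and frames are represented as plain lists of grayscale rows rather than real camera buffers.

```python
def laplacian_variance(gray):
    """Sharpness score: variance of a 4-neighbour Laplacian over interior pixels.

    `gray` is a grayscale image given as a list of equal-length rows.
    The choice of this metric is an assumption, not taken from the project.
    """
    height, width = len(gray), len(gray[0])
    values = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            values.append(lap)
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pick_frame(frames, foot_rows, tolerance=2):
    """Return the index of the sharpest frame among those near the jump peak.

    foot_rows[i] is the image row of the reference point (e.g. a foot) in
    frame i. Image rows grow downward, so the smallest row is the highest
    point. `tolerance` (hypothetical, in pixels) decides which frames count
    as being at the peak.
    """
    peak_row = min(foot_rows)
    candidates = [i for i, row in enumerate(foot_rows)
                  if row - peak_row <= tolerance]
    return max(candidates, key=lambda i: laplacian_variance(frames[i]))


# Synthetic demo: a flat (featureless) image stands in for a blurred frame,
# a checkerboard stands in for a sharp one.
flat = [[128] * 6 for _ in range(6)]
sharp = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]
frames = [flat, flat, sharp, flat]
foot_rows = [50, 21, 20, 40]
chosen = pick_frame(frames, foot_rows)
```

In this demo frames 1 and 2 are both within the tolerance of the peak, and frame 2 is chosen because its sharpness score is higher. The frames that are not chosen could then be discarded, in line with the memory goal stated above.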

IV. Potential Problems

  1. The current algorithm, which focuses on recognition, might not be able to capture a rapidly moving target, potentially leaving important frames out of consideration.
  2. Recognition might be distracted by unwanted objects within the frame.
  3. A complicated image recognition algorithm might strain the processing power of a cell phone.

V. Summary of Approach

The input to our program can be either a photo stream or a video featuring one or more people jumping. If multiple people are jumping, we also ask the user to specify the number of jumpers. First, the program runs face and upper-body detection to locate the people in each image. However, since the face/body detection algorithm is not perfect, non-human objects might be recognized as faces and real faces might go undetected. To deal with this problem, we perform a denoising process in two parts: the first eliminates background noise that is identified as faces (false positives), and the second estimates the positions of unrecognized faces (false negatives). With the noise properly reduced, we expect the program to recognize in each image exactly as many faces as the number of jumpers specified by the user.

Then we plot the position-time and velocity-time curves for everyone's face and select as the best image the frame in which the jumping height is greatest and/or the velocity is nearly 0. When multiple people jump, we take as the optimal shot the frame in which everyone is in the air and the average jumping height is largest. In other words, the program selects the frame in which everyone jumps the best, i.e. each person is near the highest point of his/her trajectory and there is no blur in the image.

The back-end algorithms continuously track each person by locating his/her face and body within each frame and then calculate the relative displacement of the tracked object from frame to frame. As we know from Newton's laws of motion, an object at the peak of its trajectory has zero vertical velocity. A very small frame-to-frame displacement of the face/body therefore indicates that the person may be at the highest point of the trajectory, which makes that frame a candidate for output.
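As a rough sketch of the frame-selection step, the code below takes already-denoised face positions (one image row per person per frame), converts them to jump heights, and picks the frame in which everyone is airborne and the average height is largest. It also estimates the vertical velocity by finite differences so that the "velocity nearly 0" condition can be checked. The function names, the `ground_rows` input (each person's face row while standing), and the `min_height` threshold are assumptions made for illustration; the project's actual detection and tracking code is not shown in the text.

```python
def velocities(heights):
    """Finite-difference velocity in pixels per frame.

    Uses a central difference in the interior and a one-sided difference
    at the first and last frame.
    """
    n = len(heights)
    if n < 2:
        raise ValueError("need at least two frames")
    result = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        result.append((heights[hi] - heights[lo]) / (hi - lo))
    return result


def jump_heights(face_rows, ground_rows):
    """Convert image rows (growing downward) to heights above standing level."""
    return [[ground - row for row in rows]
            for rows, ground in zip(face_rows, ground_rows)]


def best_frame(face_rows, ground_rows, min_height=5):
    """Index of the frame where everyone is airborne and mean height is largest.

    face_rows[p][t] is the face row of person p in frame t.
    Returns None if no frame has every person at least `min_height`
    pixels (a hypothetical threshold) above standing level.
    """
    heights = jump_heights(face_rows, ground_rows)
    n_frames = len(heights[0])
    airborne = [t for t in range(n_frames)
                if all(h[t] >= min_height for h in heights)]
    if not airborne:
        return None
    return max(airborne,
               key=lambda t: sum(h[t] for h in heights) / len(heights))


# Synthetic demo with two jumpers who peak one frame apart.
rows_a = [100, 90, 80, 74, 72, 74, 80, 90, 100]
rows_b = [120, 120, 110, 100, 93, 90, 93, 100, 110]
frame = best_frame([rows_a, rows_b], [100, 120])
speed_b = velocities(jump_heights([rows_b], [120])[0])[frame]
```

Here jumper A peaks at frame 4 and jumper B at frame 5; the average height is largest at frame 5, where B's estimated vertical velocity is zero. For a single jumper the same function reduces to picking the frame of maximum height.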





Source:  OpenStax, Elec 301 projects fall 2015. OpenStax CNX. Jan 04, 2016 Download for free at https://legacy.cnx.org/content/col11950/1.1