  • Intel Core i5 4200U @ 2.6 GHz (2C, 4T)
  • 8 GB of DDR3 memory @ 1600 MHz
  • NVIDIA GeForce GTX 860M (1152 CUDA cores, 1029 MHz core clock, 2500 MHz memory clock, 2 GB GDDR5 video memory)

Separable convolution

The results below were obtained by convolving the same image, resized to various dimensions, with a separable Gaussian filter of kernel size k = 3.
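To make the benchmarked operation concrete, here is a minimal serial sketch of separable convolution in plain Python. It is not the CUDA kernel that was measured. The binomial weights [0.25, 0.5, 0.25] are an assumed approximation of a 3-tap Gaussian, zero padding at the borders is also an assumption, and all function names are illustrative.

```python
# Assumed 3-tap Gaussian approximation (binomial weights); not taken from the source.
GAUSS_1D = [0.25, 0.5, 0.25]
# Equivalent 3 x 3 kernel: the outer product of the 1-D kernel with itself.
GAUSS_2D = [[a * b for b in GAUSS_1D] for a in GAUSS_1D]


def convolve_rows(image, kernel):
    """Filter every row with a 1-D kernel, treating out-of-range pixels as zero.

    The kernel is symmetric, so correlation and convolution give the same result.
    """
    r = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xx = x + k - r
                if 0 <= xx < width:
                    acc += weight * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def separable_convolve(image, kernel):
    """Row pass, then column pass (done as a row pass on the transpose)."""
    row_pass = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(row_pass), kernel))


def full_convolve(image, kernel2d):
    """Direct 2-D filtering with zero padding, used only for comparison."""
    r = len(kernel2d) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for j, kernel_row in enumerate(kernel2d):
                for i, weight in enumerate(kernel_row):
                    yy, xx = y + j - r, x + i - r
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += weight * image[yy][xx]
            out[y][x] = acc
    return out
```

Calling separable_convolve(image, GAUSS_1D) gives the same output as full_convolve(image, GAUSS_2D), but each pixel costs 2k = 6 multiplications instead of k^2 = 9 for k = 3.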

Separable Convolution Speedup Results

Fig. 26. visualization of separable convolution speedup versus pixel count

Non-maximum suppression and selective thresholding

As discussed previously, our CUDA implementation of non-maximum suppression and selective thresholding is split into two parallel tasks that run one after the other: computation of the gradient magnitude and angle, followed by the actual edge selection algorithm. The computations for the speedup numbers below reflect the full procedure of calculating the gradient’s magnitude and angle, selecting edges, and finally copying the results to host memory.
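The two tasks can be illustrated with a short serial sketch in plain Python. It is a hedged reference, not the CUDA code that was benchmarked: the four-direction quantization of the gradient angle, the single fixed threshold (standing in for the selective thresholding step, whose exact rule is not restated here), and all function names are assumptions.

```python
import math


def gradient_magnitude_and_angle(gx, gy):
    """First task: per-pixel gradient magnitude and direction.

    gx and gy are the horizontal and vertical derivative images.
    Angles are in degrees, folded into the range 0 to 180.
    """
    height, width = len(gx), len(gx[0])
    magnitude = [[math.hypot(gx[y][x], gy[y][x]) for x in range(width)]
                 for y in range(height)]
    angle = [[math.degrees(math.atan2(gy[y][x], gx[y][x])) % 180.0
              for x in range(width)]
             for y in range(height)]
    return magnitude, angle


def neighbor_offsets(angle_deg):
    """Quantize the direction to four axes; return the (dy, dx) step along it."""
    if angle_deg < 22.5 or angle_deg >= 157.5:
        return (0, 1)    # horizontal gradient: compare left and right neighbors
    if angle_deg < 67.5:
        return (1, 1)    # diagonal, down and to the right
    if angle_deg < 112.5:
        return (1, 0)    # vertical gradient: compare upper and lower neighbors
    return (1, -1)       # diagonal, down and to the left


def suppress_and_threshold(magnitude, angle, threshold):
    """Second task: keep a pixel only if it is a local maximum along its
    gradient direction and its magnitude reaches the (assumed) threshold.
    Border pixels are left at 0."""
    height, width = len(magnitude), len(magnitude[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            dy, dx = neighbor_offsets(angle[y][x])
            m = magnitude[y][x]
            if (m >= threshold
                    and m >= magnitude[y + dy][x + dx]
                    and m >= magnitude[y - dy][x - dx]):
                edges[y][x] = 1
    return edges
```

Each output pixel depends only on a small, fixed neighborhood of the input, which is why both tasks map naturally onto one GPU thread per pixel.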

SPEEDUP OF NON-MAXIMUM SUPPRESSION AND SELECTIVE THRESHOLDING

Fig. 27. visualization of non-maximum suppression and selective thresholding speedup versus pixel count

Thresholded difference density matrix and motion area estimation

The results below reflect the speedup in the complete procedure of calculating the difference matrix, determining the thresholded difference matrix D, building the difference density matrix D', and finally generating an image representing the estimated motion area.
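As a rough illustration of this procedure, the following plain-Python sketch computes a thresholded difference, a windowed density, and a binary motion mask. The specific choices are assumptions made for the example rather than the authors' definitions: the difference is taken as an absolute per-pixel difference, the density is a count over a square window of the given radius, and the motion area is every pixel whose density reaches min_density. All names are hypothetical.

```python
def thresholded_difference(frame_a, frame_b, threshold):
    """1 where the two inputs differ by more than `threshold`, else 0."""
    return [[1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def difference_density(diff, radius):
    """For each pixel, count the differing pixels in the surrounding
    (2 * radius + 1) square window, clipped at the image borders."""
    height, width = len(diff), len(diff[0])
    density = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += diff[yy][xx]
            density[y][x] = total
    return density


def motion_mask(density, min_density):
    """Binary image marking pixels whose local density is high enough."""
    return [[1 if d >= min_density else 0 for d in row] for row in density]
```

With a 2 x 2 block of changed pixels and one isolated changed pixel, radius = 1 and min_density = 3 keep the block and reject the isolated pixel, which is the purpose of using density rather than the raw difference.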

SPEEDUP OF MOTION AREA ESTIMATION

Fig. 28. visualization of motion area estimation speedup versus pixel count

Facial recognition

The results below reflect the speedup achieved using OpenCV’s object detection framework.

SPEEDUP OF FACIAL RECOGNITION

Fig. 29. visualization of facial recognition speedup versus pixel count

Real-time motion detection performance

Our real-time motion detection implementation continuously repeats the following procedure in an infinite loop:

  1. Read two frames F1 and F2 from the camera in succession
  2. Apply a Gaussian low-pass filter of kernel size k = 3 to both frames
  3. Apply a Sobel edge filter in both directions
  4. Calculate the edges E1 and E2 from the image with non-maximum suppression and selective thresholding
  5. Calculate the difference matrix D and difference density matrix D'
  6. Draw an image representing the portions of the image with motion
  7. Update a live preview of F1, E1, D', and the motion area estimate

Table VII shows basic statistics on the frame rate results achieved on a 640 x 480 continuous video stream, running on the same hardware as the benchmarks in the previous section. The frame rate was evaluated on a 10-second video stream in a well-lit environment with motion of a moderate number of edge pixels in front of the camera. All procedures (as described in this paper) were executed in parallel on the GPU.
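One way such statistics could be derived is sketched below in plain Python. It assumes a timestamp (in seconds) is recorded each time the loop finishes an iteration. Which statistics the table actually reports is not restated here, so the mean, minimum, maximum, and standard deviation are illustrative choices, and the function name is hypothetical.

```python
import statistics


def frame_rate_statistics(timestamps):
    """Summarize frame rates from loop-completion timestamps in seconds.

    Assumes the timestamps are strictly increasing, one per loop iteration.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    rates = [1.0 / dt for dt in intervals]  # instantaneous rate per iteration
    return {
        # Overall rate: completed iterations divided by elapsed time.
        "mean_fps": len(intervals) / (timestamps[-1] - timestamps[0]),
        "min_fps": min(rates),
        "max_fps": max(rates),
        "stdev_fps": statistics.pstdev(rates),
    }
```

In a live run the timestamps could come from time.perf_counter(), called once at the end of step 7 in each iteration.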

REAL-TIME FRAME RATE STATISTICS

Conclusion

Our CUDA implementation of both the edge detection and motion detection algorithms demonstrates that parallelized GPU computation yields significant speedups over a serial CPU implementation. In all benchmarked cases (separable convolution with a Gaussian filter, edge detection via non-maximum suppression and selective thresholding, and motion area estimation from a difference density map), the GPU implementation, running on a mid-range graphics card, achieved a speedup of between 1.5x and 4.6x over a relatively high-performance, overclocked CPU. Furthermore, we observe a similar speedup trend when comparing the existing OpenCV CPU and CUDA GPU implementations of Haar-based facial recognition. In all cases, the speedup levels off toward a roughly constant range as the input size increases.

Source:  OpenStax, Elec 301 projects fall 2015. OpenStax CNX. Jan 04, 2016 Download for free at https://legacy.cnx.org/content/col11950/1.1