
3.3 Tutorial

Because this filter serves to enhance the features of an image, it also amplifies noise.

Sobel

Sobel filtering involves convolving the image with two 3-by-3 matrices in order to find the gradient of the image. Although this may be an inaccurate approximation of the gradient, it proves effective for our needs.
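
For reference, a minimal sketch of Sobel filtering with OpenCV is shown below; the input file name is only a placeholder, and our project ultimately settled on the Gaussian filter described next.

import cv2

# Load the scanned page as a grayscale image (file name is a placeholder).
img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)

# Convolve with the two 3x3 Sobel kernels to get horizontal and vertical gradients.
grad_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
grad_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

# Combine the two into an approximate gradient magnitude and scale back to 8-bit.
edges = cv2.convertScaleAbs(cv2.magnitude(grad_x, grad_y))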

After testing with multiple files, we came to the conclusion that the Gaussian filter was best suited to the current data sets. There is a convenient OpenCV function that does this, cv2.GaussianBlur(), with the appropriate parameters: the width and height of the kernel that will be used to filter the image and the standard deviation of the Gaussian in the x and y directions (the greater the standard deviation, the less variance among the pixels after filtering, i.e. greater blur).

Above is our blurred image.
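
A minimal version of the call looks like the following; the 5 x 5 kernel and sigma of 2 are assumed example values, not the project's tuned parameters.

import cv2

# Load the scanned page as a grayscale image (file name is a placeholder).
img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)

# Blur with a 5x5 Gaussian kernel; sigmaX/sigmaY control the spread of the Gaussian.
blurred = cv2.GaussianBlur(img, (5, 5), sigmaX=2, sigmaY=2)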

Feature detection

The next step is to determine where the edges in the image exist. After filtering, there needs to be some way to take each character and define its features as measurable values. Using OpenCV, we call the adaptiveThreshold function to take our image and decide whether a given intensity value in the image should become a 0 or a 1. Effectively, our processed image is now a matrix of binary values. We provide a helper function for this.

In deciding whether a pixel meets the threshold, there are two methods: an adaptive mean filter or a Gaussian. We found the Gaussian worked better. The threshold type should be cv2.THRESH_BINARY_INV, which inverts the values: pixels deemed white become black and vice versa. This is because OpenCV's edge-finding functions look for white characters on black backgrounds.

Above shows the adaptive threshold of the image.
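
Put together, a sketch of the thresholding step looks like the following; the block size of 11 and constant of 2 are assumed tuning values rather than the project's exact settings.

import cv2

img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)    # placeholder file name
blurred = cv2.GaussianBlur(img, (5, 5), 2)             # Gaussian filter from the step above

thresh = cv2.adaptiveThreshold(
    blurred,                          # source image (8-bit, single channel)
    255,                              # value assigned to pixels that pass the threshold
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,   # Gaussian-weighted neighborhood mean
    cv2.THRESH_BINARY_INV,            # invert: characters become white on black
    11,                               # neighborhood block size (assumed)
    2)                                # constant subtracted from the mean (assumed)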

From here, we can use OpenCV's findContours method to find the edges; taking the bounding rectangle of each contour gives the coordinates, width, and height of a rectangle around a character. Given specific properties, some rectangles may be removed, resized, or combined to accommodate special cases.
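
A sketch of this step, continuing from the thresholded image thresh above (note that some OpenCV 3.x releases return an extra value from findContours):

import cv2

# thresh is the binary (inverted) image from the adaptive threshold step.
contours, hierarchy = cv2.findContours(thresh.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

# Each bounding rectangle is (x, y, width, height) around a candidate character.
rects = [cv2.boundingRect(c) for c in contours]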

Afterwards, we call the following two helper functions, findContourAreas() and removeOverlaps(). findContourAreas removes contours in the list contourlist that do not meet a specified minimum height. We use this to ignore small contours, like the tittles on 'i's and 'j's.
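
The project's actual helper lives in the provided files and is not reproduced on this page; the following is only a sketch of the behavior described, with min_height as an assumed parameter name and value.

import cv2

def findContourAreas(contour_list, min_height=10):
    """Keep only contours whose bounding box is tall enough to be a character."""
    kept = []
    for c in contour_list:
        x, y, w, h = cv2.boundingRect(c)
        if h >= min_height:    # drops small marks such as the tittles on 'i' and 'j'
            kept.append((x, y, w, h))
    return kept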

removeOverlaps goes through the rectangles list and returns a list of contours (rectangles) that do not overlap, keeping the largest rectangle when overlap does occur. This is necessary because findContourAreas returns all contours in an image, even those that do not define an actual character. For example, a contour that makes up only part of a letter, like the "o" in "p", would be included in addition to the contour that we want, the one enclosing the "p." We then sort the list so that we can read the remaining letter outlines left to right, top to bottom. We do this with trainingHelper.xsort( countour_list ), which uses a simple algorithm to sort the rectangles into rows and then into columns.
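
Again, these helpers ship with the project files; the sketch below only illustrates the behavior described above, and the overlap test and row tolerance are assumptions rather than the project's exact logic.

def overlaps(a, b):
    """True if two (x, y, w, h) rectangles intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return not (ax + aw <= bx or bx + bw <= ax or ay + ah <= by or by + bh <= ay)

def removeOverlaps(rects):
    """Drop rectangles that overlap a larger one, e.g. the 'o' inside a 'p'."""
    kept = []
    for r in sorted(rects, key=lambda r: r[2] * r[3], reverse=True):
        if not any(overlaps(r, k) for k in kept):
            kept.append(r)
    return kept

def xsort(rects, row_tolerance=10):
    """Group rectangles into rows by y, then read each row left to right."""
    rows = []
    for r in sorted(rects, key=lambda r: r[1]):
        if rows and abs(r[1] - rows[-1][-1][1]) <= row_tolerance:
            rows[-1].append(r)
        else:
            rows.append([r])
    return [r for row in rows for r in sorted(row, key=lambda r: r[0])]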

Above shows the code discovering the regions of interest.

Taking each rectangle, we create a vector to hold the features of each character image. First, we use a resize method to convert each rectangle into an n-by-n matrix of values. Each value is determined by dividing the image into n-by-n cells and taking the average intensity of each cell. From here, we flatten the n x n matrix into a row vector. We can then append other features to this vector to help determine the character, for example the average intensity of the top half and the bottom half of the image.
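
A sketch of building such a feature vector; the grid size n = 10 and the two extra half-image averages are example choices, not necessarily the project's exact features.

import cv2
import numpy as np

def featureVector(gray, rect, n=10):
    """Resize a character's region of interest to n x n cells and flatten it,
    then append the mean intensity of the top and bottom halves."""
    x, y, w, h = rect
    roi = gray[y:y + h, x:x + w]
    # INTER_AREA averages the source pixels that fall into each of the n x n cells.
    cells = cv2.resize(roi, (n, n), interpolation=cv2.INTER_AREA)
    vec = cells.reshape(1, n * n).astype(np.float32)
    halves = np.array([[roi[:h // 2].mean(), roi[h // 2:].mean()]], dtype=np.float32)
    return np.hstack([vec, halves])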

Classification

To classify an image, there needs to be a reference database to compare the characters against. Our provided code builds this through the training method, which uses machine learning. Using multiple sets of pictures, we take an image, run it through all of the steps above, and tell the program what each character should be. When finished, we have training data, which holds the collected feature vectors mapped to specific characters.
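
As an illustration only, the snippet below stands in toy data for the real labelled character vectors and trains OpenCV's built-in k-nearest-neighbor model (cv2.ml.KNearest_create on OpenCV 3 and later; older releases expose cv2.KNearest instead).

import cv2
import numpy as np

# Toy stand-ins for the real training data: each row is one character's feature
# vector (102 values, matching the sketch above) and each response is the ASCII
# code of the character it was labelled as during training.
samples = np.random.rand(300, 102).astype(np.float32)
responses = np.random.choice([float(ord(c)) for c in "0123456789"], 300)
responses = responses.astype(np.float32).reshape(-1, 1)

knn = cv2.ml.KNearest_create()
knn.train(samples, cv2.ml.ROW_SAMPLE, responses)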

Back to the current image: we take its feature vector and use a k-nearest-neighbor algorithm to determine which training vectors are most similar to it. From there, we see what character those vectors are mapped to and return what should be the actual letter or number.
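
Continuing the training sketch above, classifying one character's feature vector then looks like this (k = 3 is an assumed choice of neighbors).

# query would be featureVector(gray, rect) for the character being read;
# a random vector stands in here so the snippet runs on its own toy data.
query = np.random.rand(1, 102).astype(np.float32)

ret, results, neighbours, dists = knn.findNearest(query, k=3)
predicted = chr(int(results[0][0]))    # map the returned code back to a character
print(predicted)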

And there you have it! Thank you for reading through the introduction of OCR. Feel free to look through the files, as there will be more resources to explain this topic.

