Blog

5/4/2019

The project is coming to a close. A week ago, I gave my presentation, which I think went well. Two days ago I had my defense, which also went well. I am now getting everything ready for submission and preparing for graduation. The semester went so fast and I can't believe it is coming to a close already. Take a look at the Project section to see my completed program!

4/23/2019

This past week, I worked to get my black and white filter working. I had been having problems because I was trying to access the pixels in a BitmapImage, which were stored in a format that I was having a hard time figuring out. After meeting with Dr. Pankratz and Dr. McVey, I found that Bitmap objects have easy accessor functions for reading pixels and their colors. So, instead of continuing to struggle with the BitmapImage, I converted my BitmapImage to a Bitmap, turned the image black and white, and then converted it back into a BitmapImage. This worked out great, and I have the resulting image below. I also made some changes to my interface, including adding a button for directions and radio buttons for either a black and white image or a reverse black and white image (so that if the text is white on black, it can still become black on white). I am also preparing for my presentation, and am looking forward to wrapping up my work.
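As a rough illustration of the black and white filter and its reverse option, here is a minimal sketch in Python rather than the C# the app is written in. It is not the app's actual code: the function name, the fixed threshold of 128, and the simple averaging of the color channels are all assumptions made for the example.

```python
def to_black_and_white(pixels, threshold=128, invert=False):
    """Turn rows of (r, g, b) tuples into rows of 0 (black) or 255 (white).

    threshold and invert are hypothetical parameters for this sketch.
    """
    result = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            gray = (r + g + b) // 3          # simple average as the gray level
            is_light = gray >= threshold     # light pixels become white
            if invert:
                is_light = not is_light      # reverse: white text on black becomes black on white
            out_row.append(255 if is_light else 0)
        result.append(out_row)
    return result


# One row with a black pixel and a white pixel.
row = [[(0, 0, 0), (255, 255, 255)]]
normal = to_black_and_white(row)
reversed_image = to_black_and_white(row, invert=True)
```

With the reverse option set, the dark pixel comes out white and the light pixel comes out black, which is the behavior the second radio button is meant to give.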

4/14/2019

This week I was able to get the crop function working and add a button that analyzes the image after the crop and gray scale are applied, for a more accurate read of the characters. I have been working on turning the image black and white for more accuracy; however, so far it just makes the image more pixelated. This next week I plan to continue working on this, as well as on any other methods to improve accuracy that don't require a lot of time.

4/7/2019

The past two weeks have been a bit of a struggle for me. I began by trying to crop my image, which is still not working correctly. My first approach was to white out everything but the selected area. This failed for multiple reasons: I was using a one-dimensional array to store and access the image, and I was using screen coordinates rather than image (pixel) coordinates. Using this method, my image looked like the image below on the left. Since then, I have switched to using a CroppedBitmap. I have found a way to translate the screen coordinates into pixel coordinates, but for some reason the math doesn't seem to be working in the program at the moment, even though it works when I write it out by hand. I am currently working on figuring out that issue. I was able to create a gray scale image, and next I will work on turning it black and white. Once I am able to do that, I will create options for black and white and the reverse, so that we can always get black text on a white background. I also plan to add a button to read and output the text, and to redesign the look of my app.
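The screen-to-pixel translation and the one-dimensional array lookup can be sketched as below. This is a Python illustration under stated assumptions, not the math the app actually uses: it assumes the image is stretched to fill the on-screen control exactly, that the mouse position is measured from the control's top-left corner, and that pixel rows are stored back to back with no padding. All function and parameter names are hypothetical.

```python
def screen_to_pixel(screen_x, screen_y, displayed_w, displayed_h, pixel_w, pixel_h):
    """Map a point on the displayed image to a pixel in the underlying image."""
    scale_x = pixel_w / displayed_w      # image pixels per screen unit, horizontally
    scale_y = pixel_h / displayed_h      # image pixels per screen unit, vertically
    px = int(screen_x * scale_x)
    py = int(screen_y * scale_y)
    # Clamp so a click on the far edge still lands inside the image.
    px = max(0, min(px, pixel_w - 1))
    py = max(0, min(py, pixel_h - 1))
    return px, py


def flat_index(px, py, pixel_w, bytes_per_pixel=4):
    """Offset of pixel (px, py) in a one-dimensional, row-by-row byte array.

    Assumes no padding at the end of each row; real bitmap buffers may
    use a larger row stride.
    """
    return (py * pixel_w + px) * bytes_per_pixel


# A 1600x1200 image shown in a 400x300 control: each screen unit covers 4 pixels.
pixel = screen_to_pixel(100, 50, 400, 300, 1600, 1200)
offset = flat_index(pixel[0], pixel[1], 1600)
```

Using screen coordinates directly as array positions skips both steps, which would select the wrong region whenever the displayed size differs from the pixel size.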

3/24/2019

The past two weeks I have been working on a crop function. I tried a lot of different approaches before remembering mouse event functions, including onclick functions. I was able to get coordinates by using mouse down and mouse up functions when the user selects their area by clicking and dragging. For now, I have the coordinates appearing in the text box so that I can see that it is working. I am currently working on shading so that the user can see the area they have selected. Next, I will turn everything outside of the selected area white, so that it is easier to process.
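One detail of click-and-drag selection is that the user may drag in any direction, so the mouse up point is not always below and to the right of the mouse down point. A minimal Python sketch of turning the two points into a rectangle follows; the function name and the (left, top, width, height) layout are assumptions for illustration, not the app's actual code.

```python
def selection_rect(down, up):
    """Build (left, top, width, height) from mouse down and mouse up points."""
    (x0, y0), (x1, y1) = down, up
    left = min(x0, x1)       # leftmost x, whichever point it came from
    top = min(y0, y1)        # topmost y, whichever point it came from
    width = abs(x1 - x0)
    height = abs(y1 - y0)
    return left, top, width, height


# Dragging up and to the left still gives a rectangle with positive size.
rect = selection_rect((120, 200), (40, 80))
```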

3/6/2019

I have been testing different preprocessing techniques. Below, I have tested blurring and turning the image black and white. So far with this image, the most accurate text recognition comes from plain gray scale. The top two images use blurring; the bottom two do not. The bottom right is black and white inverse. The blurring does not seem to help much, because the recognizer then seems unable to find the characters. The inverse works well, except that it does not read the word "trail". I will continue to test these processing techniques and figure out which works best.
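For reference, a gray scale conversion can be sketched in a few lines. This Python example uses the common luma weights (0.299, 0.587, 0.114) in integer form; whether the app uses these weights or a plain average is not stated here, so treat the choice as an assumption.

```python
def to_gray(pixel):
    """Convert one (r, g, b) pixel to a single gray level from 0 to 255."""
    r, g, b = pixel
    # Weighted sum: green contributes most, blue least. Integer math avoids rounding surprises.
    return (299 * r + 587 * g + 114 * b) // 1000


def to_gray_image(pixels):
    """Apply to_gray to every pixel in rows of (r, g, b) tuples."""
    return [[to_gray(p) for p in row] for row in pixels]


gray = to_gray_image([[(255, 255, 255), (0, 0, 0)]])
```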

3/4/2019

This past week I did not progress much in my project because I was preparing for the Madison visit weekend, and making up work and classwork early since I missed a substantial part of Thursday and all of Friday. Look for a progress update midweek after I have caught up!

2/24/2019

This past week I made a bit of progress. My app is now able to take nicely formatted images of words and recognize the text in them. I was able to accomplish this by using the open source OCR engine Tesseract. The screenshot below shows how it takes an image (left) and translates the text in that image into text. The text is returned in a text box so that it can be edited. Going forward, I plan to work on image pre-processing so that more complex images can be read for text. Dark text in straight lines on a white background works well, but text that is curved or sits on a noisier background is not read accurately. This is what I am going to work on next.

2/17/2019

The past week I have been working very hard to get a library installed in Visual Studio that will allow me to process images. I finally got it to work after struggling with out-of-date instructions. Unfortunately, it is pretty late and I don't have time to play with it, but I look forward to doing so in the next couple of days! Since I last posted, we had the mini poster sessions where we were given feedback on our ideas. John helped me come up with the idea to launch a C++ program from C# using the library I have now successfully installed. Thanks to Dr. Pankratz's input, I will be looking into edge detection, and at Joseph's suggestion I will be looking into TensorFlow. I am excited to really get moving on this project in the coming days!

2/10/2019

Today I began creating the app that will run my program. I have gotten it to the point where a picture can be uploaded and displayed. The next things I will be working on are finding a neural network algorithm that I can use in my program, and starting on preprocessing by having the user select where the text is in the image.

Below are screenshots of the app, before and after an image is uploaded.

2/4/2019

After some more research into my project, I have decided on a rough outline of my program. I will have an app or website which will run my program. Looking into OCR, the first thing that must be done is to take an image, probably given by the user. That image must then be pre-processed so that the characters stand out and the program can single out the characters, which is the next step. This is the most involved step, where the program must be able to differentiate characters from the background and from other characters. This can be done using several different techniques and requires much more research from me. It can be done using a neural network, by using templates of the characters, or by other methods. The final step is to output this text in an editable way. This is to account for any errors made by the OCR and so that text that wasn't editable before now is. To sum up, my tasks include:
1. Create an app or website to house the program
2. Allow the user to upload an image
3. Program will:
   i. Process image so that it is usable
   ii. Find characters from image
   iii. Recognize characters
   iv. Output the characters
4. The text in the image will be returned as editable

1/30/2019

The last two days I researched my project and found many open source applications that do what I am trying to do, with libraries that could come in handy. I have also found outlines of what this project needs and ideas for getting started. I think that getting started with the actual writing of this project is going to be challenging with the knowledge I have right now, so I am going to continue researching for now. As a starting point, I used the outline given on this website, which gives advice on how to start building an OCR program, to create the steps below.

1. Optical Scanning
2. Pre-Processing
3. Segmentation
4. Feature Extraction
5. Training and Recognition
6. Post-Processing