BUSINESS IMPACT

we're hiring

Need a career change? We're hiring top talent to join our team!

Oct 28, 2020

Handwriting OCR with Azure

Justin Buist Posted by Justin Buist

Computer vision is one area where AI has really shone in recent years. In this post we'll be taking a look at the handwriting recognition available in Read within Microsoft Azure's Cognitive Services.

I've found that in the work-from-home environment I make excessive use of my whiteboards. Yes, plural, whiteboards. I really enjoy them. What I don't enjoy is wishing I hadn't erased something after I have. It doesn't happen terribly often but because it has I end up cramming notes into every nook and cranny of it before I start erasing the oldest stuff.

Blog_Handwriting OCR with Azure

The simplest solution would be to snap a picture with my cell phone and call it good. Even better would be mounting a security camera pointed at my whiteboard so there's a nice consistent angle to the shots. If I'm going this far with it I might as well throw some AI based handwriting OCR at it so I can generate a searchable database, right? It's the natural thing to do.

Having recently had some excellent experience using text analytics within Cognitive Services I was eager to see what Read would give recognizing my whiteboard writing. I was expecting it be bamboozled by my sometimes-horrible handwriting or requiring some level of pre-processing to sharpen and clarify the input but the results from sending in the very first image were fantastic!

This blog post won't be a code-level review. For that I recommend Microsoft's Read's QuickStart guide.

(I've linked to the Python version of the QuickStart for a reason. While we will not go into the code behind this I will mention I used OpenCV for the input, image manipulation, and output elements. OpenCV is a handy tool to have in your back pocket for computer vision projects and Python is a popular language to work with it. I doubt you will regret giving the combination a shot.)

What you will get back from Read is a JSON document giving you the coordinates around each block of text and the text within. With that I can (crudely) draw a white box over the original text and draw the text back on in a variable font size that is roughly the right size.

Here you can see a before and after of what the original board looked like and a visual representation of what Read told us about it.

demo1-origdemo1-ai

There is also some additional metadata about confidence levels regarding the text predictions returned from Read but I was not concerned with that for this experiment.

At this point I have all the data I need for a simple logging of what I've had on my whiteboard any given day. Future enhancements will likely include logging it all into OneNote for me, or maybe a method to publish to a Teams channel to share with a group. Perhaps some better color matching to the original whiteboard background and marker color is in order too.

If you're interested in learning more we'd welcome a discussion with you and your team. Please contact us today!

New call-to-action
Justin Buist

About The Author

Justin Buist

Justin is a Senior Consultant at BlueGranite and has been building data driven solutions since 2000. He thoroughly enjoys working with the Microsoft Business Intelligence stack to deliver beneficial solutions to clients. His areas of specialty are SQL Server, SSIS, SSRS and custom integrations.

Latest Posts

New call-to-action