Sunday, 6 June 2010

Senses - How it's been coded...

Lots of things go into my Senses project, so here's how I have done some of them.

The Server:
This runs most of the processing algorithms, including facial detection/recognition, speech recognition and Optical Character Recognition (OCR). So let's break them down:

Speech Recognition - Uses Windows' built-in Speech API (SAPI) for both speech recognition and text-to-speech.

Optical Character Recognition - Uses the OCR engine built into Microsoft Office Document Imaging (MODI), which ships with Office 2007.

Facial Recognition - This was a tricky one to find, but in the end I am using a trial version of the .Net facial recognition framework created by Luxand. The images that get processed are pulled from the user's computer and from the user's friends' photos on Facebook, using the Facebook .Net API found on CodePlex.

The Windows Mobile Device:
This is just your run-of-the-mill Windows Mobile .Net Compact Framework app. It communicates with the Server and transmits data such as audio and images. The interface is the interesting part though: how do you design a UI for a blind person?
The UI - As the user cannot see the items on the screen, Text to Speech kicks in whenever they move their finger over an item and tells them what the option is. When they find the one they want, they just have to double tap ...simple!
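
To make the touch-and-speak idea a bit more concrete, here is a rough sketch of the logic in Python. This is not the code from the actual Windows Mobile app (which is a .Net Compact Framework app); the names MenuItem and TalkingMenu, the rectangle-based hit testing and the 0.5 second double-tap window are all my own assumptions for illustration. The speak callback stands in for whatever Text to Speech call the device makes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class MenuItem:
    """A hypothetical on-screen option with a rectangular touch area."""
    label: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class TalkingMenu:
    """Speaks an item's label when the finger enters it; double tap selects."""

    def __init__(self,
                 items: List[MenuItem],
                 speak: Callable[[str], None] = print,
                 double_tap_window: float = 0.5) -> None:
        self.items = list(items)
        self.speak = speak  # stand-in for the real Text to Speech call
        self.double_tap_window = double_tap_window  # seconds, assumed value
        self._focused: Optional[MenuItem] = None
        self._last_tap_item: Optional[MenuItem] = None
        self._last_tap_time: Optional[float] = None

    def _item_at(self, px: int, py: int) -> Optional[MenuItem]:
        for item in self.items:
            if item.contains(px, py):
                return item
        return None

    def on_touch_move(self, px: int, py: int) -> None:
        """Announce an item only when the finger moves onto a new one."""
        item = self._item_at(px, py)
        if item is not None and item != self._focused:
            self.speak(item.label)
        self._focused = item

    def on_tap(self, px: int, py: int, timestamp: float) -> Optional[MenuItem]:
        """Return the item if this tap completes a double tap, else None."""
        item = self._item_at(px, py)
        is_double = (
            item is not None
            and item == self._last_tap_item
            and self._last_tap_time is not None
            and timestamp - self._last_tap_time <= self.double_tap_window
        )
        if is_double:
            # Reset so a third quick tap does not count as another double tap.
            self._last_tap_item = None
            self._last_tap_time = None
            self.speak(f"{item.label} selected")
            return item
        self._last_tap_item = item
        self._last_tap_time = timestamp
        return None
```

Tracking the currently focused item means the label is spoken once when the finger arrives, rather than repeated on every tiny movement, and requiring both taps to land on the same item guards against accidental selections while the user is still exploring the screen.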

Windows Embedded & .Net Micro 'thingy':
The heart of the solution is a Windows Embedded device, an eBox 3300. This runs some of the image processing shizzle and performs the solution's audio functionality. It is connected to another device running the .Net Micro Framework; that second device has a screen which provides information, as the eBox itself won't be connected to a PC.
