Sunday 6 June 2010

Senses - How it's been coded...

There are lots of things that go into my Senses project, so here's how I've done some of them.

The Server:
This runs most of the processing algorithms, including facial detection/recognition, speech recognition and Optical Character Recognition (OCR). So let's break them down:

Speech Recognition - Uses Windows' built-in Speech API (SAPI) for speech recognition and text-to-speech.
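For anyone curious, SAPI is exposed to .NET through the System.Speech assembly, so both halves only take a few lines. The command phrases below are just examples for illustration, not the real Senses grammar:

    using System;
    using System.Speech.Recognition;
    using System.Speech.Synthesis;

    class SpeechDemo
    {
        static void Main()
        {
            // Text-to-speech: a SAPI voice via the managed wrapper
            SpeechSynthesizer synth = new SpeechSynthesizer();
            synth.Speak("Senses is listening.");

            // Speech recognition: a small command grammar keeps accuracy high
            SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine();
            Choices commands = new Choices("read text", "who is this", "what is this");
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder(commands)));
            recognizer.SetInputToDefaultAudioDevice();
            recognizer.SpeechRecognized += (s, e) => synth.Speak("You said " + e.Result.Text);
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.ReadLine(); // keep listening until Enter is pressed
        }
    }

Restricting the recogniser to a small set of commands rather than free dictation makes it far more reliable over a phone microphone.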

Optical Character Recognition - Uses the OCR engine built into Microsoft Office Document Imaging (MODI), which ships with Office 2007.
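Getting at it from .NET just means adding a COM reference to the Microsoft Office Document Imaging 12.0 Type Library. The snippet below is a minimal sketch of one OCR pass; note that MODI is happiest with TIFF input, so images may need converting first:

    using MODI; // COM interop for Microsoft Office Document Imaging

    class OcrDemo
    {
        static string RecogniseText(string imagePath)
        {
            MODI.Document doc = new MODI.Document();
            doc.Create(imagePath);                              // load the image file
            doc.OCR(MiLANGUAGES.miLANG_ENGLISH, false, false);  // run the OCR pass
            MODI.Image page = (MODI.Image)doc.Images[0];        // first (only) page
            string text = page.Layout.Text;                     // the recognised text
            doc.Close(false);                                   // close without saving
            return text;
        }
    }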

Facial Recognition - This was a tricky one to find, but in the end I am using a trial version of the .NET facial recognition framework created by Luxand. The images that get processed are extracted from the user's computer and from the user's Facebook friends, using the Facebook .NET API found on CodePlex.
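The flow is: pull down photos of the user's friends, turn each face into a template, then match live camera frames against those stored templates. Here's a rough sketch of that enrol-and-match loop; the FSDK calls are written from memory of the Luxand samples, so treat the exact names and the 0.8 threshold as approximations rather than the definitive API:

    using Luxand; // Luxand FaceSDK .NET wrapper - exact signatures may differ

    class FaceMatcher
    {
        // Build a face template from a photo (e.g. one pulled down from Facebook)
        static byte[] GetTemplate(string photoPath)
        {
            int image = 0;
            FSDK.LoadImageFromFile(ref image, photoPath);
            byte[] template;
            FSDK.GetFaceTemplate(image, out template); // finds the face and encodes it
            FSDK.FreeImage(image);
            return template;
        }

        // Compare a live camera frame's template against a stored friend's template
        static bool IsMatch(byte[] cameraTemplate, byte[] friendTemplate)
        {
            float similarity = 0;
            FSDK.MatchFaces(ref cameraTemplate, ref friendTemplate, ref similarity);
            return similarity > 0.8f; // illustrative threshold, not a tuned value
        }
    }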

The Windows Mobile Device:
This is just your run-of-the-mill Windows Mobile .NET Compact Framework app; it communicates with the server and transmits data such as audio and images (a rough sketch of the sending side is below).
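Here's a minimal sketch of how an image could be pushed up to the server over a plain TCP socket. The host, port and length-prefixed format are placeholders for illustration; the real app's protocol carries a few more message types:

    using System;
    using System.Net.Sockets;

    class SensesUplink
    {
        // serverHost and port stand in for wherever the Senses server is listening
        static void SendImage(string serverHost, int port, byte[] jpegBytes)
        {
            TcpClient client = new TcpClient(serverHost, port);
            try
            {
                NetworkStream stream = client.GetStream();

                // Length-prefix the payload so the server knows where it ends
                byte[] length = BitConverter.GetBytes(jpegBytes.Length);
                stream.Write(length, 0, length.Length);
                stream.Write(jpegBytes, 0, jpegBytes.Length);
            }
            finally
            {
                client.Close();
            }
        }
    }

The server side just reads the four-byte length, then that many bytes, and hands the image off to whichever recognition engine the user asked for.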
The UI - The interface is the interesting part, though: how do you design a UI for a blind person? As the user can't see the items on the screen, whenever they move their finger over one, text-to-speech kicks in and tells them the option; when they find the one they want, they just have to double tap... simple! A sketch of the idea follows.
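In code, that works out as hit-testing the finger position against large option bands and announcing whichever one the finger enters, with a quick second tap selecting it. The option names, band height and double-tap window below are illustrative, and Speak/Select are stand-ins for the real text-to-speech and feature launch:

    using System;
    using System.Drawing;
    using System.Windows.Forms;

    public class TalkingMenu : Form
    {
        string[] options = { "Read text", "Recognise face", "Identify object" };
        int current = -1;   // index of the option the finger is over
        int lastTapTime;    // for detecting a double tap

        protected override void OnPaint(PaintEventArgs e)
        {
            // Draw each option as a large 90-pixel band down the screen
            using (SolidBrush brush = new SolidBrush(Color.Black))
                for (int i = 0; i < options.Length; i++)
                    e.Graphics.DrawString(options[i], Font, brush, 10, i * 90 + 35);
        }

        protected override void OnMouseMove(MouseEventArgs e)
        {
            int hit = e.Y / 90; // which band is the finger in?
            if (hit != current && hit >= 0 && hit < options.Length)
            {
                current = hit;
                Speak(options[hit]); // announce the option under the finger
            }
        }

        protected override void OnMouseDown(MouseEventArgs e)
        {
            OnMouseMove(e); // a tap announces the option too
            int now = Environment.TickCount;
            if (current >= 0 && now - lastTapTime < 500)
                Select(options[current]); // second tap within half a second = choose
            lastTapTime = now;
        }

        void Speak(string text) { /* hand the text to text-to-speech */ }
        void Select(string option) { /* launch the chosen feature */ }
    }

Doing the hit-testing on the form itself, rather than using real buttons, means the announcements keep working as the finger slides across the whole screen.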
Windows Embedded & .NET Micro 'thingy':
The heart of the solution is a Windows Embedded device, an eBox 3300. This runs some of the image processing shizzle and performs the solution's audio functionality. It is connected to another device running the .NET Micro Framework, which has a screen that provides information, since the eBox itself won't be connected to a PC.
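As a rough sketch of what that little display board does, assuming a .NET Micro Framework device with an LCD and a serial link to the eBox (the COM port, baud rate and font resource are placeholders for the real setup):

    using System;
    using System.IO.Ports;
    using System.Text;
    using Microsoft.SPOT;
    using Microsoft.SPOT.Presentation;
    using Microsoft.SPOT.Presentation.Media;

    public class StatusDisplay
    {
        public static void Main()
        {
            // "COM1" and the baud rate stand in for the real wiring to the eBox
            SerialPort port = new SerialPort("COM1", 115200);
            port.Open();

            Bitmap screen = new Bitmap(SystemMetrics.ScreenWidth, SystemMetrics.ScreenHeight);
            // Resources.GetFont comes from the project's generated resource file
            Font font = Resources.GetFont(Resources.FontResources.small);

            byte[] buffer = new byte[128];
            while (true)
            {
                // Each message from the eBox is one short status line
                int read = port.Read(buffer, 0, buffer.Length);
                if (read == 0) continue;
                byte[] msg = new byte[read];
                Array.Copy(buffer, msg, read);
                string status = new string(Encoding.UTF8.GetChars(msg));

                screen.Clear();
                screen.DrawText(status, font, Color.White, 5, 5);
                screen.Flush(); // push the frame out to the LCD
            }
        }
    }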

Friday 4 June 2010

Imagine Cup 2010 - I'm through to the finals!

Well, it's been a busy couple of weeks, but yet more amazing news has come my way: my embedded entry for this year's Imagine Cup has been selected, and I'm off to Warsaw at the start of July to represent the UK.

So what's my big idea?

Well, after seeing a visually impaired person struggling in a supermarket, I thought technology could help, so I designed Senses.

It is an augmented reality system for blind and partially sighted people, incorporating visual, tactile and audio interfaces. Utilising the latest Windows Embedded, mobile and cloud technologies, this project aims to improve overall quality of life by providing a means to better perform day-to-day tasks, such as reading text, identifying objects and people, and avoiding obstacles when walking.

This would be achieved by effectively adding a second set of eyes: an external wide-focus web camera attached to the user, and a more precise camera within a pre-existing Windows Mobile device. These cameras would enable tasks such as object recognition, facial recognition and Optical Character Recognition (OCR). The augmented reality interaction between the user and the device would come via speech recognition through a wearable microphone, with responses delivered by text-to-speech functionality through a set of headphones.

If you are interested, check out my promotional video on YouTube.