People suffering from developmental disabilities like LIS, ALS, etc. are almost entirely paralysed, which prevents them from communicating in any way except through an AAC device. Estimates suggest that approximately 1.4% of the world's population suffers from such disorders - more than the entire population of Germany. The life expectancy of such people is estimated at 20 years below average, largely because of their inability to express themselves. Current AAC devices cost thousands of dollars and are slow, bulky and not generic. I decided to find a better solution - an AAC device which is affordable, faster, portable and generic.
Talk expects a person to be able to give two distinguishable exhales (by varying intensity/time), which are converted into electrical signals by a MEMS microphone. The signals are processed by a microprocessor and labelled as 'dots' for short exhales and 'dashes' for longer exhales. These are then interpreted as Morse code, converted into words/sentences and sent to another microprocessor for speech synthesis. Talk features two modes - one to communicate in English and the other to give specific commands/phrases - and 9 different voices.
Talk has made two major breakthroughs: increasing the speaking rate and becoming the world's most affordable AAC device. Testing the device with a person suffering from SEM and Parkinson's disease gave the predicted results. In the future I would like to add auto-prediction to my computing engine and integrate Talk with modern technology like Google Glass, to make the world a better place to live for people with developmental disabilities.
My name is Arsh Shah Dilbagi and I like being called Robo. I'm 16 and studying in 12th Grade at DAV Public School, Panipat, India. At heart I'm a roboticist and love 'Making Things', but every now and then I find time for my other hobbies - photography, cinematography, web designing, 3D modelling & rendering, and swimming. I've always been fascinated by the power of science and mathematics. In 2010 my parents gifted me a LEGO Mindstorms Kit, and since then I have never stopped making and learning. I conceived and designed a working prototype of an autonomous UGV, for which I was honoured by the President of India. I won IRO 2010 at the national level, and IRO 2011 and FLL 2011 at the regional level. I have also built a home automation system using Arduino, developed a WordPress plugin and designed a quizzing social network. I maintain my own website - http://robo.im.
Steve Jobs, an innovator and a maker, has always been my inspiration for combining technology with art and envisioning products like no other. I'm fascinated by Isaac Asimov for his amazing hard science fiction and by Stephen Hawking for his work in theoretical physics. In the future, I would like to pursue robotics. Winning GSF2014 will not only ensure a good education for me but will also strengthen my belief that I can change the world for the better. The cash prize will enable me to craft the final version of my project to really help people in need.
Question / Proposal
People with developmental disabilities like LIS, ALS, Tetraplegia, etc. are almost entirely paralysed and cannot communicate through normal speech without an AAC device. Current AAC devices either use a still-functional muscle (which varies from person to person) or track the movement of the eyes to obtain a signal. Statistics show that developmental disabilities are more prevalent in areas of poverty, yet available AAC devices cost thousands of dollars, putting them out of reach of those most in need. Such devices have other demerits: a slow speaking rate, since the user has to wait for the selector to reach the desired letter; and bulk, since they comprise a tablet computer with a digital display showing the selector moving over the matrix. They are also not portable and consume a lot of power. Some brain interface technologies do exist, but they are still in their infancy.
This forced me to ask myself: in today's world of technology, isn't there a better and more affordable solution? An AAC device that can be used by people suffering from any kind of speech disorder - a device which is generic, affordable, portable, consumes less power and has a faster speaking rate.
After quite some research, I hypothesised that a pressure sensor can be used to monitor variations in breath and generate two distinguishable signals. These signals can be further processed as a binary language and synthesised into speech accordingly.
The research was divided into 2 major parts - exploring the merits and demerits of current AAC devices, and finding ways by which people with developmental disabilities can interact with the device.
Exploring Merits and Demerits of Current AAC Devices:
There are different types of AAC devices, but my research was focused primarily on High-Tech Aided AAC devices (Speech-Generating Devices). Though they use different ways of interaction, almost all are based on a 'Selection-from-Matrix' interface.
Simple Single Access Scanning Devices:
Such devices comprise a push button, a switch interface and a tablet computer. Some use letters in the matrix and others use symbols to synthesise the expression of the user. The user has to wait for the selector to reach the right element and then push the button to select it. This makes the system very cumbersome to use, and with all those peripherals it becomes not only bulky but also non-portable, costing well above $5,000. The other major disadvantage of these systems is high power consumption, since they have to support a tablet computer.
Head Mouse and Eye Trackers:
This technology tracks either the movement of the head or that of the eyes, depending on the person, and uses it to move the selector over the matrix. Though this is faster than single access scanning, it is considerably uncomfortable to use: the user has to focus on the screen constantly, occupying the only remaining medium of interaction (the eyes). The system comprises a tracking unit and a tablet computer; the tracking unit alone costs around $2,000, putting it out of reach of those most in need.
Brain Computer Interface (BCI):
A BCI system intends to detect electrical signals directly from the brain to interpret what the user is trying to express. Though the idea is very novel, it is still in its infancy and nowhere near ready for practical use. The sensor used for detecting the signals requires a large number of electrodes, making the system bulky and uncomfortable to wear, and the processing of the signals demands expensive high-end processors.
As Stephen Hawking writes of his own experience: "I have also experimented with Brain Controlled Interfaces to communicate with my computer however as yet these don't work as consistently as my cheek operated switch."
Finding ways of interaction between the person and system:
Having concluded from the above research that none of the current devices meets the specifications of my proposed device, I moved to the next step. The research showed that the brain, the eyes and at least one muscle remain functional in most cases. On further investigation I found that the tongue is also functional, and most people can control their breath through the nose or mouth unless they breathe artificially. At the end of this research, I was able to compile a list of ways through which such people can interact with a device.
Method / Testing and Redesign
After around 3 months of research, it took me another 7 months to arrive at the finished product, which can be used by anyone suffering from any kind of speech impairment.
1.0 Right way of Interaction and Monitoring:
I considered all the possible ways and decided to use breath - controllable by the highest fraction of people - as the medium of interaction, a breakthrough as no one had done this before. Though monitoring from inside the nose would have been easier, I preferred an external microphone for the comfort of the user.
2.0 Building the First Prototype:
2.1 Sensor:
I tested an electret microphone in quiet and natural surroundings by placing it under my nose, but it was too sensitive and could not distinguish between surrounding noise and forced exhales. I then tested a MEMS microphone, which gave a positive response and was sensitive to breath only. It also filtered out the surface vibrations that might be transmitted through the ear clip.
2.2 Interface and Language:
I had two options for the interface - the traditional matrix display (digital display/LED matrix) or the sensor with no display at all. To make the device compact, I chose the sensor with no display. I decided to use a binary language for users to dictate letters, and selected International Morse Code because it enables dictation of words with the fewest possible signals.
In the first prototype, analog signals were sent from the microphone to a laptop. These signals were processed by an algorithm which trimmed the noise and looked for patterns to distinguish between short and long exhales. The data was then converted into letters by a computing engine and synthesised into speech. The algorithm and the Morse computing engine were solely designed and programmed by me.
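The original algorithm and Morse computing engine are not reproduced here, but the pipeline described above can be sketched in Python. Everything in this sketch - the noise threshold, the dot/dash cutoff and the sample stream - is illustrative, not the values used in the actual device:

```python
# Hypothetical sketch of the exhale-classification and Morse-decoding stages.
# NOISE_FLOOR and DOT_MAX_SAMPLES are illustrative values, not those tuned
# for the real prototype.

MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}

NOISE_FLOOR = 520      # ADC reading above which a sample counts as an exhale
DOT_MAX_SAMPLES = 40   # runs up to this length count as a dot, longer as a dash

def classify_exhales(samples):
    """Turn a stream of ADC readings into a dot/dash string."""
    symbols, run = [], 0
    for s in samples:
        if s > NOISE_FLOOR:
            run += 1
        elif run:  # an exhale just ended: classify it by its length
            symbols.append("." if run <= DOT_MAX_SAMPLES else "-")
            run = 0
    if run:
        symbols.append("." if run <= DOT_MAX_SAMPLES else "-")
    return "".join(symbols)

def decode_letter(code):
    """Look up one dot/dash sequence in the Morse table."""
    return MORSE.get(code, "?")

# Example: one short exhale followed by one long exhale -> ".-" -> 'A'
stream = [500] * 5 + [600] * 20 + [500] * 10 + [600] * 80 + [500] * 5
print(decode_letter(classify_exhales(stream)))  # prints A
```

A fixed threshold and run-length cutoff like this is the simplest possible classifier; the actual engine also had to trim ambient noise, which would add smoothing before the run-length step.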
3.0 Testing the First Prototype:
It was tested by myself, friends and family. The microphone was placed under the nose and the exhales were processed and synthesised into speech. Testing was done in both quiet and noisy environments, and the results were found to be consistent.
4.0 Building the Final Design:
In the final design, processing and synthesis are carried out by a custom-made circuit board, eliminating the need for a bulky computer. It has 2 major parts - a wearable sensor and a processing unit. Talk features 9 voices (male/female) for different age groups and 2 modes (communication/commands-phrases). I have also added encodings for daily/routine phrases, which help the user express themselves faster.
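The commands/phrases mode can be pictured as a second lookup table consulted instead of the letter-by-letter engine, so a whole routine phrase costs only a couple of exhales. The codes and phrases below are invented examples for illustration, not the encodings actually used in Talk:

```python
# Hypothetical command/phrase mode: a short dot/dash code selects a whole
# phrase instead of a single letter. Codes and phrases here are invented
# examples only.

PHRASES = {
    ".":  "Yes",
    "-":  "No",
    "..": "I am hungry",
    "--": "I need help",
    ".-": "Thank you",
}

def speak_phrase(code):
    """Look up a dot/dash code in phrase mode; flag unknown codes."""
    return PHRASES.get(code, "(unrecognised code)")

print(speak_phrase(".."))  # prints I am hungry
```

Because the table is consulted whole-code-at-once, the shortest codes can be reserved for the phrases a user needs most often.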
5.0 Testing the Final Design:
After testing the final design with myself, friends and family, I was able to arrange a meeting with the Head of Neurology at Sir Ganga Ram Hospital, New Delhi, and tested Talk (under the supervision of a doctor and in a controlled environment) with a person suffering from SEM and Parkinson's disease. The person was able to give two distinguishable signals using his breath, and the device worked perfectly.
A handful of experiments were conducted, and the data was recorded and analysed in order to build the proposed device. All the experiments were conducted on a healthy person. Some of the important experiments, with the procedure followed and the data recorded, are listed below -
Data from the Sensor:
The data from the sensors was carefully recorded and analysed to deduce patterns/trends for the algorithm to distinguish among ambient noise, short exhales and long exhales.
The sensor (from the first prototype) was placed under the nose of a healthy person. The person was then asked to give short and long exhales with a gap of 3 seconds between consecutive exhales. The data was sent to the computer over the USB port using an Arduino and logged in a text file as raw numbers. Only the data registered during short and long exhales was plotted.
Observations:
Though the data was noisy, it reflected a definite pattern: the lengths of short exhales and long exhales always fell within distinct ranges.
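The range check described in the observations can be reproduced on a logged recording. A sketch, assuming the readings were logged as one integer per sample and using an illustrative threshold and made-up data:

```python
# Hypothetical analysis of a logged recording: group above-threshold readings
# into runs and report the range of run lengths for short vs long exhales.
# THRESHOLD and the sample log are illustrative, not measured values.

THRESHOLD = 520

def exhale_runs(readings):
    """Return the lengths of consecutive above-threshold runs."""
    runs, current = [], 0
    for r in readings:
        if r > THRESHOLD:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

# Illustrative log: three short exhales followed by two long ones
log = ([500] * 3 + [600] * 18 + [500] * 5 + [610] * 22 + [500] * 5
       + [605] * 20 + [500] * 5 + [620] * 75 + [500] * 5 + [615] * 82
       + [500] * 3)
runs = exhale_runs(log)
short = [n for n in runs if n < 50]
long_ = [n for n in runs if n >= 50]
print("short-exhale run lengths:", min(short), "-", max(short))
print("long-exhale run lengths:", min(long_), "-", max(long_))
```

If the two printed ranges never overlap (as the observations found), a single cutoff between them is enough to tell dots from dashes.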
Duration of Letters:
Since speaking rate was a critical factor for the device's success, careful experiments were carried out to estimate the duration of each letter.
Data from the sensor was recorded for both short and long exhales, and the average duration of each was calculated from the number of values recorded. Continuous exhales were chosen for the experiment, as this gave a better estimate of the duration of each letter.
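Converting sample counts into durations is a matter of dividing by the sampling rate and summing the Morse elements of each letter. A sketch with assumed numbers - the sampling rate and average counts here are illustrative, not the project's measured values, and inter-element gaps are ignored:

```python
# Hypothetical duration estimate: convert recorded sample counts into seconds
# and sum the Morse elements of each letter. SAMPLE_RATE and the average
# counts are assumptions for illustration.

SAMPLE_RATE = 100.0        # samples per second (assumed)
avg_dot_samples = 40       # average samples in a short exhale (assumed)
avg_dash_samples = 80      # average samples in a long exhale (assumed)

dot_s = avg_dot_samples / SAMPLE_RATE    # duration of one dot in seconds
dash_s = avg_dash_samples / SAMPLE_RATE  # duration of one dash in seconds

def letter_duration(code):
    """Total exhale time for one Morse letter (inter-element gaps ignored)."""
    return sum(dot_s if c == "." else dash_s for c in code)

print("dot:", dot_s, "s  dash:", dash_s, "s")
print("E (.):", letter_duration("."), "s")
print("S (...):", letter_duration("..."), "s")
```

With real measured averages in place of the assumed counts, the same sum gives the per-letter estimates reported in the conclusion.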
Accuracy of Input:
After successfully building the proposed device, it was crucial to test its accuracy, because it would serve as the only medium of expression for the user.
A random sentence of 50 letters was selected and dictated using Talk. The experiment was conducted twice - once in a controlled environment (with almost no noise) and then in natural surroundings - to ensure fairness of results.
Observations:
The results showed an average accuracy of 99% - 98% in the controlled environment and 100% in natural surroundings. It was later concluded that the errors were on the part of the user and not the device.
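The per-run accuracy figure is a straightforward letter-by-letter comparison between the target sentence and what was dictated. A sketch with an invented 50-letter example (one wrong letter gives the 98% of the controlled-environment run):

```python
# Hypothetical accuracy calculation for a 50-letter dictation test: compare
# the dictated output against the target sentence position by position.
# The sentences below are invented examples.

def dictation_accuracy(target, dictated):
    """Fraction of positions where the dictated letter matches the target."""
    assert len(target) == len(dictated)
    hits = sum(t == d for t, d in zip(target, dictated))
    return hits / len(target)

target   = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGPACKMYBOXWITHSI"  # 50 letters
dictated = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOGPACKMYBOXWITHSJ"  # one error

print(round(dictation_accuracy(target, dictated) * 100), "%")  # prints 98 %
```

Averaging this fraction over the two runs (98% and 100%) gives the reported 99%.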
Total cost of the Final Product:
After successfully building the proposed device with nearly 100% accuracy and a considerably faster speaking rate, it was equally important to keep it within the price limit. The cost data confirm that the device can be made available for less than $100, even though the components were sourced individually.
Conclusion / Report
Talk has definitely exceeded expectations and has turned out far better than the device originally proposed. It is light and portable - just 10x6x2 cm - and weighs no more than a regular smartphone. On a single charge it runs for more than 2 days, and it is quite comfortable to wear. With Talk it takes only 0.8 sec to dictate the letter 'A' and 0.4 sec for the letter 'E', making it the fastest AAC device in the world. Talk can be made available for less than $100, making it the most affordable. With an accuracy of almost 100% and innovative features - 9 voices (male/female) for different age groups, 2 modes (communication/command-phrases) and phrase encoding - nothing competes with the device's feature set. Talk uses breath as the medium of interaction between the user and the device, which also makes it unique, as this has never been done before.
Impact on the World:
Talk has re-invented AAC systems. Current AAC systems have confused their main purpose - enabling the disabled to interact with the world - with other things, like controlling a computer. This makes the devices not only bulky and slow but also very expensive. Talk fulfils the primary function of interaction and expression for the user (the very factors which make us human) and does it quite efficiently. Being so affordable, it is within reach of the highest fraction of people, and its ease of use allows users to catch up in no time. Being so light and portable, Talk can be used by people with any kind of speech impairment, such as Dysarthria, instead of pictures and cards as a means of expression. In a nutshell, Talk has the potential to change the world by enabling people with disorders like LIS and ALS, speech impairments like Dysarthria, and even people who are mute to communicate and interact with the world like never before. Talk is the beginning of a whole new world for people with developmental disabilities.
In the future, I envision the device becoming more intelligent and even more accessible. I am currently working on a single, more compact circuit board, which will further increase the efficiency of the device and make it lighter and more portable. I also wish to add auto-prediction to my computing engine to automatically complete sentences, thereby increasing the speaking rate, and machine learning so that it learns the difference between short and long exhales for each individual user. The wearable sensor has a Micro USB out to integrate it with an Android smartphone or Google Glass, on which I will definitely work in the near future. This will make it not only more accessible but also easier to use. Most important of all, I wish for Talk to reach all those in need and make this world a better place to live.
IT'S MY TURN TO CHANGE THE WORLD
Bibliography, References and Acknowledgements
- Developmental Disability: http://en.wikipedia.org/wiki/Developmental_disability
- Augmentative and Alternative Communication: http://en.wikipedia.org/wiki/Augmentative_and_alternative_communication
- Speech and Language Impairment: http://en.wikipedia.org/wiki/Speech_and_language_impairment
- Paralysis: http://en.wikipedia.org/wiki/Paralysis
- Locked-In Syndrome: http://en.wikipedia.org/wiki/Locked-in_syndrome
- Amyotrophic Lateral Sclerosis: http://en.wikipedia.org/wiki/Amyotrophic_lateral_sclerosis
- Tetraplegia: http://en.wikipedia.org/wiki/Tetraplegia
- Parkinson's Disease: http://en.wikipedia.org/wiki/Parkinson's_disease
- Dysarthria: http://en.wikipedia.org/wiki/Dysarthria
- Speech Generating Devices: http://en.wikipedia.org/wiki/Speech-generating_device
- Brain Computer Interface: http://en.wikipedia.org/wiki/Brain%E2%80%93computer_interface
- Eye Tracking: http://en.wikipedia.org/wiki/Eye_tracking
- Types of Microphone: http://en.wikipedia.org/wiki/Microphone#Varieties
- International Morse Code: http://en.wikipedia.org/wiki/Morse_code
- The Computer - Stephen Hawking: http://www.hawking.org.uk/the-computer.html
- Tutorial Series for CadSoft Eagle: https://www.youtube.com/playlist?list=PL868B73617C6F6FAD
- Getting Started with CadSoft Eagle: https://www.youtube.com/watch?v=R4DYztYB6d4
- Eagle - Board Layout by Sparkfun: https://learn.sparkfun.com/tutorials/using-eagle-board-layout
- Eagle - Schematic by Sparkfun: https://learn.sparkfun.com/tutorials/using-eagle-schematic
- Autodesk 123D Tutorial: https://www.youtube.com/watch?v=03Ju_LJlU3U
- Electronic Components: https://www.mgsuperlabs.co.in/ | http://www.rhydolabz.com/
- Simple Pre-Amp Circuit: http://www.sentex.ca/~mec1995/circ/amp1.htm
- MEMS Microphone Datasheet: https://www.sparkfun.com/datasheets/Components/General/ADMP401.pdf
- EMIC2 Schematics: http://www.grandideastudio.com/wp-content/uploads/emic2_schematic.pdf
- MEMS Microphone Breakout Board Schematics: https://www.sparkfun.com/datasheets/BreakoutBoards/ADMP401-Breakout-v13.pdf
- My Teachers for being supportive and making sure I remain focused on the project.
- Prof. Naresh Bhatnagar, D.S. Legha and Masood Khan for allowing me to use 3D Printer at IIT Delhi.
- Dr. C.S. Aggarwal - Head of Neurology at Sir Ganga Ram Hospital, New Delhi for allowing me to use Talk with a patient (with consent of his guardians), under supervision of doctor and in controlled environment.
- My friend Akshay Gulati who has been an eager volunteer during the project.
- My Parents for always being supportive and providing me with all the resources needed.