An Artificial Intelligence based novel mobile solution for early detection of valvular heart disorders


Heart diseases are the number one cause of death globally, with three-quarters of these deaths occurring in developing countries (WHO). I have personally experienced the loss of my paternal grandfather to a heart disorder. Rural mortality rates have surpassed those of urban areas, as 75% of rural primary care is handled by unqualified practitioners owing to an acute shortage of doctors. My goal was to develop a portable, low-cost system that can be used by untrained frontline health workers for early detection of valvular heart abnormalities such as murmurs, which claim millions of lives every year. Using Artificial Intelligence, Cloud and Mobile, I simulated the stethoscope-based cardiac auscultation done by trained doctors for early diagnosis. I developed a novel automated technique to detect abnormal heartbeat patterns by applying audio pre-processing algorithms to stethoscope sounds, converting them into high-quality spectrograms and classifying these with state-of-the-art Cloud-based Neural Networks trained with normal and abnormal heart sounds. The system was tested with heart sound datasets from UMichigan, Pascal and PhysioNet, and 95% average classification accuracy was achieved. I arrived at optimized spectrogram parameters through experimentation for higher accuracy. I also designed a digital stethoscope at ~20% of the cost of professional digital stethoscopes to capture heart sounds via a mobile app for real-time analysis by the Cloud platform. The infrastructure can easily be applied to other leading diseases such as respiratory disorders. The end-to-end system was successfully cross-validated with heart sounds collected in real life, demonstrating that it can be deployed as a holistic diagnostic tool to improve rural healthcare in India and beyond.

Question / Proposal

Heart diseases have emerged as the single most important cause of death worldwide. Contrary to my belief that the rural lifestyle is healthier, I was shocked to find out from my grandfather that cardiac mortality rates have climbed higher in rural areas. This is attributed to a shortage of accessible medical care. A 2017 study of the Medical Council of India's historical data shows that there were only 4.8 practicing doctors per 10,000 of population. Hence, the majority of the rural population turns to informal healthcare workers, who provide ~75 percent of primary care but have no formal medical training, which worsens the cardiac risks. Some of the most common forms of heart disorders are valvular, resulting in murmurs. My research focuses on whether a low-cost portable tool can be developed that unskilled health workers can use to detect early signs of valvular heart disorders. Cardiac auscultation, i.e. listening to sounds from the heart, is one of the primary diagnostic techniques used by trained doctors for examining the heart for abnormalities such as murmurs. With recent advances in Artificial Intelligence, I hypothesized that we can simulate the cardiac auscultation expertise of a trained professional and bridge the lack of expertise at the grassroots level. I envisioned real-time diagnosis with automated cardiac auscultation using the latest AI algorithms, supplemented with pre-processing of heart sounds to ensure the quality of training and test data. The proposed tool will provide an initial diagnosis of murmurs, which can be used to refer patients for more advanced tests like ECG.


Auscultation of heart sound recordings has been shown to be valuable for the detection of disease and pathologies. Automated classification of pathology in heart sounds has been studied using methods which can be grouped into artificial neural network-based, support vector machine-based, hidden Markov model-based and clustering-based approaches. However, until recently accurate classification remained a challenge due to the lack of a high-quality, validated and standardized open database of heart sound recordings and the limited capabilities of Machine Learning engines to extract the abnormality features. Recent attempts have applied Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) to achieve better accuracy.

In the application of AI in processing of audio signals, one of the decision points in the algorithm design is how to represent the audio signals before they can be processed by neural networks. Recent research suggests that audio signals translated as spectral images are good candidates for representation before being analysed by a CNN. Analysis of heartbeat requires a time series analysis of the frequency components of the audio signal. Spectrograms are 2D or 3D images representing sequences of spectra with time along one axis, frequency along the other, and brightness or color representing the strength of a frequency component at each time frame. So, spectrograms can be considered a very detailed image of the audio signal shown on a graph according to time and frequency with brightness or height (3D) representing amplitude.

Spectrograms can be created either using a Fourier Transform of the time-based signal or approximated with a series of band-pass filterbanks. Given our digitally sampled audio data, we use the Fast Fourier Transform (FFT) method. Our spectral image consists of overlapping slices ("windowing"), with each slice representing the frequency components and their strength at that time. This method is called the Short-Time Fourier Transform (STFT). The size and shape of the windowing slices can be varied, providing us with tunable parameters for our spectrogram image. The trade-off parameters are window length, window type, FFT length and hop size. Use of a shorter time window results in better timing precision at the expense of frequency precision, and vice versa. The window type (Rectangular, Gaussian, Hamming, Hanning, Kaiser etc.) controls side-lobe suppression, and the FFT length determines the amount of spectral oversampling. We use the Hanning window and experiment with the other trade-off parameters in our model for optimal classification results.
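As a minimal sketch of the STFT described above (not the project's actual converter), the following Python uses NumPy to cut a signal into overlapping Hanning-windowed frames and take the FFT magnitude of each frame. The function name `stft_magnitude`, the 4 kHz sample rate and the synthetic 125 Hz test tone are assumptions chosen purely for illustration.

```python
import numpy as np

def stft_magnitude(signal, window_length=256, hop_size=128):
    """Return a (frames x frequency bins) magnitude spectrogram."""
    window = np.hanning(window_length)
    n_frames = 1 + (len(signal) - window_length) // hop_size
    # Each row is one windowed time slice; consecutive slices overlap
    # by (window_length - hop_size) samples.
    frames = np.stack([
        signal[i * hop_size : i * hop_size + window_length] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1))

# Synthetic test tone (assumed values, not heart sound data).
sample_rate = 4000                           # Hz
t = np.arange(sample_rate) / sample_rate     # one second of samples
tone = np.sin(2 * np.pi * 125 * t)

spec = stft_magnitude(tone, window_length=256, hop_size=128)
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)
peak_hz = freqs[spec.mean(axis=0).argmax()]  # strongest frequency bin
```

In this sketch the frequency step is `sample_rate / window_length` (15.625 Hz here), so halving the window length doubles the frequency step while shortening each time slice, which is the timing versus frequency trade-off described above.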

Once the audio signal is translated into an image, CNNs are a natural candidate to identify patterns within the image and classify the spectrogram. Given the natural rhythm and repetitive pattern of the cardiac cycle and a persistent signature of the abnormality (murmur and arrhythmia) within a few systolic beats, we can take time slices of the audio signal to generate the spectrogram. We do not need other time-sequence-based Neural Networks such as RNNs, since the temporal behavior is repeated within the window of observation and different sequential patterns do not need to be learnt.

Method / Testing and Redesign

Our system uses the following algorithm for diagnosis:

  • Record heart sound of patient using a new low-cost digital stethoscope which communicates with a mobile app via the phone’s microphone jack.
  • Upload recorded heart sound from the app into a backend Cloud-based system.
  • Segment the heart sound into slices with a fixed length. Discard the first and last clips as first is typically noisy and last is of a shorter length.
  • Convert the audio slices into time-based spectrograms (a visual representation of the spectrum of sound frequencies as a function of time).
  • Classify each spectrogram into normal or abnormal categories using deep CNN.
  • Calculate mean of all classification scores and standard deviation to remove any anomaly.
  • For the audio signals that have high classification score, store them in a CNN Training database that serves as a growing database for retraining the CNN model for improved accuracy.
  • Return overall classification to mobile app.
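The segmentation step in the list above can be sketched as follows. This is an illustrative outline only: the function name `segment_clips`, the 1 kHz sample rate and the placeholder samples are assumptions, and the 8 second clip length is borrowed from the experiments described later in this report.

```python
def segment_clips(samples, sample_rate, clip_seconds):
    """Split samples into fixed-length clips, dropping the first and last."""
    clip_len = int(sample_rate * clip_seconds)
    clips = [samples[i:i + clip_len] for i in range(0, len(samples), clip_len)]
    # The first clip is typically noisy and the last is usually shorter,
    # so both are discarded before spectrogram conversion.
    return clips[1:-1]

# 35 seconds of placeholder samples at an assumed 1 kHz sample rate.
recording = list(range(35_000))
clips = segment_clips(recording, sample_rate=1000, clip_seconds=8)
```

With these assumed values the recording yields five raw clips, of which the three middle ones (each 8 seconds long) are kept for classification.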

Below are the components of overall system architecture:

  1. Mobile App - Contains a simplified frontend for the health worker to capture heart sound by interfacing with the low-cost stethoscope through the microphone jack. Saves the heart sound as a digital audio signal (.mp3 or .wav), uploads it to the Cloud platform for analysis and returns the diagnosis to the user whether the heart sound is normal or abnormal. Developed using Android SDK.
  2. Audio-preprocessor – Cloud-based Python program that slices the heart sound captured with the stethoscope into fixed-length segments. The first and last slices are dropped from analysis. Cloud infrastructure implemented with AWS.
  3. Spectrogram converter - Cloud-based Python program that converts the audio segments into spectrograms (frequency on the y-axis and linear time on the x-axis) using helper libraries such as SciPy and Matplotlib. Performs an STFT of the audio signal and plots the spectrogram. We experiment with properties such as the Hanning window “overlap factor” and frequency “bin size” to get distinguishable spectrogram features for best results in CNN classification.
  4. CNN Trainer - Cloud-based program that trains a CNN model (Inception v3) to classify spectrograms. CNN model is trained with normal and abnormal heart sounds from reliable open source repositories.
  5. CNN Classifier - Cloud-based program that categorizes spectrograms as normal or abnormal using the trained model from the CNN Trainer. CNN Trainer and the CNN Classifier are implemented using Google’s TensorFlow, an open source library for Machine Learning.
  6. Post-processor - Computes mean of all classification scores as overall result and also the standard deviation for anomalies.
  7. Digital Stethoscope - Built using an analog stethoscope head, an 8 mm lapel microphone (omni-directional electret condenser with a signal-to-noise ratio of 74 dB, sensitivity of -30 dB and frequency range of 65 Hz-18 kHz), and a foam insulator with a vinyl tube to provide a sound-insulated connection. To help record the low-amplitude, lower-frequency-band heart sounds, we maximized the sensitivity, used a low-pass filter, and recorded with 32-bit floating-point samples at a rate of 44 kHz. Audacity was used to tune the audio fidelity and Sonic Visualiser for an experimental view of spectrograms on a laptop.
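The post-processing step (component 6 above) could look roughly like the sketch below. The report does not specify how the standard deviation is used to handle anomalies, so the rule here (drop per-clip scores more than 1.5 standard deviations from the mean, then average the rest) is an assumption, as are the function name `aggregate_scores`, the 0.5 decision threshold and the example scores.

```python
from statistics import mean, pstdev

def aggregate_scores(scores, max_deviations=1.5, threshold=0.5):
    """Average per-clip abnormality scores after dropping outliers."""
    mu, sigma = mean(scores), pstdev(scores)
    # Assumed anomaly rule: keep scores within max_deviations * sigma
    # of the mean (keep everything if all scores are identical).
    kept = [s for s in scores
            if sigma == 0 or abs(s - mu) <= max_deviations * sigma]
    overall = mean(kept)
    label = "abnormal" if overall >= threshold else "normal"
    return overall, label

# Hypothetical per-clip scores; the fourth clip is an outlier.
scores = [0.91, 0.88, 0.93, 0.15, 0.90]
overall, label = aggregate_scores(scores)
```

Here the outlying 0.15 score is discarded, so a single noisy clip does not pull down the overall result for the recording.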



Multiple levels of experimentation with heart sounds from different sources were conducted to demonstrate feasibility and robustness of the proposal. Open heart datasets from University of Michigan, PhysioNet and PASCAL were used to determine accuracy of classification of heart sounds. These public datasets have aggregated normal and abnormal heart sounds such as murmur and arrhythmia with the goal to encourage development of more robust automated heart sound classification techniques.

Heart sounds from these datasets were used for training and testing the CNN model and to establish optimal parameters for the intermediate spectrograms. Once the feasibility of the proposal was established with the open source datasets, a low-cost digital stethoscope was designed and developed using off-the-shelf materials widely available in the market. This was used to collect heart sounds in real life, which were classified using the CNN model trained with the open source datasets.

Experiment 1 (Michigan)

  • In the Michigan dataset, the abnormal category covered a range of murmurs such as early systolic murmur, late systolic murmur, holosystolic murmur, early diastolic murmur etc. Each audio file was split into 8 second slices, which were fed into the spectrogram generator configured with the following parameters - overlap factor = 0.9, frequency binsize = 2**12, scale factor = 2.
  • The CNN Trainer, configured with 4000 training steps (transfer learning), was run with spectrograms from the normal and abnormal categories. The trained model was used to classify the test spectrograms. The accuracy of classification is captured as a probability score that represents the confidence in the output of classification. Results show that the classifier recognized the normal and abnormal heart sounds correctly based on the spectrogram training.
  • We conducted a variation of the experiment where spectrogram parameters were changed to arrive at more visibly distinguishable features between normal and abnormal categories. Reducing the spectrogram overlap factor from 0.9 to 0.5, reducing the frequency binsize from 2**12 to 2**10 and increasing the scale from 2 to 20, led to favorable results. By changing the time vs frequency granularity, we found the parameters that work optimally with the CNN to be able to better discern between the normal and abnormal categories. Training and testing were rerun with the set of new spectrograms generated with the new parameters. The result demonstrates a higher classification accuracy with sharper spectrograms that have better distinguishable features between normal and abnormal categories. The smoother spectrogram with optimal parameters in Expt 1.6 results in higher accuracy of 97%, compared to 75.2% in Expt 1.1 for the same heart sound.
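To make the effect of the two parameter sets above more concrete, the short calculation below converts overlap factor and bin size into a hop size, a time step and a frequency step. It is a sketch under stated assumptions: the 44.1 kHz sample rate is assumed (the sample rate of the dataset files is not given here), the overlap factor is interpreted as the fraction of each window shared with the next one, and `stft_resolution` is a hypothetical helper name.

```python
def stft_resolution(sample_rate, bin_size, overlap):
    """Return hop size and time/frequency step for one STFT setting."""
    # Consecutive windows share int(overlap * bin_size) samples.
    hop = bin_size - int(overlap * bin_size)
    return {
        "hop_samples": hop,
        "time_step_ms": 1000 * hop / sample_rate,
        "freq_step_hz": sample_rate / bin_size,
    }

SAMPLE_RATE = 44_100  # Hz, assumed for illustration

initial = stft_resolution(SAMPLE_RATE, 2**12, 0.9)  # first configuration
tuned = stft_resolution(SAMPLE_RATE, 2**10, 0.5)    # revised configuration
```

Under these assumptions the first configuration advances 410 samples per slice (about 9.3 ms) with a frequency step of about 10.8 Hz, while the revised one advances 512 samples (about 11.6 ms) with a frequency step of about 43.1 Hz.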

Experiments 2 (PhysioNet) & 3 (Pascal)

  • The experiments were rerun with larger datasets from PhysioNet and Pascal. They resulted in correct classification for over 95% of cases. A few anomalies were observed where the underlying audio clips had significant levels of noise.

Experiment 4 (Self-collected with constructed digital stethoscope)

  • The digital stethoscope developed as part of the project was used to record heart signals in real life which were correctly classified using the CNN model trained with open source datasets.




I’ve successfully created a tool using the latest technologies such as Cloud, Mobile and Artificial Intelligence that can be used by untrained healthcare workers to provide early screening for valvular heart conditions such as murmurs, which are primarily detected by trained doctors using cardiac auscultation. The results of my experiments using the best known open-source heart datasets from the University of Michigan, PhysioNet and Pascal demonstrate the feasibility and accuracy of the proposal for automated diagnosis of heart sounds. In addition, I have successfully designed and developed a low-cost digital stethoscope at 15-20% of the cost of entry-level digital stethoscopes available today. It was successfully validated by collecting heart sounds in real life and analyzing them using the CNN models trained with the open-source datasets. I plan to continue building and verifying real-life datasets through trials in clinical and home environments.

In summary, the end-to-end solution with the low-cost digital stethoscope, mobile app and Cloud-based platform for audio processing, Neural Network training and classification provides an infrastructure that can be applied to other common diseases, such as respiratory disorders, that rely on auscultation as a primary diagnostic technique. My solution has innovated on prior art in multiple dimensions, such as the design of a novel low-cost stethoscope and an innovative technique for converting heart sounds into high-quality spectrograms with distinguishable features so that trained Neural Networks can recognize heart sound patterns with high accuracy. The system also allows continuous expansion of the heart sound dataset as adoption grows, improving accuracy by retraining the network with the expanded dataset. The end-to-end solution can be an effective screening and diagnostic tool in the hands of untrained medical frontline workers (or ‘quacks’), who provide the majority of primary care in rural areas, and can be an important step towards improving the dire state of rural healthcare in India and beyond. As an extension to the project, I plan to come up with effective noise cancellation techniques to further improve the accuracy of the classification. I will also supplement the automated cardiac auscultation in the project with the Framingham risk score for an overall heart health profile. The system can ask users to fill in a questionnaire and calculate the Framingham risk score based on the 'Hard' Coronary Framingham outcomes model, which uses the following predictors - age, total cholesterol, HDL, systolic blood pressure, treatment for hypertension, smoking status etc. This score, along with the one derived from automated heart sound classification, gives a very good indication of the health of the heart.
The new method of automated sound classification can also have applications in other fields beyond health such as detecting illegal deforestation (through chainsaw sounds), gunshots or any other unusual sounds that are distinguishable by training with appropriate sound datasets.

About me

My name is Sachin Singh. I’m 14 years old and currently live in Bangalore, India, studying at Inventure Academy.

Ever since I was a young boy, the topic of Biology has always fascinated me. My curiosity and inquisitiveness instantly attracted me to the subject. I’ve always loved studying the science of life and living organisms unlike any other subject.

However, my love for Biology extends far beyond the classroom: I remember countless times throughout my childhood where I stood in my backyard, examining and maintaining my father’s garden, or inside a hospital, watching the doctors with wide-eyed awe!

My biggest inspiration in the field of Biology has to be my maternal grandfather. An experienced doctor, he was the one who first initiated the spark in me which developed into my love for Biology and Life Sciences. His endless perseverance and hard work as a practitioner has motivated me to emulate him as a man of Science.

I also enjoy Technology and Computer Science, which led me to combine my love for Biology and Life Sciences with Technology to come up with this project. Apart from that, I am a voracious reader and writer, and I wrote my first novel at 11 years of age.

To win the Google Science Fair would be an absolute dream come true! Winning such a prestigious award would open up new and unexplored pathways in the field of Science which were simply inaccessible to me before. I’d like to strive for the further development of my project, as well as a full-scale deployment somewhere down the line to help resolve the numerous issues faced by the society around me.

Health & Safety

My project had both software and hardware (physical construction) components. All my design, construction and coding was done in my home.

For the software component, I followed the usual best practices of operating electronics (my laptop and smartphone).

There were more precautions to be observed during construction of the digital stethoscope. I had to procure many components such as analog stethoscope, microphone, foam insulation and vinyl pipe. The construction involved dismantling the analog stethoscope, tinkering with the microphone parts and cutting the foam insulation and vinyl pipe. Both the insulation and pipe were originally long as shown:


I had to make several attempts at getting the right size of both for the stethoscope (6-8 cm), for which I used knife and scissors. Throughout the attempts, my father oversaw all my actions and guided me with advice as needed. My father's name is Puneet Singh and he can be reached at (91)9945368476 for any queries on safety measures we took.

Bibliography, references, and acknowledgements


I’d like to thank my parents for their never-ending support of this project, no matter how tough the obstacles we faced were. I wouldn’t have been able to complete this project without their constant encouragement and advice, as well as their constant co-operation and brainstorming whenever possible. A special thanks to my mother for helping me proof-read the report and getting me in touch with some experts in the field for their inputs to help develop this project to the fullest extent. I am also grateful to my elder brother for his guidance on the Computer Science aspects.

I’d also like to thank my grandfather. If I ever found myself stuck on a problem, I would always call him to get opinions drawn from his vast experience. I consider myself lucky to call such a talented doctor and a kind and caring person my grandfather.


[1] E. Braunwald, Braunwald’s heart disease: a textbook of cardiovascular medicine. Philadelphia, PA: Elsevier/Saunders, 2015

[2] Managing Heart Diseases: Is Investing In Cardiac Care Plan the Way to Go. [Online]. Available:

[3] World Health Organization Cardiovascular diseases (CVDs).[Online].Available:

[4] D. Prabhakaran, P. Jeemon and A. Roy, “Cardiovascular Diseases in India: Current Epidemiology and Future Directions,” in Circulation, 2016, pp 1605-20.

[5] Heart Diseases in India: What Statistics Show.[Online]. Available:

[6] On the quack track: Over 50% 'doctors' in country practicing without formal degree.[Online]. Available:

[7] A. Panagariya, “The Challenges and innovative solutions to rural health dilemma,” in Annals of Neurosciences, 2014, pp 125–127

[8] Rural Healthcare: Towards a Healthy Rural India.[Online]. Available:

[9] B. Potnuru, “Aggregate availability of doctors in India: 2014-2030,” in Indian Journal of Public Health, 2017, pp 182-187

[10] J. Das, A. Chowdhury, R. Hussam and A. V. Banerjee, “The impact of training informal health care providers in India: A randomized controlled trial,” in Science, 2016

[11] P. Pulla, “Are India’s quacks the answer to its shortage of doctors?”, in BMJ, 2016

[12] M. N. Krishnan, “Coronary heart disease and risk factors in India – On the brink of an epidemic?,” in Indian Heart Journal, 2012, pp 364–367

[13] Heart Disease Facts.[Online].Available:

[14] E. M. Brown, T.S. Leung and A. P. Salmon, Heart Sounds Made Easy. Oxford, UK: Churchill Livingstone, 2002

[15] S.G. Mishra, A.K. Takke, S.T. Auti, S.V. Auryavanshi and M.J. Oza, “Role of Artificial Intelligence in Health Care,”, in Biochem Ind J, 2017

[16] Report on Impact of Cloud Computing on Healthcare Version 2.0 by Cloud Standards Customer Council.[Online].Available:

[17] R. B. D'Agostino, R. S. Vasan, M. J. Pencina, P. A. Wolf, M. Cobain, J. M. Massaro and  W. B. Kannel, “General cardiovascular risk profile for use in primary care: the Framingham Heart Study", in Circulation AHA,  2008, 117 (6) 743-53.

[18] A. Raghu, D. Praveen, D. P. Peiris, L. Tarassenko and G. Clifford, “Engineering a mobile health tool for resource-poor settings to assess and manage cardiovascular disease risk: SMARThealth study,” in BMC Medical Informatics and Decision Making, 2015

[19] H. Uguz, “A Biomedical System Based on Artificial Neural Network and Principal Component Analysis for Diagnosis of the Heart Valve Diseases,” in Journal of Medical Systems, 2010, pp 61-72

[20] S. Ari, K. Hembram and G. Saha, “Detection of cardiac abnormality from PCG signal using LMS based least square SVM classifier,” in  Expert Systems with Applications, 2010, pp 8019-8026

[21] R. Saracoglu, “Hidden Markov model-based classification of heart valve disease with PCA for dimension reduction,” in Engineering Applications of Artificial Intelligence, 2012, pp 1523-1528

[22] A. F. Quiceno-Manrique, J. I. Godino-Llorente, M. Blanco-Velasco and G. Castellanos-Dominguez, “Selection of Dynamic Features Based on Time–Frequency Representations for Heart Murmur Detection from Phonocardiographic Signals,” in Annals of Biomedical Engineering, pp 118–137

[23] G. D. Clifford, C. Liu, B. Moody, D. Springer, I. Silva, Q. Li and R. G. Mark, “Classification of normal/abnormal heart sound recordings: The PhysioNet/Computing in Cardiology Challenge 2016,” 2016 Computing in Cardiology Conference (CinC), 2016, pp 609-612

[24] J. Rubin, R. Abreu, A. Ganguli, S. Nelaturi, I. Matei and K. Sricharan, “Recognizing Abnormal Heart Sounds Using Deep Learning,” in KHD@IJCAI, 2017

[25] S. Latif, M. Usman, R. Rana and J. Qadir, “Phonocardiographic Sensing using Deep Learning for Abnormal Heartbeat Detection,” in CoRR, 2018

[26] William Callaghan, “A Human-Machine Framework for the Classification of Phonocardiograms”. in UWSpace, 2018

[27] M. Nabih-Ali, E. El-Dahshan and A. Yahia “Heart Diseases Diagnosis Using Intelligent Algorithm Based on PCG Signal Analysis,” in Circuits and Systems 8, 2017, pp 184-190

[28] The promise of AI in audio processing.[Online].Available:

[29] L. L. Wyse, “Audio Spectrogram Representations for Processing with Convolutional Neural Networks”, in CoRR abs/1706.09559, 2017

[30] Classic Spectrograms.[Online].Available:

[31] Framingham Heart Study.[Online].Available:

[32] R. Judge and R. Mangrulkar, “Heart Sound and Murmur Library”, 2015, Retrieved from Open.Michigan Educational Resources Web site:

[33] P. Bentley,  G. Nordehn, M. Coimbra  and S. Mannor, “The PASCAL classifying heart sounds challenge 2011”,