TBM Tumor Classifier: A mobile app for skin cancer detection at early stages

Summary

The Spanish healthcare system, which relies on telecommunications, is inefficient and extremely slow. It can take weeks or even a month to get an appointment with a doctor and take a test, after which you have to visit your specialist to learn the results. If the results are unclear, you have to take the test again, which takes even more time.

We propose a novel system based on automation, built around TBM Tumor Classifier. Using our mobile application, people will be able to obtain a quick preliminary diagnosis of the nature of their cancer, and the results will be sent directly to their specialist, either a dermatologist or an oncologist. Patients will be sorted according to the level of malignancy of the cancer, so that those with malignant forms (which are more aggressive and more likely to metastasize) are treated first.

TBM Tumor Classifier plays a key role in this system, as it uses a pre-trained neural network model to analyze the tissue in vivo and then sends the results to the mailbox of the selected hospital once the user has filled in some personal details. We expect the proposed model to be more automated, accurate, precise and effective than the current one, and it could also be applied to the US healthcare system.

 

 

Question / Proposal

Question 1: Are there enough features to distinguish between benign and malignant forms of cancer?

Proposal 1:

To develop a mobile app that implements a pre-trained machine learning model able to classify skin cancer tissue, accurately and with high sensitivity, according to its nature, either benign or malignant. If there are enough features we will succeed; otherwise, we will have to look at different features.

-----------------------------------------------------------------------------

Question 2: Can we develop an infrastructure that helps diagnose cancer at early stages and can therefore improve the healthcare system?

Proposal 2:

To develop a mobile app that, once it has classified the lesion using AI, can automatically send the results to the expert physician in one of two ways: 1) The user selects his/her hospital from a list of hospitals that operate with TBM Tumor Classifier. 2) A hospital is assigned to the user based on geolocation, by choosing the nearest hospital.
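The second option, assigning the nearest hospital, can be sketched as follows. This is only an illustration in Python, not the code of the app: the hospital names and coordinates are placeholders, and the great-circle (haversine) distance is our assumption about how "nearest" could be measured.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def nearest_hospital(user_lat, user_lon, hospitals):
    """Return the (name, lat, lon) entry closest to the user."""
    return min(hospitals,
               key=lambda h: haversine_km(user_lat, user_lon, h[1], h[2]))

# Hypothetical partner hospitals: names and coordinates are placeholders.
HOSPITALS = [
    ("Hospital A", 41.40, 2.15),
    ("Hospital B", 40.42, -3.70),
    ("Hospital C", 39.47, -0.38),
]

# A user located at (41.39, 2.17) would be assigned the closest entry.
assigned = nearest_hospital(41.39, 2.17, HOSPITALS)
```

In a real deployment the list would come from the hospitals that operate with TBM Tumor Classifier, and the user could still override the automatic choice with option 1.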

This makes the flow of information bidirectional, so that the user can stay in direct contact with the doctor and receive feedback or make an appointment. This method would therefore be quicker and more accurate than the current healthcare system, which is based on telecommunications.

 

 

Research

Patterns to distinguish skin cancer forms: Color and morphology

In order to distinguish skin cancer forms, we should look at properties that both the human eye and a camera can analyze. In our case, color and morphology are the most suitable features to extract. It is widely reported in the research literature that benign forms have a regular shape with a well-defined capsule surrounding the tumor, whereas malignant forms have an irregular shape due to cell dissemination around the tissue.1-4 Moreover, malignant forms lack the capsule.5,13

Some research articles suggest patterns related to color. Benign forms are supposed to show a stronger color gradient, as the cells stay localized and have more access to blood vessels and nutrients.1,6,7 Malignant skin cancer should show paler colors, as the cells divide quickly and disseminate around the tissue, which reduces the time available to feed them.8,9,11,12 This pattern is not as well established or as widely reported as the morphological features that characterize benign and malignant forms.10

As these are two of the main features we can extract from a simple image, we decided to build two tools: one classifier that relies only on color patterns and another that analyzes only morphological features.

Neural networks for image classification

Image classification with neural networks has been studied since the early days of artificial intelligence. The difficult task here was to make the machine distinguish color and morphological patterns. After some help and many tutorials, I managed to do it by using K-means to group colors into clusters and specific neural networks to analyze shape properties.14

Mobile Application

To design the mobile application, I had to take some courses to learn how to integrate pre-trained TensorFlow and Keras models into iOS and how to develop a functional, easy-to-use interface. Many tutorials and online courses helped me gain the ability to do so. We are currently working on the same interface for Android.

 

 

Method / Testing and Redesign

Step 1 - Research

This step, already explained in the Research section, involved gaining more specific knowledge about neural networks and about the patterns I could use to distinguish skin cancer forms.

Step 2 – Image extraction

I extracted a set of 14,320 images of skin cancer lesions from the ISIC Dermatology Archive, each already classified as benign or malignant by dermatologists.

Step 3 - Design

         · Part 1 - Color Classifier

We used MATLAB and wrote code to convert all images into a brown gradient scale vector. From those tones we extracted all color-related features using K-means, and then built a histogram to check that there were significant differences between the two tumor types.
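The idea of clustering tones and then comparing histograms can be illustrated with a small sketch. It is written in Python rather than MATLAB and is not our original code: the pixel values are made up, the clustering is a plain one-dimensional K-means, and the evenly spread starting centres are an assumption chosen to keep the example deterministic.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means; returns the sorted cluster centres."""
    lo, hi = min(values), max(values)
    # Deterministic start: centres spread evenly over the value range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[idx].append(v)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return sorted(centres)

def tone_histogram(values, centres):
    """Fraction of pixels assigned to each cluster centre."""
    counts = [0] * len(centres)
    for v in values:
        nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
        counts[nearest] += 1
    return [c / len(values) for c in counts]

# Toy "image": tone values of six pixels (made-up numbers, 0 = dark, 255 = light).
pixels = [10, 12, 14, 200, 205, 210]
centres = kmeans_1d(pixels, k=2)
histogram = tone_histogram(pixels, centres)
```

Comparing such histograms between benign and malignant images is what tells us whether the color features separate the two groups at all.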

Then all features were vectorized using a technique called Bag of Words, which encodes the features into vectors and distributes them into clusters based on similarity. To each cluster we assigned a single word from a predefined vocabulary of 200 words. Using descriptors makes classification easier and helps reach higher accuracy peaks.
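A minimal sketch of this encoding step is shown below, in Python and under simplifying assumptions: the vocabulary has 3 words instead of 200, the descriptors are made-up 2-D points, and the function names are ours rather than those of the MATLAB tool we used.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary centre closest to the descriptor."""
    def dist2(centre):
        # Squared Euclidean distance is enough to find the closest centre.
        return sum((a - b) ** 2 for a, b in zip(descriptor, centre))
    return min(range(len(vocabulary)), key=lambda i: dist2(vocabulary[i]))

def encode_image(descriptors, vocabulary):
    """Bag-of-words vector: normalized count of descriptors per visual word."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = len(descriptors)
    return [c / total for c in counts] if total else counts

# Hypothetical vocabulary (cluster centres) and descriptors of one image.
vocabulary = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
descriptors = [(1.0, 1.0), (9.0, 11.0), (11.0, 9.0), (19.0, 1.0)]
image_vector = encode_image(descriptors, vocabulary)
```

Every image ends up as a vector of the same length (one entry per word), which is what allows a standard classifier to compare images with different numbers of features.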

We opened the Classification Learner app and selected the data from the workspace. We then applied a 10% holdout. Once this was done, the option that tests all classifiers ran each of the available classifiers and reported a percentage for each, with the highest one (the highest classification success rate) highlighted.
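For readers unfamiliar with holdout validation, the following Python sketch shows what a 10% holdout means. It is not what the Classification Learner app runs internally; it assumes the split is applied to the full set of 14,320 images, and the fixed seed is only there to make the example reproducible.

```python
import random

def holdout_split(items, holdout=0.10, seed=0):
    """Shuffle the items and set aside a fraction of them for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * holdout)
    # The first n_test items are held out; the rest are used for training.
    return shuffled[n_test:], shuffled[:n_test]

# Image indices stand in for the images themselves.
train_ids, test_ids = holdout_split(range(14320))
```

With these numbers, 1,432 images are held out and 12,888 remain for training. The held-out images are never shown to the classifier during training, so the reported percentage reflects images it has not seen.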

After that we designed an automated live function which allowed me to classify images in real time using my webcam.

         · Part 2 – Morphology Classifier

The morphology classifier was trained not in MATLAB but with neural networks, using bottlenecks specific to shape features. Neural networks loosely resemble our brain: they are made of nodes (like neurons) that process the input information and pass their results on to other nodes (each node being more specific than the previous one). We train those nodes beforehand to distinguish certain features, in our case the morphology that differentiates benign from malignant tumors.

The classifier was trained for 5,000 steps with a learning rate of 0.01 (a setting that gave high accuracy in our runs). The MobileNet model was used for this training, with a 10% holdout to validate our model.

After that we coded a live function, like the one developed for the color-based classifier, to classify images in real time using the webcam.

Step 4 - Model Validation

A set of 100 images that had not been used as training data was used for the validation process.

Step 5 - Mobile application

After validating both models and determining which provided higher sensitivity and accuracy, an iOS app was developed in the Swift programming language. The interface allows users to run a quick test and then enter their personal details as well as the hospital where they want the results to be sent. The results are then sent to the inbox of their dermatologist (the email address is predefined when the hospital is selected) with all the required information and the diagnosis, including an alarm message when the condition is malignant. See the Scheme for a visualization.
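The structure of the message sent to the dermatologist can be sketched as follows. The real app is written in Swift, so this Python version is only an illustration: the function name, the addresses, the "[ALERT]" prefix and the confidence field are all hypothetical, and the sketch builds the message without sending it.

```python
from email.message import EmailMessage

def build_report(patient_name, patient_email, hospital_email, label, confidence):
    """Build (but do not send) the result email for the physician."""
    msg = EmailMessage()
    msg["From"] = "noreply@example.org"   # placeholder sender address
    msg["To"] = hospital_email            # predefined by the selected hospital
    msg["Reply-To"] = patient_email       # lets the doctor answer the patient
    # Malignant results get an alarm prefix so they stand out in the inbox.
    prefix = "[ALERT] " if label == "malignant" else ""
    msg["Subject"] = f"{prefix}TBM Tumor Classifier result for {patient_name}"
    msg.set_content(
        f"Patient: {patient_name}\n"
        f"Classification: {label}\n"
        f"Model confidence: {confidence:.0%}\n"
    )
    return msg

# Example with made-up patient data and placeholder addresses.
report = build_report("Jane Doe", "jane@example.org",
                      "dermatology@example.org", "malignant", 0.91)
```

The lesion photograph would be attached to the same message so that the doctor can give feedback based on the image.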

 

Results

Color-based Classifier

We will not go into detail on the color-based classifier, as our best result was 68.4%, as shown in the image. Tests with this classifier were unsuccessful, and the reported accuracy, 68.4 out of 100, is too low for its classifications to be taken as correct. This was the highest value obtained even after trying measures such as increasing the 10% holdout (test data), trying all the classifiers or reducing the set of images from 100 to 80. The results did not vary much and the success rate remained low. During image extraction, many malignant lesions had colors similar to benign ones, and vice versa. We suggest that, apart from the variables reported in previous sections, photo quality and light exposure also affected the results of the classifier, but this is something we cannot check and it would not change the results. This model cannot be used as a classifier.

 

Shape-based tumor classifier

Surprisingly, using bag of words for shape feature extraction and later analysis gave us an accuracy of 82.4% with the Cubic SVM classifier, which tries to build an optimal hyperplane that separates the two sets of features with the largest possible distance between them. This result is better than the previous one, with an increase of 14 percentage points in accuracy. The plot now shows a steeper slope, and we know that slope and accuracy are directly related: the steeper the slope, the higher the accuracy. From this plot we can read the rate of true positives obtained for the benign tumors when classifying the 10% holdout data. We can conclude that this model gives us a hint that shape patterns differ between benign and malignant tumors (Figure 6). An accuracy of 82.4% is a result that we consider strong enough to rely on.

Validating the model

Our mobile app correctly classified 47 out of 50 images when performing real-time classification using our smartphone camera and a built-in model running on a TensorFlow add-on. This validation shows promising sensitivity, but further tests with real patients are needed to validate that the model works. We are currently running clinical trials in collaboration with Vall d’Hebron, in Barcelona, with a large group of 500 patients in order to determine the sensitivity of our classifier.
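The figures above can be turned into rates with a short calculation. The accuracy uses the 47 out of 50 reported here; the sensitivity example uses made-up counts, because this validation does not break the results down into true positives and false negatives.

```python
def accuracy(correct, total):
    """Share of all images that were classified correctly."""
    return correct / total

def sensitivity(true_positives, false_negatives):
    """Share of malignant cases that the classifier actually detects."""
    return true_positives / (true_positives + false_negatives)

# Reported validation result: 47 correct classifications out of 50 images.
app_accuracy = accuracy(47, 50)

# Hypothetical breakdown, for illustration only: 24 malignant lesions
# detected and 1 missed. These counts are NOT taken from our results.
example_sensitivity = sensitivity(24, 1)
```

Here the accuracy works out to 0.94, that is, 94%. Sensitivity is the figure the clinical study has to establish, since a missed malignant lesion (a false negative) is the most dangerous error for the patient.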

Testing out the infrastructure

When the user enters their name and email address into the app, it directly sends an email to the physician with the classification results and an image, so that the doctor can provide feedback based on the photograph. In this test, all the addresses available in the SQL server were redirected to my own address, so all emails were sent there. This was set up only to show that the infrastructure works properly and that the system could be implemented in local areas.

 

Conclusion

We have developed TBM Tumor Classifier, a mobile app that integrates a full infrastructure built from two different parts: 1) A machine learning model that analyzes images of skin cancer conditions and classifies them according to their nature, either benign or malignant. 2) A fully automated infrastructure that connects users with organizations (insurers, healthcare systems, …) and sends the results directly to the expert physician, chosen by user input or by geolocation.

We have shown that a color-based classifier is not a suitable candidate for diagnosing cancer, as there is a lot of bias due to factors that we cannot control, including light exposure, cancer features and camera quality, among others. However, classification using morphological patterns does reach an accuracy of 82.4%, meaning that we can classify skin cancer with reasonable accuracy according to the shape of the tumors.

Therefore, TBM Tumor Classifier is the beginning of an automated world in which people use mobile apps and devices to detect their diseases at early stages. We would like to highlight the importance of early-stage detection, because at that point diseases are not yet aggressive and current FDA-approved treatments are more effective than when the disease is diagnosed at a late stage.

Here we report some future improvements of TBM Tumor Classifier:

· Make a multiplatform app that serves both Android and iOS users.

· Increase the accuracy of the classifier by looking at other patterns, including the density and refraction of the tumors.

· Use polarized light to enhance image pre-processing methods and see if accuracy rates increase.

· Integrate the model in the cloud so that we can provide users with constant updates.

Overall, although TBM Tumor Classifier needs to improve its sensitivity to avoid false negatives, the infrastructure we provide could one day replace the current healthcare system and give people better diagnostic tools that help society diagnose diseases at early stages, thereby increasing patient survival rates by providing treatment at the right time.

                                                    We would like to highlight our motto:

                                                                 “Quick, accurate.

                                                                  Cancer does not wait.

                                                                  Neither do you.”

                                                                  Believe in TBM Tumor Classifier

 

 

 

 

About me

Hello, my name is Eric Matamoros. I live in Barcelona and I am currently studying for a Biochemistry degree at the University of Barcelona. I have always known I wanted to become a scientist, a dream that came true when, in 10th grade, I entered the Youth and Science Program. It allowed me to immerse myself deeply in science during three summer internships and to discover all kinds of scientific fields.

Hello, my name is Azucena Muñoz. I live in Manzanares and I am currently in my last year of high school. I have always been passionate about science and I have won multiple prizes, including the Maths Olympics, the Regional Odysseus Contest and, recently, the Chemistry Olympics. Science allows me to explore new fields and push my creativity without limits.

We met at the Spanish National Contest for Young Scientists. Both of us are passionate about cancer research because of our relatives; they are only some of the millions of superheroes who fight cancer every year. With the aim of curing them and others, and drawing on my knowledge of programming, we developed a first version of TBM Tumor Classifier. It won the prize in the European contest CCIC.

Winning the prize would mean that people believe and trust in what we are doing. The prizes could enable us to get more sophisticated equipment, therefore covering more diseases. It could also be a path to meet other people as interested in science as us and with whom we could cooperate.

Health & Safety

There are no safety issues in the development of this project. Everything was done using a couple of computers, so that the processing load was shared across the individual CPUs of each computer. The only aspect to take into account when running heavy machine learning training is overheating, which can crash the computer if no fan or ice block is used to cool it down.

There are no safety rules to be followed for the development of this project.

Bibliography, references, and acknowledgements

Data:

Data was obtained from the International Skin Imaging Collaboration (ISIC) archive. Images can be downloaded and accessed through the link https://www.isic-archive.com/#!/topWithHeader/wideContentTop/main

References:

1. American Cancer Society Webpage. Cancer Facts and Figures 2016 (http://www.cancer.org/research/cancerfactsstatistics/cancerfactsfigures2016/). Last time accessed: 12/01/2017

2. National Cancer Institute Webpage (https://www.cancer.gov/about-cancer/diagnosis-staging/diagnosis). Last time accessed: 4/01/2017

3. Baba, A. I., & Câtoi, C. (2007). Comparative Oncology [Book]. The Publishing House of the Romanian Academy. Last time accessed: 4/01/2017

4. Archer, M. C. (1987). Chemical carcinogenesis. In Tannock & Hill (Eds.), The Basic Science of Oncology (pp. 89–105). New York: Pergamon Press.

5. Bannasch, P. (1984). Sequential cellular changes during chemical carcinogenesis. Journal of Cancer Research and Clinical Oncology, 108(1), 11–22.

6. Raikhlin NT. Ultrastructural organ-specificity and polymorphism of the cancer cells. Neoplasma. 1973;20:567–578

7. Salomon JC. Les métastases des cancers. La Recherche. 1982;129:52–60.

8. Koyuncuer, A. (2014). Histopathological evaluation of non-melanoma skin cancer. World Journal of Surgical Oncology, 12, 159.

9. Ferlicot, S., Vincent-Salomon, A., Médioni, J., Genin, P., Rosty, C., Sigal-Zafrani, B., … Gusterson, A. (2004). Wide metastatic spreading in infiltrating lobular carcinoma of the breast. European Journal of Cancer, 40(3), 336–341.

10. Tan, K.-B., Tan, S.-H., Aw, D. C.-W., Jaffar, H., Lim, T.-C., Lee, S.-J., & Lee, Y.-S. (2013). Simulators of squamous cell carcinoma of the skin: diagnostic challenges on small biopsies and clinicopathological correlation. Journal of Skin Cancer, 2013, 752864.

11. Sreeram, S., Lobo, F. D., Naik, R., Khadilkar, U. N., Kini, H., & Kini, U. A. (2016). Morphological Spectrum of Basal Cell Carcinoma in Southern Karnataka. Journal of Clinical and Diagnostic Research : JCDR, 10(6), EC04-7.

12. Leiter, U., Eigentler, T., & Garbe, C. (2014). Epidemiology of skin cancer. Advances in Experimental Medicine and Biology, 810, 120–40.

13. Khandpur, S., & Ramam, M. (2012). Skin tumours. Journal of Cutaneous and Aesthetic Surgery, 5(3), 159–62.

14. MATLAB Official Webpage (https://es.mathworks.com/). Last time accessed: 4/01/2017