Using Machine Learning to Analyze Retinal Blood Vessel Growth in Retinal Flat Mount Images


As someone with poor vision, I always wanted to conduct research in the eye. When I first started researching in ophthalmology at the Hartnett Lab, I found that the lab was analyzing retinal images manually. They needed a way to quantify the effects of the disease retinopathy of prematurity. They were tracing out the blood vessels to measure the vessel density, which was very time-consuming. I decided to create a fully automated, machine learning based method to segment the retinal images. I tested this on several images of retinas, both normal and diseased images.

This method is the first successful automated approach that uses machine learning to analyze the blood vessels in retinal flat mount images. It replaces the time-consuming process of manual segmentation. Such a tool will increase efficiency and standardize measurements between individuals and across laboratories. No existing method is as accurate and efficient as the one developed here.

Those who study ophthalmology will be able to use this research to determine more effectively the extent to which retinopathy of prematurity affects the retina. This will help them develop new treatments for the disease and could allow pharmaceutical researchers to develop new drugs for retinopathy of prematurity faster. I am looking to expand this tool to automatically measure other retinal features, such as branching angle, and to apply it to retinas that have been affected by other diseases.

Question / Proposal

The retina, the back wall of the eye that takes in the light we see, has many blood vessels in it. These blood vessels can help us identify diseases based on their specific characteristics of growth. This is because certain diseases affect the development and growth of the blood vessels in the eye.

However, current methods of analyzing these blood vessels are manual. Manual markings are often inconsistent and unreliable when quantifying parameters relevant to vessel development. In addition, the manual methods require extensive tracing in the images, which is laborious and makes it virtually impossible to analyze large sets of data. This led me to wonder if there was a more efficient way to analyze these images. I wanted to create a fully automated method to analyze and quantify the retinal flat mounts so that diseases could be identified faster and more treatments could be created.

After researching online, I discovered many papers that analyzed medical images using machine learning, and I became drawn to its power and efficiency. I spent some time reading about and learning these techniques and decided to see if I could apply machine learning concepts to analyze the images. The algorithm had to be able to measure all 4 parameters of interest: avascular area, total retinal area, avascular area percentage, and retinal vascular density. Given the success of these past papers, I expected that the algorithm I designed would perform nearly as accurately as the manual methods while running faster.


Retinopathy of prematurity is a potentially blinding disorder of retinal vascular development that occurs in human infants who are born prematurely and do not have fully developed retinal blood vessels. This is often due to a lack of oxygen or nutrients supplied to the eye. The disorder is one of the most common causes of childhood blindness and can lead to long-term visual loss and impairment. Currently, 14,000 to 16,000 of the 3.9 million infants born in the United States each year are affected by retinopathy of prematurity, and 400 to 600 infants become blind each year from the disease.

Retinopathy of prematurity occurs in the retina of the eye. It arises when abnormal blood vessel growth, such as vascular proliferation, causes the vessels to leak and swell. This can scar the retina and cause retinal detachment, in which the retina is pushed away from the blood vessels that supply its nutrients and oxygen.

Several circumstances will influence the growth of these abnormal blood vessels, especially during the last 12 weeks of pregnancy. During this interval, the eye begins to develop into its final structure. However, if an infant is born prematurely, it does not receive the full amount of oxygen and nutrients needed to fully develop the eye. This causes the blood vessels that supply the oxygen to the eye to start growing abnormally, often causing the vessels to be more spaced out and also not reach the edges of the retina. The gaps that form between the blood vessels are known as lacunae, which are a clear indicator of the disease, while the area between the blood vessels and the edge of the retina is known as the avascular area. The disease can also cause the blood vessels to grow upward, pushing into the retina, causing it to potentially detach. This detachment also pushes the retina away from the optic nerve, making it difficult for the brain to receive the information about the light traveling into the eye, which causes visual impairment or even blindness.

Typically, the disease is modeled with retinal flat mounts from rats. These are prepared by cutting the retina open and laying it on a flat surface. Animal models allow for the analysis of the role of cell-environment interactions and signaling events that determine retinal vascular development in retinal research. Analyzing preterm human infants directly would be unsafe and logistically impractical. Several animal models exist, and all involve exposing newborn animals to supplemental oxygen under various protocols to induce a condition known as oxygen-induced retinopathy. This condition has been used to model retinopathy of prematurity in previous literature.

Current methods to analyze these retinal flat mounts are manual. Manual markings are often inconsistent and unreliable when analyzing retinal vascular density, avascular area, and total retinal area. In addition, the manual methods require extensive tracing in the images, which is laborious and makes it virtually impossible to analyze large sets of data. An automated method would remedy these issues.

Method / Testing and Redesign

A computer algorithm was designed to automatically analyze the retinal flat mounts. A total of twenty retinal flat mount images were acquired from the John A. Moran Eye Center for analysis (10 normal and 10 diseased). Each image was approximately 1 to 2 GB in size and 7000 pixels by 7000 pixels. An example is shown below.

Each image varied in brightness, which initially made the algorithm's analysis inaccurate. This was remedied by normalizing the brightness range of each image so that all the images shared the same range. In addition, each image was processed with the CLAHE (Contrast Limited Adaptive Histogram Equalization) algorithm to increase local contrast and make the blood vessels easier to detect. Afterwards, a common blurring algorithm known as the Gaussian blur was applied to each image. Blurring allowed a connected components algorithm (which finds all regions with similar pixel intensity above a brightness threshold) to be applied later to remove all the unnecessary parts of the image.
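The brightness normalization and connected components steps described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used in the project: the function names normalize_brightness and bright_components are hypothetical, the normalization is assumed to be a simple linear rescaling to a common range, the components are assumed to be 4-connected, and the CLAHE and Gaussian blur steps are left out because they would normally come from an image processing library.

```python
from collections import deque


def normalize_brightness(image, new_min=0, new_max=255):
    """Linearly rescale pixel values so every image spans the same range.

    `image` is a list of rows of integer intensities (an assumed layout).
    """
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: avoid dividing by zero
        return [[new_min for _ in row] for row in image]
    span = new_max - new_min
    return [[round(new_min + (p - lo) * span / (hi - lo)) for p in row]
            for row in image]


def bright_components(image, threshold):
    """Return the 4-connected regions of pixels brighter than `threshold`.

    Each region is a list of (row, col) coordinates.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from this unvisited bright pixel.
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions
```

Once the regions are listed, unwanted parts of the image could be discarded by size or position, for example by keeping only the largest region.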

Each of these images was then segmented through Trainable Weka Segmentation. A classifier and data set first had to be defined for Trainable Weka Segmentation to work. The classifier was defined to have three classes: vessels, nonvascularized regions, and background pixels. Each class represented its respective element of the image. The 8 training features used in the classifier were Gaussian blur, Sobel filter, Hessian, Difference of Gaussians, Variance, Derivatives, Structure, and Neighbors. Multiple example traces of vessels, nonvascularized regions, and background were added to each respective class to define the data set, as shown below.

The images took too much memory to process on the laptop because of their large size, so each was divided into 64 tiles (8 horizontally and 8 vertically). Each of these 64 tiles was fed into the algorithm individually to keep the length and width of each processed image under 1024 pixels, the maximum picture size suggested for Trainable Weka Segmentation. This solved the problem of the program being unable to run at a reasonable speed on a computer with limited memory.
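As a rough check of the 8 by 8 division, the sketch below computes the pixel bounds of each piece for a 7000 by 7000 image. The function name tile_bounds and the (top, bottom, left, right) box layout are assumptions made for illustration and are not taken from the project's code.

```python
def tile_bounds(height, width, n_rows=8, n_cols=8):
    """Split an image into n_rows x n_cols tiles.

    Returns a list of (top, bottom, left, right) pixel boxes, with
    bottom and right exclusive, that together cover the whole image.
    """
    row_edges = [height * i // n_rows for i in range(n_rows + 1)]
    col_edges = [width * j // n_cols for j in range(n_cols + 1)]
    return [
        (row_edges[i], row_edges[i + 1], col_edges[j], col_edges[j + 1])
        for i in range(n_rows)
        for j in range(n_cols)
    ]


tiles = tile_bounds(7000, 7000)
# Longest side of any tile, to compare against the 1024 pixel limit.
largest_side = max(max(bottom - top, right - left)
                   for top, bottom, left, right in tiles)
```

For a 7000 by 7000 image this gives 64 tiles of 875 by 875 pixels each, which is under the 1024 pixel limit.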

The algorithm was then able to use the learned data to segment the test image into its vessel, nonvascular, and background areas. This allowed 4 parameters to be measured: total retinal area, avascular area, avascular area percentage, and retinal vascular density. Total retinal area was the combined area of the vessel and nonvascular regions. Avascular area was the area of the flat mount that the blood vessels did not reach, shown below. Avascular area percentage was calculated as the total avascular area in pixels divided by the total retinal area. Retinal vascular density was the ratio of the total area covered by blood vessels in a flat mount to the total area of the retina.
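The four parameters can be written as simple arithmetic on pixel counts. The sketch below is a minimal illustration, assuming that the pixel counts for each region are already available from the segmentation; the function name retinal_parameters and its argument names are hypothetical.

```python
def retinal_parameters(vessel_pixels, nonvascular_pixels, avascular_pixels):
    """Compute the four parameters from pixel counts.

    Assumptions: `nonvascular_pixels` counts every non-vessel retinal pixel,
    and `avascular_pixels` counts the subset that the vessels did not reach.
    """
    total_retinal_area = vessel_pixels + nonvascular_pixels
    if total_retinal_area == 0:
        raise ValueError("no retinal pixels were found")
    return {
        "total_retinal_area": total_retinal_area,
        "avascular_area": avascular_pixels,
        # Expressed as a percentage of the total retinal area.
        "avascular_area_percentage": 100 * avascular_pixels / total_retinal_area,
        # Fraction of the retina covered by vessels.
        "retinal_vascular_density": vessel_pixels / total_retinal_area,
    }
```

With made-up counts of 300 vessel pixels, 700 nonvascular pixels, and 250 avascular pixels, the total retinal area is 1000 pixels, the avascular area percentage is 25%, and the retinal vascular density is 0.3.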

Results were compared against manual markings that were performed 3 times by different people on the same retinal flat mount images. They provided a verification of the accuracy of the program's measurements.


Results of the algorithm were obtained for four parameters: total retinal area, avascular area, avascular area percentage, and retinal vascular density.

To determine whether the manual measurements were reliable, the inter-rater intraclass correlation was calculated between three readers, and the intra-rater correlation (repeat measurements by the same grader) was calculated, for total retinal area, total avascular area, and percent avascular area. This established a gold standard against which to compare the algorithm. The results of these calculations are shown in the table below.

The table below compares the algorithm with the gold standard, which was the average of the first three separate manual measurements, across the same three image parameters. Intraclass correlation coefficient values with 95% confidence intervals, Pearson correlations, and linear regression formulas were calculated for this comparison.
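For reference, the Pearson correlation and the linear regression line between the gold standard and the algorithm can be computed as in the sketch below. It is a minimal illustration with a hypothetical function name, pearson_and_fit, and it does not cover the intraclass correlation coefficient or its confidence intervals, which need a dedicated statistics package.

```python
from math import sqrt


def pearson_and_fit(xs, ys):
    """Return (r, slope, intercept) for paired measurements.

    `xs` could hold the gold standard values and `ys` the algorithm values;
    the line fitted by least squares is ys = slope * xs + intercept.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept
```

A slope near 1 and an intercept near 0 would indicate that the algorithm tracks the gold standard without a systematic offset.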

It is possible that area measurements may be highly correlated but still inaccurate if an algorithm consistently misclassifies a proportionate amount of each image. To evaluate this possibility, the precision and recall were calculated for the algorithm's classifications of each pixel in the gold standard dataset. The results of this evaluation, along with the overall Dice similarity coefficients, are shown in the graph below.
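Precision, recall, and the Dice similarity coefficient for one class can be computed pixel by pixel as in the sketch below. This is an illustration under assumptions rather than the project's evaluation code: the function name precision_recall_dice is hypothetical, and the two segmentations are assumed to be flattened into equal-length sequences of class labels.

```python
def precision_recall_dice(predicted, truth, positive):
    """Score one class of a pixel-wise segmentation against a reference.

    `predicted` and `truth` are equal-length sequences of class labels;
    `positive` is the label being scored (for example, the vessel class).
    """
    if len(predicted) != len(truth):
        raise ValueError("segmentations must have the same number of pixels")
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == positive and t == positive)
    fp = sum(1 for p, t in pairs if p == positive and t != positive)
    fn = sum(1 for p, t in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Dice = 2|A ∩ B| / (|A| + |B|); defined as 1.0 when both masks are empty.
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return precision, recall, dice
```

Unlike a correlation between area totals, these scores drop whenever the algorithm labels the wrong pixels, even if the total area happens to match.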

As a final evaluation step, a post hoc direct visual comparison between algorithm output and manual tracings was performed. Representative images of the output of the algorithm are shown in the two figures below for peripheral avascular area and total retinal area, respectively.

To help analyze trends in the algorithm's output, a Bland-Altman plot was constructed, as shown in the figure below. It depicts the differences between expert measurements and algorithm output across the range of peripheral avascular area present in the gold standard dataset. The positive slope of the line of best fit indicates that the algorithm underestimates avascular area in retinal images with small avascular areas and overestimates it in retinal images with large avascular areas.
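The quantities behind a Bland-Altman plot can be sketched as follows. The function name bland_altman is hypothetical, and two conventions are assumed here rather than taken from the project: each difference is the algorithm value minus the expert value, and the limits of agreement are the mean difference plus or minus 1.96 standard deviations.

```python
from statistics import mean, stdev


def bland_altman(expert, algorithm):
    """Summarize agreement between paired measurements.

    Returns the mean difference (bias), the lower and upper limits of
    agreement, and the slope of the best-fit line of difference against
    the mean of each pair.
    """
    if len(expert) != len(algorithm) or len(expert) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - e for e, a in zip(expert, algorithm)]
    means = [(a + e) / 2 for e, a in zip(expert, algorithm)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    # Least-squares slope of the differences against the pair means.
    mean_of_means = mean(means)
    sxx = sum((m - mean_of_means) ** 2 for m in means)
    sxy = sum((m - mean_of_means) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx
    return bias, bias - spread, bias + spread, slope
```

Under this sign convention, a positive slope means the differences move from negative toward positive as the measured area grows, which matches underestimating small areas and overestimating large ones.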

The calculated results for both the manual and automatic measurements of the retinal vascular density were obtained after taking the mean of the results across the 10 diseased retinal flat mount and 10 normal retinal flat mount images.


These results showed that the algorithm performed consistently with both normal and diseased images when measuring retinal vascular density. Two two-sample t-tests resulted in p-values above 0.10, suggesting that the algorithm's measurements were not statistically different from the manual measurements. The standard deviations of the algorithm (3.3% and 3.2%) were no higher than those of the manual measurements (4.2% and 3.2%), which also suggests that the algorithm was at least as consistent as the manual methods of analysis for retinal vascular density.
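The test statistic for a two-sample t-test can be computed as in the sketch below. The source does not say which variant of the test was used, so this assumes Welch's unequal-variance form; the function name welch_t is hypothetical. Turning the statistic into a p-value requires the t distribution, which a statistics library such as SciPy provides.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's two-sample t-test.

    Assumption: unequal variances are allowed (Welch's form); the source
    does not state which two-sample t-test was run.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard error contributed by each sample.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, dof
```

Here one sample would hold the manual vascular density measurements and the other the algorithm's measurements for the same group of images.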


From the results, it was clear that the method developed is robust and accurate enough to measure the extent to which retinopathy of prematurity affects the retina. The algorithm achieves an excellent average intraclass correlation of 0.94 when compared with human measurements from a gold standard dataset. The segmentations themselves visually appear more intricate than manual segmentations, and the only necessary input for the algorithm is a lectin-stained retinal flat mount image.

To evaluate the algorithm more accurately, more images will need to be analyzed. Images that have been marked for the severity of retinopathy of prematurity could be used to train the algorithm to evaluate the extent of the disease in the future.

One source of error came from the manual methods used to evaluate the parameters that were measured. Every grader's marking of the retinas is different, and comparing the manual methods as a gold standard to the algorithm contributed to some of the error. However, the manual markings were the best approach to compare against the algorithm's results. Sources of error also came from imperfect removal of noise in the image or incorrect removal of the vessels, as shown in the respective figures below.

One limitation of the algorithm is that it misses very small avascular areas in some leaflets that fall below the size filter used to exclude the lacunae between vessels. Often these smaller areas appeared due to tears in the leaflets, debris that lay in the avascular area, or blood vessels growing very close to the edge of the retina.

However, it is important to note that the algorithm's results were not statistically different from those of the manual markings, suggesting that both methods were of reasonable accuracy and can be used to evaluate the extent to which retinopathy of prematurity affects the retinal blood vessels. The ICC of the avascular area was also very high, suggesting that the algorithm developed is a robust method to evaluate the parameter.

The outputs of the method can be used in the future to help identify retinopathy of prematurity. More vascular parameters will also be analyzed, such as the branching angle of the retinal vessels. Future steps in the development of retinal image analysis could include directly segmenting and fully removing the ciliary body when it is present, since this structure is inconsistently present in retinal flat mount images and was frequently a source of error in the algorithm's measurements.

An effective machine-learning-based and fully automated method for segmentation of retina images from retinal image models was presented in this research. The method achieves nearly the same accuracy as manual methods of analysis and can replace the time-consuming process of manually segmenting retinal images. The results suggest that the algorithm is more consistent in its measurements than the manual methods. The use of such a tool will increase the efficiency and standardize measurements between individuals and across laboratories.

About me

I grew up with bad eyesight, so I always wondered how I could improve my eyesight. I began researching eyes and came across diseases and abnormalities like diabetic retinopathy and corneal arcus. I was surprised to see how many people these diseases affected, like how age-related macular degeneration currently affects almost 11 million people in the US, a number that is expected to double by 2050. I wanted to solve these problems, so I decided to start researching in this area and use my interest in coding to help create solutions through the algorithms I develop.

I intend to go to a college that has a leading biomedical program and a strong computer science department. I want to be well versed in both the medical and computer science fields and aim to be a contributor and leader in the technological revolution that is occurring. I admire Manu Prakash, who was able to create an optical microscope for less than $1 to educate students all around the world. I hope to apply the same type of innovative thinking to reduce the cost of healthcare in the future.

Winning a prize in the Google Science Fair would give me a chance to go to college and take away the stress of finances. I would be able to focus on pursuing my interests without worrying about how I can pay for the next semester. Winning would also validate the scientific research I have been doing and the progress I have made since my freshman year.

Health & Safety

This investigation did not involve the manipulation of any animals; instead, it used images of rat retinas from the Hartnett Lab at the John A. Moran Eye Center of the University of Utah. The algorithm was developed at my home, not in the lab itself. Because the project consisted of writing code on a laptop to design the algorithm, the risk involved was minimal, and only minimal safety precautions were needed.

Mentor, Dr. Mary Elizabeth Hartnett.


Bibliography, references, and acknowledgements

I would like to thank Dr. Silke Becker for supplying the images and her assistance with the biology portion of this research. I give my thanks to Michael Simmons and Dr. Richard Gerkin for their assistance with statistical analysis and drawing conclusions. Finally, I would like to thank my mentor, Dr. Mary Elizabeth Hartnett, for her supervision and mentorship throughout this research process.

I contacted Dr. Hartnett to start researching in her lab. One issue in her lab was that the lab members were marking out the retinal flat mount images by hand. We decided that this was inefficient and wanted to create a faster automated method to do it.

The decision to create an automated machine learning based method was mine. I researched, selected, designed, and tested all the image processing filters and plugins I used in the algorithm I created. The process involving the rats used to create the images, which was not the focus of this research paper, was decided by the lab. The lab group designed the manual markings to be done by three other people independently to reduce bias.

The automated method to analyze the images and all its components, including the machine learning, connected component, and image processing algorithms, were selected and implemented by me.

The 10 images of rats that had undergone oxygen-induced retinopathy were obtained by Dr. Silke Becker. I recorded all the data and results from the algorithm I designed. The manual results were measured by other lab members to prevent bias.

The data analysis presented in this investigation was done by me with the help of Dr. Richard Gerkin, a professional statistician, and Michael Simmons, a lab member. I performed the F1 tests, the t-tests, and the normality tests. The ICC, regression, and Dice coefficients presented were calculated with the help of Dr. Richard Gerkin. The error analysis on the coding, computing, and machine learning algorithms was done solely by me. The error analysis on the data collected was done in conjunction with the other lab members.

The conclusions based on the data analysis presented in this investigation were drawn by me with the help of Dr. Richard Gerkin and Michael Simmons. The conclusions in this investigation primarily reflect the nature of the error in my method, so I was able to use my data analysis to conclude that the method I developed was accurate and could be used in place of the current manual methods. The final conclusions were verified by the lab group.

This research project was done as part of a paper that was submitted and provisionally accepted to Molecular Vision. The paper was written with the aforementioned people. However, the research I submitted here focuses on the design of the algorithm and not on the process involving the rats used to create the images that were analyzed in the paper submitted to Molecular Vision. The research I present here analyzes the performance of the algorithm and its errors compared to the manual methods in more depth.


B. Aliahmad, D. K. Kumar, and R. Jain. Automatic analysis of retinal vascular parameters for detection of diabetes in Indian patients with no retinopathy sign. International scholarly research notices, 2016, 2016.

Bio Ninja. E2 perception of stimuli, 2018. [Online; accessed September 27, 2018].

E. Ivanova, A. H. Toychiev, C. W. Yee, and B. T. Sagdullaev. Optimized protocol for retinal wholemount preparation for imaging and immunohistochemistry. Journal of visualized experiments: JoVE, (82), 2013.

H. E. Grossniklaus, S. J. Kang, and L. Berglin. Animal models of choroidal and retinal neovascularization. Progress in retinal and eye research, 29(6):500–519, 2010.

H. Wang, Z. Yang, Y. Jiang, J. Flannery, S. Hammond, T. Kafri, S. K. Vemuri, B. Jones, and M. E. Hartnett. Quantitative analyses of retinal vascular area and density after different methods to reduce VEGF in a rat model of retinopathy of prematurity. Investigative ophthalmology & visual science, 55(2):737–744, 2014.

I. Arganda-Carreras, V. Kaynig, C. Rueden, K. W. Eliceiri, J. Schindelin, A. Cardona, and H. Sebastian Seung. Trainable weka segmentation: a machine learning tool for microscopy pixel classification. Bioinformatics, 33(15):2424–2426, 2017.

J. D. Akula, T. L. Favazza, J. A. Mocko, I. Y. Benador, A. L. Asturias, M. S. Kleinman, R. M. Hansen, and A. B. Fulton. The anatomy of the rat eye with oxygen-induced retinopathy. Documenta ophthalmologica, 120(1):41–50, 2010.

J. M. Barnett, S. E. Yanni, and J. S. Penn. The development of the rat model of retinopathy of prematurity. Documenta ophthalmologica, 120(1):3–12, 2010.

J. S. Penn, M. M. Henry, and B. L. Tolman. Exposure to alternating hypoxia and hyperoxia causes severe proliferative retinopathy in the newborn rat. Pediatric research, 36(6):724, 1994.

M. E. Hartnett. Pathophysiology and mechanisms of severe retinopathy of prematurity. Ophthalmology, 122(1):200–210, 2015.

M. E. Hartnett. The effects of oxygen stresses on the development of features of severe retinopathy of prematurity: knowledge from the 50/10 OIR model. Documenta ophthalmologica, 120(1):25–39, 2010.

National Institutes of Health. The early treatment for retinopathy of prematurity study (ETROP), 2018. [Online; accessed September 9, 2018].