Positively Identifying Species Using Convolutional Neural Networks and Hypernetworks to Aid Wildlife Conservation Efforts

Summary

Positively identifying species, particularly endangered ones, has been a challenge in wildlife conservation efforts. Traditional methods require expert knowledge. Convolutional Neural Networks (CNNs) can identify images of species, but they are limited by the lack of training images for rare and endangered species. I have developed a solution using hypernetworks to overcome this challenge.

Initially, several images of common species are used to train CNNs that can identify one species each. A single image of each of the species is then paired with its corresponding CNN to train the hypernetwork.

The hypernetwork is given a single image of a rare species, not part of the training set, to generate a CNN which can identify other images of that rare species.

To test the effectiveness of the solution, two tests were conducted. The first involved a butterfly image dataset, and the CNNs produced by the hypernetwork had an accuracy of 70.5%. The second, using a much larger dog breeds dataset, resulted in an accuracy of 90%.

The test results show that this solution can be used to classify species from a very small image set. By combining images of footprints, horns, hair, and feces of various species for training, identification accuracy can be improved further. The solution can help curb poaching and illegal trafficking and support wildlife population and diversity analysis. It can also be deployed on mobile devices for remote, offline fieldwork.

Question / Proposal

Positively identifying species has been a challenge in wildlife conservation. Identifying species is important for curbing poaching, stopping illegal trafficking, monitoring wildlife populations, and analyzing diversity. Besides conservation, species identification could help the general public identify potentially valuable plants, possibly with important medicinal applications.

Species are traditionally identified with methods that take a long time to process, such as DNA analysis. Human experts can identify species quickly, but there are not enough of them to meet demand.

Convolutional Neural Networks (CNNs) have shown promise in classifying images with great accuracy. These can be applied to solve the aforementioned challenge by classifying images of plants and animals.

However, CNNs require large amounts of training data, which is not available for rare species, and rare species are among the most important in conservation efforts.

My project aims to solve the issue of a lack of training images using hypernetworks, to introduce new ways to use CNNs to identify species by using artifacts, and to enable offline use.

My hypotheses include:

  1. Convolutional Neural Networks are capable of identifying/classifying species from images of an organism or its artifacts (horns, feathers, footprints, feces, etc.)
  2. A hypernetwork, given a sufficient number of training examples in the form of neural networks and images (for common species), can identify common features and properties of different species, while also understanding the unique differentiating visible features of each species.

This application could be useful for scientists, conservationists, academia, and the general public.

Research

Initial stages of this project simply involved using traditional Convolutional Neural Networks (CNNs) to identify species. CNNs have been proven to work very well for image classification, as networks like Inception and various other entries of the ImageNet Challenge have shown.

CNNs have also been in use to identify images of species. Various apps implemented these for different applications. Many were geared towards only plant identification, while others could work with both plants and animals. There have also been scientific papers written on species identification using neural networks, but most were not well-rounded applications that could be deployed.

Altogether, all the currently available solutions using CNNs to identify species lacked two or three of the following features that are very important to solve the problem this project aims to address:

  1. Most of these applications were not geared towards wildlife conservation. They were targeted at non-conservationists, and their neural networks were trained on species relevant to that audience. They lacked the capability of identifying species that are important for wildlife conservation.
  2. The currently available applications did not work offline. Offline access is crucial because most identification for conservation is done in remote areas such as forests.
  3. While not being crucial for species identification, being able to use CNNs to identify not just images of organisms, but also their artifacts is a very useful feature. These 'artifacts' include feces and footprints.

This project includes all the aforementioned features.

A key issue with using CNNs to identify images of species is that they need many training samples to work properly. There are many rare species, important for research and conservation, of which very few images have ever been taken. For many, only one picture has ever been taken. Such a limited number of training samples renders methods like transfer learning ineffective. No currently available image-based species classification application can identify images of these species, because they all use traditional CNNs.

Techniques and architectures have been developed that let neural networks learn from little data (even one sample), but each of them generalizes poorly, lacks the required accuracy, cannot process high-resolution images, and/or is inefficient upon deployment.

A novel method of using hypernetworks was developed to solve this problem. Hypernetworks have been considered in the past, but have never been used to solve the aforementioned problem. These hypernetworks mostly had managerial roles - creating neural networks to help complete small parts of a larger task.

Method / Testing and Redesign

This project can be distinctly divided into three phases in chronological order:

Phase 1

Phase 1 of the project involved using traditional CNNs to classify images of species. In this case, an InceptionV3 classifier was modified and retrained on the iNaturalist dataset consisting of 437,513 training images for 8,142 species for 75 epochs. On the accompanying iNaturalist test dataset, the classifier had an accuracy of 86%.

The ability to download the neural network for offline use was added to an application. This application provided a UI to use the neural network.

Phase 2

The second step was to implement the ability to identify animals based on 'artifacts', such as footprints, feces, etc.

Images of footprints and feces were obtained from image search engines. Images were collected for eight different groups of animals, and the obtained data was split into training, validation, and testing datasets. Two separate InceptionV3-based CNNs (one for feces, the other for footprints) were trained on their respective sets of training data for 50 epochs. The feces-based network had an accuracy of 86% on its test dataset, and the footprint-based network had an accuracy of 73% on its test dataset.
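The splitting step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the write-up does not state the split ratios, so the 70/15/15 proportions, the fixed seed, and the file names are all assumptions.

```python
import random

def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle items and split them into train/validation/test lists.

    The fractions are assumed values; the write-up does not give them.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical file names standing in for downloaded footprint images.
footprints = [f"footprint_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(footprints)
print(len(train), len(val), len(test))
```

Splitting once, before any training, keeps the test images unseen by the network, so the reported test accuracy reflects images it was never trained on.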

Offline access for these networks was also added to the application.

Phase 3

After further investigation into the problem, it was found that many species lack a sufficient number of training examples. Most of these species were endemic and endangered - they were amongst the most important that have to be classified. For many of these important species, only one image has ever been taken of them. Attempting to get more data is futile, as the data simply does not exist for these rare species.

A novel method of using hypernetworks was conceived. Hypernetworks are neural networks that output the parameters of another neural network. In this case, the goal is for the hypernetwork to take one image of a rare species as input and output a CNN that can identify any other image of that rare species.

Just like any other neural network, training samples are required. Traditional CNNs are first trained on several images of common species. Each of these CNNs can identify one common species. Each CNN is paired with a single image of the species it can identify to form training samples, used to train the hypernetwork.
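The training setup described above can be illustrated with a deliberately tiny stand-in. Everything here is an assumption made for illustration and is not the project's architecture: 2-D feature vectors stand in for images, a single logistic unit stands in for each CNN, the hypernetwork is a linear map trained by gradient descent, and all names and numbers are hypothetical.

```python
import math

def hyper_forward(params, image):
    """Hypernetwork forward pass: image features -> classifier parameters."""
    matrix, bias = params
    return [sum(m * x for m, x in zip(row, image)) + b
            for row, b in zip(matrix, bias)]

def train_hypernetwork(pairs, n_features, n_outputs, lr=0.5, steps=200):
    """Fit the hypernetwork by gradient descent on the mean squared error
    between its output and the stored parameters of each trained classifier."""
    matrix = [[0.0] * n_features for _ in range(n_outputs)]
    bias = [0.0] * n_outputs
    for _ in range(steps):
        grad_m = [[0.0] * n_features for _ in range(n_outputs)]
        grad_b = [0.0] * n_outputs
        for image, target in pairs:
            pred = hyper_forward((matrix, bias), image)
            for i in range(n_outputs):
                err = (pred[i] - target[i]) / len(pairs)
                grad_b[i] += err
                for j in range(n_features):
                    grad_m[i][j] += err * image[j]
        for i in range(n_outputs):
            bias[i] -= lr * grad_b[i]
            for j in range(n_features):
                matrix[i][j] -= lr * grad_m[i][j]
    return matrix, bias

def classify(classifier_params, image):
    """Generated binary classifier: sigmoid(w . image + b)."""
    *weights, b = classifier_params
    z = sum(w * x for w, x in zip(weights, image)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Training samples: (one exemplar image, parameters [w1, w2, b] of the
# classifier already trained for that common species). Values are made up.
pairs = [
    ((1.0, 0.0), (4.0, 0.0, -2.0)),
    ((0.0, 1.0), (0.0, 4.0, -2.0)),
    ((-1.0, 0.0), (-4.0, 0.0, -2.0)),
    ((0.0, -1.0), (0.0, -4.0, -2.0)),
]
params = train_hypernetwork(pairs, n_features=2, n_outputs=3)

# One image of an unseen "rare species" yields a new classifier.
generated = hyper_forward(params, (0.6, 0.8))
same_species = classify(generated, (0.8, 0.6))
other_species = classify(generated, (-0.6, -0.8))
print(same_species > 0.5, other_species > 0.5)
```

The point of the sketch is the data flow: the hypernetwork never sees labels directly, only pairs of an exemplar image and an already-trained classifier, and at test time a single image is enough to produce a classifier for a species absent from training.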

Tests

Two tests were conducted. The first test considered the Leeds Butterfly Dataset which consisted of images of 10 different species. 8 of these were used for training and 2 for testing. On the two test cases, the hypernetwork had an average accuracy of 70.5%.

This merely suggested that the idea might work. To prove it conclusively, a second test using the Stanford Dogs dataset was conducted. This dataset had images of 110 dog breeds - 100 of which were used for training and 10 for testing. The CNNs produced by the hypernetwork had an average accuracy of 89.9% on the 10 test cases.

Results

The key results of this project are from the experiments for phase 3 to show that hypernetworks are reliable when training data is insufficient.

In the first experiment, where butterfly species were considered, the two species used for testing were Vanessa atalanta and Vanessa cardui. The CNNs produced by the hypernetwork were tested on images of Vanessa atalanta, Vanessa cardui, various other butterfly species, random objects, and noise. When the hypernetwork was given one image of Vanessa atalanta, the outputted CNN correctly identified images of Vanessa atalanta 84% of the time. This is called the 'positive accuracy'. The CNN could correctly identify that the other images were not of Vanessa atalanta 62% of the time. This is called the 'negative accuracy'.

An average of the two scores is taken, called the 'average accuracy', or simply accuracy, which was 73% for Vanessa atalanta. The test involving Vanessa cardui resulted in an average accuracy of 68%. The overall accuracy of the CNNs produced by the hypernetwork for this test was 70.5%.
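The arithmetic behind these figures can be checked with a short sketch. The function names are illustrative; the numbers are the ones reported above.

```python
def average_accuracy(positive_accuracy, negative_accuracy):
    """Mean of the positive and negative accuracy, in percent."""
    return (positive_accuracy + negative_accuracy) / 2

def overall_accuracy(per_species_accuracies):
    """Mean of the per-species average accuracies, in percent."""
    return sum(per_species_accuracies) / len(per_species_accuracies)

atalanta = average_accuracy(84, 62)         # Vanessa atalanta
overall = overall_accuracy([atalanta, 68])  # 68 is the Vanessa cardui result
print(atalanta, overall)  # 73.0 70.5
```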

The second experiment was far more comprehensive and used the same accuracy metrics and testing methods as the first test, but involved more test cases. It also involved far more training examples.

The increased number of training examples drastically increased the accuracy of the CNNs generated by the hypernetwork. The hypernetwork had an average accuracy of 89.9% on the 10 dog breeds used for testing, after being given only one image of each. This is significantly higher than the 70.5% accuracy from the first experiment.

The 10 breeds used for testing were Brabancon Griffon, Pembroke, Cardigan, Toy Poodle, Miniature Poodle, Standard Poodle, Mexican Hairless, Dingo, Dhole, and African Hunting Dog. The hypernetwork was not exposed to these breeds before testing. The 10 CNNs produced by the hypernetwork, one for each dog breed, were tested on a common test set containing 1,600 images under the following labels:

  • Images 0 - 11: Noise, Images of random objects and animals
  • Images 12 - 153: Images of Brabancon Griffon
  • Images 153 - 334: Images of Pembroke
  • Images 334 - 500: Images of Cardigan
  • Images 500 - 651: Images of Toy Poodle
  • Images 651 - 806: Images of Miniature Poodle
  • Images 806 - 965: Images of Standard Poodle
  • Images 965 - 1,120: Images of Mexican Hairless
  • Images 1,120 - 1,276: Images of Dingo
  • Images 1,276 - 1,426: Images of Dhole
  • Images 1,426 - 1,600: Images of African Hunting Dog 
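Given a labelled common test set like this one, the positive and negative accuracy of each generated CNN can be computed by thresholding its per-image output. The sketch below is an illustration under stated assumptions: the index ranges are treated as half-open (start included, stop excluded), which the list does not state; an output exactly at the threshold counts as a positive prediction; and the scores are made-up values, not results from the project.

```python
def positive_negative_accuracy(scores, positive_range, threshold=0.5):
    """Return (positive_accuracy, negative_accuracy) as fractions.

    scores: one classifier output per test image, in test-set order.
    positive_range: assumed half-open (start, stop) index range holding
        the images of the species the classifier was generated for.
    """
    start, stop = positive_range
    positives = scores[start:stop]
    negatives = scores[:start] + scores[stop:]
    pos_correct = sum(s >= threshold for s in positives)
    neg_correct = sum(s < threshold for s in negatives)
    return pos_correct / len(positives), neg_correct / len(negatives)

# Made-up outputs for an eight-image test set; images 2 to 4 are the target.
scores = [0.1, 0.7, 0.9, 0.4, 0.8, 0.2, 0.3, 0.6]
pos_acc, neg_acc = positive_negative_accuracy(scores, (2, 5))
print(pos_acc, neg_acc)
```

Reporting the two fractions separately matters here because the target breed is a small share of the 1,600 images, so a classifier that rejects everything would still score well on a single pooled accuracy.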

The following graphs are the outputs of the hypernetwork-produced CNNs for each training example. The threshold for the hypernetwork-produced CNNs was set at 0.5, represented by the red dotted line. For comparison, traditional CNNs were each trained on only one image and a few negatives. The threshold line for the traditional CNNs was set at 0.3 because their outputs were very low.

The comparative accuracies of the CNNs produced by the hypernetwork and the traditionally trained CNNs are shown by the bar graph below.

Conclusion

The various tests performed have shown that the solutions proposed are valid:

  • Convolutional Neural Networks are capable of identifying images of species with reasonable accuracy. They are also able to identify images of footprints and feces.
  • Hypernetworks are capable of species identification from just one training sample, and are significantly better than traditionally trained CNNs. Their consistent reliability (> 80% accuracy) on all test cases in the second hypernetwork experiment has proved this.

The primary drawback of hypernetworks is that they take a very long time to train - nearly a week on a home PC. However, this problem is likely to be solved with cloud computing and does not carry over upon deployment.

Future Work

Increased Training Time and Computational Power

Now that hypernetworks have been shown to reliably work, they need to be deployed in the application for use. A server must be used for training hypernetworks and generating CNNs. The outputted CNNs could then be stored offline.

The hypernetworks will thus be trained on higher performance machines than the ones used in the experiments. The traditional CNNs can also benefit from more time and computational power for training.

NLP-based identification

In many cases, it is not possible to take an image of an organism, and only a textual description can be obtained. In this case, NLP can be used to process the description and help guide a user to the correct species.

Depth Mapping

Many modern smartphone cameras offer 3D depth mapping. This entire additional layer of data can be used with an image to get much higher accuracies and gain more insights.

Identification with Sketches

As previously mentioned, it is not always possible to take an image of an organism. In this case, neural networks can be trained to identify sketches as well as images. Hypernetworks can also learn to generalize to sketches.

 

About me

For the past few years, my main interests have been Artificial Intelligence, astrophysics, and space science & exploration. The prospects of truly intelligent machines and venturing out into the cosmos are truly exciting. These are things that represent a key stepping stone in our species' future, and I think it is incredible to live in such an era.

Besides these, I have some interest in various other scientific disciplines, as nearly everything science has to offer is fascinating.

My interest in robotics was sparked by Lego Mindstorms in second grade, which brought me into computer science. This eventually brought me into machine learning, which is the area of computer science that I am most interested in.

I've also been concerned, like most people, with the environmental damage our species has caused. This has led me to apply my interest in Artificial Intelligence to help solve some of these problems. I hope that through the work I've done in this project I can make a small contribution to wildlife conservation efforts. The efforts our species is taking in wildlife conservation will help reverse environmental damage.

Some of the people who I admire very much are Carl Sagan, Alan Turing, and Richard Feynman.

For college, I wish to study Artificial Intelligence and attempt a double major with Astrophysics.

Winning this prize would aid me in being able to put this application into use to solve the problems that I am attempting to tackle. It would be a terrific opportunity and experience.

Health & Safety

This project does not have a physical component for experimentation - all experimentation was done virtually. Therefore, no specific health & safety requirements were necessary.

Bibliography, references, and acknowledgements

  • Dr. Sanjay Molur of Zoo Outreach Organization for exposing the conservation problem and subsequent feedback.
  • The iNaturalist dataset
  • C. Szegedy et al., "Going deeper with convolutions", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, pp. 1-9. doi: 10.1109/CVPR.2015.7298594
  • K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", arXiv:1409.1556
  • J. Wang, K. Markert, and M. Everingham, "Learning Models for Object Recognition from Natural Language Descriptions", in Proceedings of the 20th British Machine Vision Conference (BMVC 2009)