Revolutionizing Great Ape Behavior Analysis: Visual Deep Learning Meets Primatology

Introduction
Background and Motivation
Objectives of the Project
Introduction to the Dataset
Deep Metric Learning for Behavior Classification
Model Architecture
Results and Analysis
Human in the Loop System
Active Learning Techniques
Selection of Data Samples
Visualizing the Learning Space
Conclusion and Future Work

Article

1️⃣ Introduction

Hi everyone! My name is Otto, and I'm working on my PhD thesis titled "Visual Deep Learning Meets Primatology: Understanding Great Apes' Behaviors Computationally." In this project, I explore the use of deep metric learning and active learning techniques to analyze video footage of great ape behaviors. With many great ape species endangered, efficient and scalable methods for behavioral analysis are crucial. Manual analysis by field experts is time-consuming, and time is running short, so artificial intelligence (AI) offers a way to automate the process.

2️⃣ Background and Motivation

The motivation behind my research project stems from the fact that several great ape species are either endangered or critically endangered. To support conservation efforts and gain a deeper understanding of their behaviors, it is essential to analyze large-scale behavioral data. Traditionally, this analysis has been carried out manually by conservation biologists, ecologists, and primatologists, consuming substantial time and effort. However, given the projected decline of great ape species in the next few decades, there is an urgent need for a more efficient and automated approach. This is where AI and deep learning come into play, offering the potential to analyze behaviors at scale and speed.

3️⃣ Objectives of the Project

In this project, we have two main objectives. The first is to build a classifier for great ape behaviors in videos using a deep metric learning approach. We want to train a network that embeds inputs into a metric space in such a way that similar inputs lie close to one another and dissimilar inputs lie far apart. This will enable us to classify behaviors accurately. The second objective is to prototype a human-in-the-loop system using active learning techniques. By incorporating human knowledge into the learning process, we aim to optimize model performance and enhance the overall quality of behavioral analysis.

4️⃣ Introduction to the Dataset

To conduct our research, we utilize the Pan-African dataset, which consists of approximately 20,000 videos of great apes. Luckily, we have access to a subset of 500 videos that come with fine-grained annotations, including information on location, species, and behavior. These annotations provide us with a foundation to begin our analysis. To make the behaviors more concrete, we extract still images from the videos that depict various actions performed by the apes. These actions include camera interaction, climbing up and down, hanging, running, sitting, standing, and walking.

5️⃣ Deep Metric Learning for Behavior Classification

Deep metric learning serves as the backbone of our behavior classification approach. The architecture we employ consists of two streams: a spatial stream for RGB images and a temporal stream for optical flow images. We use the ResNet18 CNN architecture for feature extraction and an LSTM for capturing long-range temporal dependencies. The outputs of both streams are concatenated and passed through fully connected layers, resulting in a 128-dimensional embedding space. This embedding space is optimized during training, ensuring that similar behaviors are represented closely together.

6️⃣ Model Architecture

In our model architecture, we adopt a triplet approach for deep metric learning. Each triplet consists of an anchor, a positive example (with the same class label as the anchor), and a negative example (with a different class label). The learning objective is to minimize the distance between the anchor and the positive example and maximize the distance between the anchor and the negative example. We experiment with different techniques, such as semi-hard negative mining and hard negative mining, to select informative triplets during training. However, random sampling of triplets yields the best performance for our specific task.
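
To make the triplet objective concrete, here is a minimal sketch of a margin-based triplet loss on toy 2-D vectors. It is an illustration only, not the project's training code: the margin value of 0.2, the Euclidean distance, and the function name `triplet_loss` are assumptions, since the talk does not state which variant was used.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (margin is an assumed hyperparameter).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_ap = math.dist(anchor, positive)  # anchor-to-positive distance
    d_an = math.dist(anchor, negative)  # anchor-to-negative distance
    return max(0.0, d_ap - d_an + margin)

# Toy 2-D embeddings; the model described above uses 128 dimensions.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # same behavior as the anchor, distance 1
negative = [3.0, 4.0]   # different behavior, distance 5

easy = triplet_loss(anchor, positive, negative)  # negative already far away
hard = triplet_loss(anchor, negative, positive)  # roles swapped: "negative" is closer
```

In the first call the triplet is already well separated and contributes no loss; in the second, the swapped roles produce a positive loss that training would push down by moving the embeddings.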

7️⃣ Results and Analysis

After training the model with various configurations, we observed that semi-hard negative mining gave a brief increase in performance early on but plateaued later. Hard negative mining, on the other hand, did not perform as well. Surprisingly, random sampling of triplets proved to be the most effective. Although we acknowledge the need for a more thorough analysis of triplet selection strategies, these preliminary results highlight the potential of deep metric learning in behavior recognition.
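
For readers unfamiliar with the three selection strategies compared above, the sketch below shows one common way to define them. The definitions are the usual textbook ones and are assumed here rather than taken from the project: "hard" picks the negative closest to the anchor, and "semi-hard" picks a negative that is farther than the positive but still within an assumed margin. The helper name `pick_negative` is hypothetical.

```python
import math
import random

def pick_negative(anchor, positive, negatives, strategy, margin=0.2, rng=random):
    """Choose one negative embedding for an (anchor, positive) pair.

    random    : any negative, chosen uniformly
    hard      : the negative closest to the anchor
    semi-hard : a negative with d(a, p) < d(a, n) < d(a, p) + margin;
                falls back to a random negative if none qualifies
    """
    if strategy not in ("random", "hard", "semi-hard"):
        raise ValueError(f"unknown strategy: {strategy}")
    d_ap = math.dist(anchor, positive)
    if strategy == "hard":
        return min(negatives, key=lambda n: math.dist(anchor, n))
    if strategy == "semi-hard":
        band = [n for n in negatives if d_ap < math.dist(anchor, n) < d_ap + margin]
        if band:
            return rng.choice(band)
    return rng.choice(negatives)

anchor, positive = [0.0, 0.0], [1.0, 0.0]           # d(a, p) = 1.0
negatives = [[0.5, 0.0], [1.1, 0.0], [5.0, 0.0]]    # hard, semi-hard, easy

hard_pick = pick_negative(anchor, positive, negatives, "hard")
semi_pick = pick_negative(anchor, positive, negatives, "semi-hard")
rand_pick = pick_negative(anchor, positive, negatives, "random")
```

Random sampling ignores the current state of the embedding space entirely, which makes it the cheapest of the three to compute.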

8️⃣ Human in the Loop System

To simulate a human-in-the-loop system, we employ active learning techniques. Active learning aims to select data samples from an unlabeled pool that would provide the most information if labeled. In our case, we partition the existing training set into an active training set and an active holdout set. The active holdout set serves as a pool of unlabeled data, and by applying supervised and unsupervised models to the embedded representations, we rank the samples based on uncertainty. We then select the most uncertain samples for annotation.
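
The simulation setup can be sketched roughly as follows. This is a hypothetical outline, not the project's code: the 50/50 split, the sample dictionaries, and the helper names `split_active` and `annotate` are all assumptions made for illustration. The "human" is simulated by revealing labels that were already known but held back.

```python
import random

def split_active(samples, holdout_fraction=0.5, seed=0):
    """Shuffle labeled samples and split them into an active training set
    and an active holdout set that is treated as unlabeled."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]

def annotate(active_train, holdout, chosen_ids):
    """Simulate the annotator: move the chosen samples, with their
    held-back labels, from the holdout pool into the training set."""
    chosen = set(chosen_ids)
    moved = [s for s in holdout if s["id"] in chosen]
    remaining = [s for s in holdout if s["id"] not in chosen]
    return active_train + moved, remaining

# Ten toy clips with made-up labels.
samples = [{"id": f"clip_{i}", "label": "sitting" if i % 2 else "walking"}
           for i in range(10)]

active_train, holdout = split_active(samples)
query = [holdout[0]["id"]]  # in practice: the most uncertain samples
active_train, holdout = annotate(active_train, holdout, query)
```

After each round of annotation the model would be retrained on the enlarged active training set and the remaining holdout samples re-ranked.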

9️⃣ Active Learning Techniques

In our exploration of active learning, we compare two sampling strategies: uncertainty sampling and diversity sampling. Uncertainty sampling involves selecting the most uncertain samples from the active holdout set, whereas diversity sampling focuses on selecting samples that span across different classes. We experimented with different ratios of additional data and observed that uncertainty sampling performs better with a smaller labeling budget, while diversity sampling shines when a larger labeling budget is available. These findings underscore the importance of selecting the right active learning strategy based on the specific requirements of the project.
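
The difference between the two strategies can be illustrated with a small sketch. It assumes each pool sample already carries an uncertainty score and a predicted class; the round-robin scheme used for diversity sampling is one plausible reading of "spanning different classes" and is an assumption, as are the function names and the toy scores.

```python
from collections import defaultdict

def uncertainty_sampling(pool, budget):
    """Pick the `budget` samples with the highest uncertainty score."""
    ranked = sorted(pool, key=lambda s: s["uncertainty"], reverse=True)
    return [s["id"] for s in ranked[:budget]]

def diversity_sampling(pool, budget):
    """Round-robin over predicted classes, taking the most uncertain
    remaining sample from each class in turn."""
    by_class = defaultdict(list)
    for s in sorted(pool, key=lambda s: s["uncertainty"], reverse=True):
        by_class[s["predicted"]].append(s)
    picked = []
    while len(picked) < budget and any(by_class.values()):
        for cls in list(by_class):
            if by_class[cls] and len(picked) < budget:
                picked.append(by_class[cls].pop(0)["id"])
    return picked

# Toy pool with made-up predictions and uncertainty scores.
pool = [
    {"id": "clip_a", "predicted": "sitting", "uncertainty": 0.9},
    {"id": "clip_b", "predicted": "sitting", "uncertainty": 0.8},
    {"id": "clip_c", "predicted": "walking", "uncertainty": 0.4},
    {"id": "clip_d", "predicted": "hanging", "uncertainty": 0.3},
]

by_uncertainty = uncertainty_sampling(pool, budget=3)
by_diversity = diversity_sampling(pool, budget=3)
```

With a budget of three, uncertainty sampling spends two of its picks on the same predicted class, while the diversity variant covers all three classes.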

🔟 Selection of Data Samples

The key challenge in active learning is the selection of data samples that would contribute the most to model performance if labeled. By projecting the active holdout set into the metric space learned by our model, we can generate prediction confidences using supervised and unsupervised approaches. We then apply an entropy uncertainty measure to rank the unlabeled data points. Both uncertainty and diversity sampling strategies were evaluated, and their performance varied based on the proportion of additional data. It is important to note that further research and analysis are required to fine-tune these strategies and better understand their strengths and limitations.
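
As a rough illustration of entropy-based ranking, the sketch below derives class probabilities from a k-nearest-neighbor vote in the embedding space and scores each unlabeled sample by the Shannon entropy of those probabilities. The choice of k-NN as the supervised model, the value k=3, and all names and toy embeddings are assumptions; the talk does not specify which models produced the confidences.

```python
import math
from collections import Counter

def knn_confidences(query, labeled, k=3):
    """Class probabilities from the labels of the k nearest labeled embeddings."""
    nearest = sorted(labeled, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / len(nearest) for label, count in votes.items()}

def entropy(probs):
    """Shannon entropy in bits; 0 means a fully confident prediction."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def rank_by_uncertainty(pool, labeled, k=3):
    """Return pool ids sorted from most to least uncertain."""
    scores = {sid: entropy(knn_confidences(emb, labeled, k))
              for sid, emb in pool.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy 2-D embeddings standing in for the learned metric space.
labeled = [
    ([0.0, 0.0], "sitting"), ([0.1, 0.0], "sitting"), ([0.0, 0.1], "sitting"),
    ([1.0, 1.0], "walking"), ([1.1, 1.0], "walking"), ([1.0, 1.1], "walking"),
]
pool = {
    "clip_x": [0.05, 0.05],  # deep inside the "sitting" cluster
    "clip_y": [0.55, 0.50],  # between the two clusters
}

ranking = rank_by_uncertainty(pool, labeled)
```

The sample lying between the clusters receives a mixed vote and therefore a higher entropy, so it is ranked first for annotation.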

1️⃣1️⃣ Visualizing the Learning Space

One of the intriguing aspects of deep metric learning is the ability to visualize the learned metric space. Using techniques like t-SNE, we can plot the training samples in two dimensions and gain insights into the clusters formed by different behaviors. The visualizations provide an intuitive representation of the behaviors and their relationships. For instance, we can observe dense clusters for camera interaction and distinct clusters for behaviors related to hanging, namely climbing up, climbing down, and hanging itself. These visualizations serve as a valuable tool for conveying results to experts in the field and aid in interpreting the model's performance beyond traditional metrics.

1️⃣2️⃣ Conclusion and Future Work

In conclusion, our project demonstrates the viability of deep metric learning for behavior recognition in great apes. The visualizations offer an interesting perspective and showcase the potential benefits of this approach in communicating results to experts from different domains. However, there is still a need for more rigorous analysis and improvement of the current streams and techniques used. Additionally, the current research focuses on a limited set of core behaviors, and future work will expand this repertoire and refine the models further. This project opens up new avenues for incorporating AI into the field of conservation biology and contributes to the ongoing efforts in understanding and protecting great ape species.

Highlights

Deep metric learning and active learning techniques are revolutionizing the analysis of great ape behaviors.
The project aims to automate the behavioral analysis of endangered great apes using AI and deep learning.
The Pan-African dataset provides a valuable resource for studying great ape behaviors, with fine-grained annotations.
Deep metric learning enables us to embed inputs onto a metric space, facilitating accurate behavior classification.
Active learning techniques, such as uncertainty and diversity sampling, enhance the training process by selecting informative data samples.
The visualization of the learned metric space offers insights into the clustering of behaviors, aiding interpretation.

❓FAQ

Q: Did you consider other dimensionality reduction techniques?
A: Although our current research relies on t-SNE for visualizing the learning space, we acknowledge the limitations of this technique. In future work, we plan to explore alternative dimensionality reduction techniques to obtain more robust visualizations.

Q: Are you planning to incorporate other modalities, such as audio, in your analysis?
A: While we have not incorporated audio data in the current project, we are aware of its potential importance in action recognition. We will consider incorporating audio in future research to improve accuracy and build a more comprehensive understanding of great apes' behaviors.