Selfee, self-supervised features extraction of animal behaviors

Jia, Yinjun; Li, Shuaishuai; Guo, Xuan; Lei, Bo; Hu, Junqiang; Xu, Xiaohong; Zhang, Wei

doi:10.7554/elife.76218

Cited by 19 publications

(19 citation statements)

References 86 publications

(132 reference statements)


Section: Methods (mentioning)

confidence: 99%

“…The model can then be fine-tuned with small amounts of training data to be optimized for downstream tasks. Recently, contrastive learning has been applied for feature extraction from animal videos by Jia et al [25], by performing contrastive learning on the frame images directly in similar fashion to existing work in computer vision. Contrastive learning's goal is to learn representations of data such that similar datapoints are close to each other, while dissimilar ones are far apart, without the need for labels [24].…”

Section: Contrastive Learning (mentioning)

confidence: 99%


ConstrastivePose: A contrastive learning approach for self-supervised feature engineering for pose estimation and behavorial classification of interacting animals

Zhou, Cheah, Chin, et al. 2022

Preprint


In recent years, supervised machine learning models trained on videos of animals with pose estimation data and behavior labels have been used for automated behavior classification. Applications include, for example, automated detection of neurological diseases in animal models. However, there are two problems with these supervised learning models. First, such models require a large amount of labeled data but the labeling of behaviors frame by frame is a laborious manual process that is not easily scalable. Second, such methods rely on handcrafted features obtained from pose estimation data that are usually designed empirically. In this paper, we propose to overcome these two problems using contrastive learning for self-supervised feature engineering on pose estimation data. Our approach allows the use of unlabeled videos to learn feature representations and reduce the need for handcrafting of higher-level features from pose positions. We show that this approach to feature representation can achieve better classification performance compared to handcrafted features alone, and that the performance improvement is due to contrastive learning on unlabeled data rather than the neural network architecture.
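The contrastive objective described in the citation statement above (similar datapoints close together, dissimilar ones far apart, no labels) can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch under stated assumptions, not the objective used by Selfee or ConstrastivePose: the function name `info_nce_loss`, the `temperature` value, and the use of cosine similarity are illustrative choices that do not come from the source.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """Generic InfoNCE-style contrastive loss (illustrative, hypothetical helper).

    z_a, z_b: arrays of shape (N, D). Row i of z_a and row i of z_b are
    assumed to be embeddings of two views of the same sample (a positive
    pair); every other row in z_b acts as a negative for row i of z_a.
    """
    # L2-normalize so the dot product below is a cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarities, scaled by the temperature: shape (N, N).
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
# Second "view": the same embeddings with a small perturbation.
views = anchors + 0.01 * rng.normal(size=(8, 16))

loss_matched = info_nce_loss(anchors, views)
# Reversing the rows pairs every anchor with an unrelated embedding.
loss_mismatched = info_nce_loss(anchors, views[::-1])
```

With correctly matched pairs the loss is low; pairing each anchor with an unrelated embedding raises it, which is the behavior the citation statement describes.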


“…This section provides a high-level overview of animal behavior classification frameworks for small laboratory animals. We outline a general taxonomy that organizes methods as supervised or unsupervised at a coarse level, and with varying degrees of supervision

[Figure: Taxonomy for Animal Behavior Classification]
Supervised Classification:
  Hand-crafted Features, Behavior Labels: [137], [33], [72], [20], [47], [79], [60], [101], [35], [45]
  Behavior Labels: [91], [160], [63]
  Hand-crafted Features, Pose and Behavior Labels, PE: [139], [145], [146], [4], [94], [121]
  Pose and Behavior Labels, PE: [179]
  Optical Flow, Hand-crafted Features, Behavior Labels: [161]
  Residual Learning, Optical Flow, Behavior Labels: [14]
  Residual Learning, Pose and Behavior Labels, PE: [178]
  Residual Learning, Optical Flow, Behavior Labels: [105]
Unsupervised Classification:
  Hand-crafted Features, Pose Labels, PE: [61]
  Fully Unsupervised: [144], [11], [172], [8], [16], [73]

Fig. 8.…”

Section: Taxonomy For Animal Behavior Classification (mentioning)

confidence: 99%

CNN-Based Action Recognition and Pose Estimation for Classifying Animal Behavior from Videos: A Survey

Perez, Toler-Franklin 2023

Preprint


Classifying the behavior of humans or animals from videos is important in biomedical fields for understanding brain function and response to stimuli. Action recognition, classifying activities performed by one or more subjects in a trimmed video, forms the basis of many of these techniques. Deep learning models for human action recognition have progressed significantly over the last decade. Recently, there is an increased interest in research that incorporates deep learning-based action recognition for animal behavior classification. However, human action recognition methods are more developed. This survey presents an overview of human action recognition and pose estimation methods that are based on convolutional neural network (CNN) architectures and have been adapted for animal behavior classification in neuroscience. Pose estimation, estimating joint positions from an image frame, is included because it is often applied before classifying animal behavior. First, we provide foundational information on algorithms that learn spatiotemporal features through 2D, two-stream, and 3D CNNs. We explore motivating factors that determine optimizers, loss functions and training procedures, and compare their performance on benchmark datasets. Next, we review animal behavior frameworks that use or build upon these methods, organized by the level of supervision they require. Our discussion is uniquely focused on the technical evolution of the underlying CNN models and their architectural adaptations (which we illustrate), rather than their usability in a neuroscience lab. We conclude by discussing open research problems, and possible research directions. Our survey is designed to be a resource for researchers developing fully unsupervised animal behavior classification systems of which there are only a few examples in the literature.

CCS Concepts: • Computing methodologies → Neural networks.


“…This approach has the benefit of not requiring the experimenter to specify rules for deriving output labels for training, at the cost of potentially making the process more sensitive to noise and occlusions (Hausmann et al, 2021; Luxem et al, 2022). Selfee (Jia et al, 2022) and BehaveNet (Batty et al, 2019), unlike VAME and DBM, operate directly on snippets of video rather than pose data from DLC/SLEAP, employing autoencoder-style training directly on video data for nonlinear dimensionality reduction. Selfee operates on short 3-frame snippets of raw video, has been used in mice to identify behaviors such as social nose contact and allogrooming in open field tests, and places its main emphasis on using the extracted feature space for a variety of downstream analyses (Jia et al, 2022).…”

Section: Machine Learning Approaches For Emotional Behavioral Analysis (mentioning)

confidence: 99%

“…Selfee (Jia et al, 2022) and BehaveNet (Batty et al, 2019), unlike VAME and DBM, operate directly on snippets of video rather than pose data from DLC/SLEAP, employing autoencoder-style training directly on video data for nonlinear dimensionality reduction. Selfee operates on short 3-frame snippets of raw video, has been used in mice to identify behaviors such as social nose contact and allogrooming in open field tests, and places its main emphasis on using the extracted feature space for a variety of downstream analyses (Jia et al, 2022). BehaveNet, unlike the other self-supervised methods, performs feature extraction on individual frames rather than sequences of frames, and does not consider the temporal structure of the data until the discretization step, which uses an autoregressive hidden Markov model (Batty et al, 2019).…”

Section: Machine Learning Approaches For Emotional Behavioral Analysis (mentioning)

confidence: 99%

Using deep learning to study emotional behavior in rodent models

Kuo, Denman, Beacher, et al. 2022

Front. Behav. Neurosci.


Quantifying emotional aspects of animal behavior (e.g., anxiety, social interactions, reward, and stress responses) is a major focus of neuroscience research. Because manual scoring of emotion-related behaviors is time-consuming and subjective, classical methods rely on easily quantified measures such as lever pressing or time spent in different zones of an apparatus (e.g., open vs. closed arms of an elevated plus maze). Recent advancements have made it easier to extract pose information from videos, and multiple approaches for extracting nuanced information about behavioral states from pose estimation data have been proposed. These include supervised, unsupervised, and self-supervised approaches, employing a variety of different model types. Representations of behavioral states derived from these methods can be correlated with recordings of neural activity to increase the scope of connections that can be drawn between the brain and behavior. In this mini review, we will discuss how deep learning techniques can be used in behavioral experiments and how different model architectures and training paradigms influence the type of representation that can be obtained.
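The citation statements above describe Selfee as operating on short 3-frame snippets of raw video. As a rough illustration of what building such snippets could look like, here is a minimal sketch. It assumes a grayscale video held in memory as a `(T, H, W)` NumPy array; the function name `three_frame_snippets` and the `stride` parameter are hypothetical and are not taken from the Selfee code or paper.

```python
import numpy as np


def three_frame_snippets(video, stride=1):
    """Cut a video into overlapping windows of 3 consecutive frames.

    Hypothetical helper for illustration only.
    video: array of shape (T, H, W), assumed grayscale.
    Returns an array of shape (N, 3, H, W); snippet i starts at frame
    i * stride.
    """
    if video.ndim != 3 or video.shape[0] < 3:
        raise ValueError("expected a (T, H, W) array with at least 3 frames")

    # sliding_window_view appends the window axis last: (T - 2, H, W, 3).
    windows = np.lib.stride_tricks.sliding_window_view(video, 3, axis=0)

    # Move the window axis next to the snippet index: (T - 2, 3, H, W).
    windows = np.moveaxis(windows, -1, 1)

    # Keep every `stride`-th snippet.
    return windows[::stride]


# Toy "video": 5 frames of 2x2 pixels with distinct values per frame.
toy_video = np.arange(5 * 2 * 2).reshape(5, 2, 2)
snippets = three_frame_snippets(toy_video)
```

Each snippet could then be passed to a feature extractor as one input; how the three frames are actually combined inside Selfee is not stated in the text above.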
