Motivation: Innovative microfluidic systems carry the promise to greatly facilitate spatio-temporal analysis of single cells under well-defined environmental conditions, allowing novel insights into population heterogeneity and opening new opportunities for fundamental and applied biotechnology. Microfluidics experiments, however, are accompanied by vast amounts of data, such as time series of microscopic images, for which manual evaluation is infeasible due to the sheer number of samples. While classical image processing technologies do not lead to satisfactory results in this domain, modern deep learning technologies such as convolutional networks can be sufficiently versatile for diverse tasks, including automatic cell counting as well as the extraction of critical parameters, such as growth rate. However, for successful training, current supervised deep learning requires label information, such as the number or positions of cells for each image in a series; obtaining these annotations is very costly in this setting.
Results: We propose a novel machine learning architecture together with a specialized training procedure, which allows us to infuse a deep neural network with human-powered abstraction on the level of data, leading to a high-performing regression model that requires only a very small amount of labeled data. Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
Availability: The project is cross-platform, open-source and free (MIT licensed) software. We make the source code available at https://github.com/dstallmann/cell_cultivation_analysis; the data set is available at https://pub.uni-bielefeld.de/record/2945513.
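To make the Results above more concrete, the following is a minimal sketch of a per-sample objective that a VAE-style model with a count regressor might minimize when most images are unlabeled. It is not taken from the authors' code: the function names, the weights `beta` and `gamma`, and the use of mean squared error are all assumptions made for illustration. The key point it shows is that reconstruction and latent regularization apply to every image, while the regression term applies only where a cell count is known.

```python
import math
from typing import Optional, Sequence


def kl_to_standard_normal(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """Closed-form KL divergence KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two flattened images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sample_loss(
    image: Sequence[float],
    reconstruction: Sequence[float],
    mu: Sequence[float],
    log_var: Sequence[float],
    predicted_count: float,
    true_count: Optional[float] = None,  # None for unlabeled images
    beta: float = 1.0,   # hypothetical weight of the KL term
    gamma: float = 1.0,  # hypothetical weight of the regression term
) -> float:
    """Reconstruction + KL for every image; regression only if a label exists."""
    loss = mse(image, reconstruction) + beta * kl_to_standard_normal(mu, log_var)
    if true_count is not None:
        loss += gamma * (predicted_count - true_count) ** 2
    return loss
```

In such a setup, synthetic images would always carry a `true_count`, while most natural images would pass `None`, so both kinds of data shape the shared latent representation but only labeled samples drive the regressor.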
Transfer learning schemes based on deep networks that have been trained on huge image corpora offer state-of-the-art technologies in computer vision. Here, supervised and semi-supervised approaches constitute efficient technologies that work well with comparably small data sets. Yet, such approaches are currently restricted to application domains where suitable deep network models are readily available. In this contribution, we address an important application area in biotechnology, the automatic analysis of CHO-K1 suspension growth in microfluidic single-cell cultivation, where data characteristics are very dissimilar to existing domains and trained deep networks cannot easily be adapted by classical transfer learning. In this domain, often few to no labels exist and annotations are costly. We propose a novel transfer learning scheme that extends a recently introduced Twin-VAE architecture, which is trained on realistic and synthetic data, and we adapt its specialized training procedure to the transfer learning setting. The strategy incorporates a simultaneous retraining on natural and synthetic data, using an invariant shared representation as well as suitable target variables, while learning to handle unseen data from a different microscopy technology. We show that this variation of our Twin-VAE architecture outperforms both state-of-the-art transfer learning methodology in image processing and classical image processing technologies; this advantage persists even with strongly shortened training times, and the approach leads to satisfactory results in this domain. The source code is available at https://github.com/dstallmann/transfer_learning_twinvae; it works cross-platform and is open-source and free (MIT licensed) software. We make the data sets available at https://pub.uni-bielefeld.de/record/2960030.
Novel neural network models that can handle complex tasks with fewer examples than before are being developed for a wide range of applications. In some fields, even the creation of a few labels is laborious and impractical, especially when each label takes more than a few seconds to produce. In the biotechnological domain, cell cultivation experiments are usually done by varying the circumstances of the experiments, so that hand-labeled data from one experiment can seldom be reused in others. In this field, exact cell counts are required for analysis, and even by modern standards, semi-supervised models typically need hundreds of labels to achieve acceptable accuracy on this task, while classical image processing yields unsatisfactory results. We investigate whether an unsupervised learning scheme can accomplish this task without manual labeling of the given data. We present a VAE-based Siamese architecture that is expanded in a cyclic fashion to allow the use of labeled synthetic data. In particular, we focus on generating pseudo-natural images from synthetic images whose target variable is known, thereby mimicking the availability of labeled natural data. We show that this learning scheme provides reliable estimates for multiple microscopy technologies and for unseen data sets without manual labeling. We provide the source code as well as the data we use. The code package is open source and free to use (MIT licensed).
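The idea of mimicking labeled natural data can be illustrated with a small sketch: each synthetic image is passed through a synthetic-to-natural translation step, and the known cell count of the synthetic image is carried over to the resulting pseudo-natural image. Everything named here is hypothetical. In particular, `toy_style_transfer` is only a fixed brightness shift standing in for the learned translation path of the network, which the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values in [0, 1]


def toy_style_transfer(image: Image, offset: float = 0.1) -> Image:
    """Hypothetical stand-in for a learned synthetic-to-natural translation.

    It only brightens pixels and clips them to [0, 1]; a real model would
    be a trained network.
    """
    return [[min(1.0, max(0.0, p + offset)) for p in row] for row in image]


def build_pseudo_labeled_set(
    synthetic_images: Sequence[Image],
    counts: Sequence[int],
    translate: Callable[[Image], Image],
) -> List[Tuple[Image, int]]:
    """Pair each translated (pseudo-natural) image with its known cell count."""
    if len(synthetic_images) != len(counts):
        raise ValueError("each synthetic image needs exactly one count")
    return [(translate(img), c) for img, c in zip(synthetic_images, counts)]


if __name__ == "__main__":
    synthetic = [[[0.0, 0.95]], [[0.5, 0.5]]]
    pseudo = build_pseudo_labeled_set(synthetic, [2, 1], toy_style_transfer)
    for image, count in pseudo:
        print(count, image)
```

The resulting pairs can then be used like labeled natural data when fitting a count regressor, which is the role the abstract assigns to pseudo-natural images.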