2023
DOI: 10.1101/2023.02.24.529975
Preprint

PIFiA: Self-supervised Approach for Protein Functional Annotation from Single-Cell Imaging Data

Abstract: Fluorescence microscopy data describe protein localization patterns at single-cell resolution and have the potential to reveal whole-proteome functional information with remarkable precision. Yet, extracting biologically meaningful representations from cell micrographs remains a major challenge. Existing approaches often fail to learn robust and noise-invariant features or rely on supervised labels for accurate annotations. We developed PIFiA (Protein Image-based Functional Annotation), a self-supervised appr…

Cited by 3 publications (5 citation statements); References 74 publications (173 reference statements)
“…Weakly supervised representation-learning uses experimental labels, such as the treatment label, to "guide" their representations such that replicates of the same label are encoded close to one another in the latent space and different labels are encoded far apart [26][27][28][29][30][31]. Beyond optimizing representations that are hard to biologically-interpret, experiment treatment-based weak supervision may lead to undesired consequences where the representations of treatments that lead to a similar phenotype are pushed away from one another in the latent space because they do not share the same label, possibly even pushing one representation closer to the control (Supplementary Figure 4A).…”
Section: Discussion (mentioning)
confidence: 99%
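To make the mechanism described in the quoted passage concrete, below is a minimal sketch of a label-guided embedding objective: replicates sharing a treatment label are pulled together in latent space and samples with different labels are pushed apart. The function name, the margin formulation, and the PyTorch framing are illustrative assumptions, not the loss used by any of the cited methods.

import torch
import torch.nn.functional as F

def weak_supervision_loss(embeddings, treatment_labels, margin=1.0):
    # embeddings: (N, D) float tensor; treatment_labels: (N,) integer tensor.
    dists = torch.cdist(embeddings, embeddings)                  # pairwise distances, (N, N)
    same = treatment_labels.unsqueeze(0) == treatment_labels.unsqueeze(1)
    not_self = ~torch.eye(len(embeddings), dtype=torch.bool, device=embeddings.device)
    pos = same & not_self                                        # same treatment, different sample
    neg = ~same                                                  # different treatments
    # Pull same-label replicates together.
    pull = dists[pos].mean() if pos.any() else dists.new_zeros(())
    # Push different labels at least `margin` apart, even when two treatments
    # produce the same phenotype; this is the side effect the passage warns about.
    push = F.relu(margin - dists[neg]).mean() if neg.any() else dists.new_zeros(())
    return pull + push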
“…Thus, the biological function investigated may be incorrectly interpreted due to a simplified representation of the underlying data complexity. Recent representation-learning methods train machine learning models to encode lower-dimensional cell-level or well-level embeddings (called "latent representations") through weakly supervised learning [26][27][28][29][30][31] or self supervised learning [32]. While representation-learning methods show promising results in terms of capturing the differences between treatments, current representation methods are not optimized to model the change posed by the treatment with respect to the control, but to distinguish between the different treatments (see Discussion).…”
Section: Introduction (mentioning)
confidence: 99%
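As one concrete reading of "cell-level or well-level embeddings" and of modeling change with respect to the control, here is a small NumPy sketch: cell embeddings are averaged into a well profile, and a treatment's effect is expressed as the shift of its average well profile away from the control's. The mean aggregation and the simple difference are assumptions for illustration, not the procedure of any cited method.

import numpy as np

def well_profile(cell_embeddings):
    # cell_embeddings: (n_cells, d) array for one well -> (d,) well-level embedding.
    return cell_embeddings.mean(axis=0)

def treatment_effect(treatment_wells, control_wells):
    # Each argument is a list of (n_cells, d) arrays, one per replicate well.
    # Returns the displacement of the treatment profile relative to control.
    t = np.mean([well_profile(w) for w in treatment_wells], axis=0)
    c = np.mean([well_profile(w) for w in control_wells], axis=0)
    return t - c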
“…[ 26 ] Similarly, a self‐supervised CNN was used on the yeast ORF‐GFP collection to group proteins based on their localization and function, without the need for manual compartment labeling. [ 25 ] Additionally, a self‐supervised method was used to learn feature representations of single cells without labeled training data in both yeast and human microscopy images. Given one cell from a microscopy image, the trained CNN was able to predict the fluorescence pattern in another cell from the same image.…”
Section: General Workflow of Bioimage Analysis (mentioning)
confidence: 99%
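The paired-cell objective mentioned in this passage (predicting the fluorescence pattern of a second cell taken from the same micrograph) can be sketched roughly as below. The module names, the use of a target-cell context channel as the decoder's second input, and the mean-squared-error loss are placeholders assumed for illustration; they are not the cited architecture.

import torch.nn as nn
import torch.nn.functional as F

class PairedCellTask(nn.Module):
    def __init__(self, encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = encoder   # source cell crop -> feature vector
        self.decoder = decoder   # (features, target cell context) -> predicted fluorescence channel

    def forward(self, source_cell, target_cell_context):
        features = self.encoder(source_cell)
        return self.decoder(features, target_cell_context)

def training_step(model, source_cell, target_cell_context, target_cell_fluorescence):
    # Both cells come from the same image, so no manual labels are needed;
    # the encoder's features are what get reused as single-cell representations.
    pred = model(source_cell, target_cell_context)
    return F.mse_loss(pred, target_cell_fluorescence)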
“…[24] Predicted outputs of supervised ML/DL models can be overfitted protein interaction modules based on subcellular localization. [25] Unsupervised ML/DL methods are used for pattern detection. An example of unsupervised DL algorithms are autoencoders that capture the most important features of the input data and can be used, for example, to reduce the dimensionality of data, denoise data, or extract DL features.…”
Section: Fundamentals of ML and DL (mentioning)
confidence: 99%
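Since the passage sketches what an autoencoder does, a minimal version is shown below: the input is compressed to a low-dimensional bottleneck and then reconstructed, with the bottleneck serving as the reduced-dimensionality or extracted features. Layer sizes are arbitrary assumptions; training on noise-corrupted inputs against clean targets would give the denoising variant the passage mentions.

import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),      # bottleneck: the learned low-dimensional features
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, n_features),      # reconstruction of the input
        )

    def forward(self, x):
        z = self.encoder(x)                  # extract features / reduce dimensionality
        return self.decoder(z), z            # reconstruction and latent code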