Abstract: A self-inactivating CD-tagging retroviral vector was used to introduce epitope and GFP tags into genes and proteins in NIH 3T3 cells. Several hundred cell clones, each expressing GFP fluorescence in a distinctive pattern, were isolated. Molecular analysis showed that a wide variety of genes and proteins, some known and some newly discovered, had been tagged. The analysis also revealed that, in the great majority of instances, the abundance and cellular location of the tagged protein mirrored that of its untagg…
“…In addition to these datasets, a major National Cancer Institute-funded project led by Jonathan W. Jarvik, Peter B. Berget and Robert F. Murphy is beginning to provide images of the subcellular location of randomly-tagged proteins in 3T3 cells. Preliminary wide-field 2D images (40×, pixel size 0.475 microns) have been acquired for approximately 100 clones produced by random CD-tagging [5][6][7]. By sequencing DNA adjacent to the tag, the tagged gene has been identified (if it is present in current sequence databases).…”
Section: Sources Of Data On Subcellular Location (mentioning)
Abstract. The ongoing biotechnology revolution promises a complete understanding of the mechanisms by which cells and tissues carry out their functions. Central to that goal is the determination of the function of each protein that is present in a given cell type, and determining a protein's location within cells is critical to understanding its function. As large amounts of data become available from genome-wide determination of protein subcellular location, automated approaches to categorizing and comparing location patterns are urgently needed. Since subcellular location is most often determined using fluorescence microscopy, we have developed automated systems for interpreting the resulting images. We report here improved numeric features for describing such images that are fairly robust to image intensity binning and spatial resolution. We validate these features by using them to train neural networks that accurately recognize all major subcellular patterns with an accuracy higher than any previously reported. Having validated the features by using them for classification, we also demonstrate using them to create Subcellular Location Trees that group similar proteins and provide a systematic framework for describing subcellular location.
“…However, we have previously proposed that unsupervised methods are more appropriate to the analysis of protein subcellular location patterns [4]. We have used the retroviral CD-tagging technology developed by Jarvik, Berget and colleagues [12] to collect increasing numbers of images of mouse 3T3 cells expressing proteins randomly-tagged with GFP and then cluster them into Subcellular Location Trees [6,13,14]. As the number of tagged lines examined has increased, the number of statistically distinguishable clusters has also increased (Table 1).…”
Section: Learning Subcellular Patterns Using Cluster Analysis: Subcel… (mentioning)
A major source of information for identifying subcellular location on a proteome-wide basis will be imaging of tagged proteins in living cells using fluorescence microscopy. We have previously developed automated systems to interpret images from such experiments and demonstrated that they can perform as well as or better than visual inspection. Recent work demonstrates that these methods can be applied to large collections of images from sources as diverse as yeast expressing GFP-tagged proteins and human tissues imaged by immunocytochemistry. A distinct but related task is learning what location patterns exist. We have demonstrated clustering of mouse proteins into subcellular location families that share a statistically indistinguishable pattern. To communicate each pattern, we have developed approaches to learning generative models of subcellular patterns. Integration of high-throughput microscopy and automated model building with cell modeling systems will permit accurate, well-structured information on subcellular location to be incorporated into systems biology efforts.
“…Some large-scale projects have used fluorescence microscopy to screen hundreds to thousands of proteins for particular patterns or to assign proteins to major location classes. 11,13,20,22 A particularly ambitious and valuable project has been the tagging of all predicted protein coding regions in the yeast Saccharomyces cerevisiae.…”
Section: Determination Of Protein Location (mentioning)
confidence: 99%
“…Previous studies have shown that CD-tagging has minimal impact on protein folding, function and localization. 13 Here, we combine CD-tagging, automated microscopy and automated analysis to identify statistically distinguishable location patterns in NIH 3T3 cells. We present the combination of high-throughput methods from tagging to analysis as well as fully automated methods of imaging and analysis.…”
Section: CD-tagging Of NIH 3T3 Cells (mentioning)
confidence: 99%
“…The procedure described previously 13 was followed, with some minor alterations. A CD-tagging cassette containing the EGFP coding sequence was packaged into retrovirus using Phoenix-GP cells.…”
Section: Production and Isolation Of CD-tagged NIH 3T3 Cells (mentioning)
Abstract: Location proteomics is concerned with the systematic analysis of the subcellular location of proteins. In order to perform high-resolution, high-throughput analysis of all protein location patterns, automated methods are needed. Here we describe the use of such methods on a large collection of images obtained by automated microscopy to perform high-throughput analysis of endogenous proteins randomly-tagged with a fluorescent protein in NIH 3T3 cells. Cluster analysis was performed to identify the statistically significant location patterns in these images. This allowed us to assign a location pattern to each tagged protein without specifying what patterns are possible. To choose the best feature set for this clustering, we have used a novel method that determines which features do not artificially discriminate between control wells on different plates and uses Stepwise Discriminant Analysis (SDA) to determine which features do discriminate as much as possible among the randomly-tagged wells. Combining this feature set with consensus clustering methods resulted in 35 clusters among the first 188 clones we obtained. This approach represents a powerful automated solution to the problem of identifying subcellular locations on a proteome-wide basis for many different cell types.