2018
DOI: 10.1093/bioinformatics/bty784

The GCTx format and cmap{Py, R, M, J} packages: resources for optimized storage and integrated traversal of annotated dense matrices

Abstract: Software packages (available in Python, R, Matlab, and Java) are freely available at https://github.com/cmap. Additional instructions, tutorials, and datasets are available at clue.io/code.

Cited by 39 publications (25 citation statements)
References 13 publications

“…In short, the L1000 is a high-throughput gene expression assay that measures the expression of 978 "landmark" genes from human cells [18] which can be used to computationally infer the expression of 11,350 genes. The data used in the present study was generated in GCTx format, which stored annotated data matrices [20]. The expression level analysis of HIF-1α, the landmark gene in the present study, was obtained from level 5 moderated Z-score (MODZ).…”
Section: Analysis Of the Association Between Hdac Inhibitors And Expr
Citation type: mentioning
confidence: 99%
“…The compendia contain both microarray and RNA-seq data, quantile normalized and with missing values imputed using SVD-impute (Perry 2009) (430,119 human, 228,708 mouse samples). We extracted the microarray data (330,508 human, 123,279 mouse samples) and converted the data to gctx format to aid analysis (Enache et al 2019); no other transformations were applied to them. All RNA-seq libraries for human and mouse were downloaded from refine.bio (122,864 human, 125,652 mouse).…”
Section: Dataset Construction 1-expression Data Pre-processing
Citation type: mentioning
confidence: 99%
“…We downloaded the 5th level of differential gene expression signatures released on the GEO website (GEO accession GSE70138) which includes 107,404 profiles corresponding to 1,768 different drugs in 41 cell lines, 83 concentrations, and four treatment durations. Data access was performed through the cmapR package [41]. Genes in each drug profile were ranked according to their expression, from the most up-regulated to the most down-regulated gene.…”
Section: Gene Expression Data
Citation type: mentioning
confidence: 99%
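
The ranking step described in the statement above (ordering the genes of a profile from most up-regulated to most down-regulated) can be illustrated with a minimal sketch. This is not the citing authors' code: the gene names, the scores, and the `rank_genes` helper are hypothetical, and a real workflow would read the signature matrix from a GCTx file with one of the cmap packages rather than from an in-memory dictionary.

```python
# Hypothetical toy signature: gene ID -> differential expression score.
# The values are made up for illustration; they are not taken from GSE70138.
signature = {"GENE_A": 1.8, "GENE_B": -2.4, "GENE_C": 0.3, "GENE_D": -0.1}


def rank_genes(profile):
    """Return gene IDs ordered from most up-regulated to most down-regulated."""
    # Sort by score in descending order; ties are broken by gene ID so the
    # result is deterministic.
    return sorted(profile, key=lambda gene: (-profile[gene], gene))


ranked = rank_genes(signature)
print(ranked)  # ['GENE_A', 'GENE_C', 'GENE_D', 'GENE_B']
```

Applied to a full signature matrix, the same ordering would be computed once per profile (per column), which is one reason a format with fast column access can be convenient here.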