Background: Set comparisons permeate a large number of data analysis workflows, in particular workflows in the biological sciences. Venn diagrams are frequently employed for such analysis, but current tools are limited. Results: We have developed InteractiVenn, a more flexible tool for interacting with Venn diagrams including up to six sets. It offers a clean interface for Venn diagram construction and enables analysis of set unions while preserving the shape of the diagram. Set unions are useful to reveal differences and similarities among sets and may be guided in our tool by a tree or by a list of set unions. The tool also allows obtaining subsets' elements, saving and loading sets for further analyses, and exporting the diagram in vector and image formats. InteractiVenn has been used to analyze two biological datasets, but it may serve set analysis in a broad range of domains. Conclusions: InteractiVenn allows set unions in Venn diagrams to be explored thoroughly, thereby extending the ability to analyze combinations of sets with additional observations yielded by novel interactions between joined sets. InteractiVenn is freely available online at: www.interactivenn.net.
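The idea of analyzing set unions in a Venn diagram can be illustrated with a small Python sketch. It is not InteractiVenn's implementation: the function names `venn_regions` and `merge_sets`, the set names, and the element labels are all hypothetical. The sketch computes the elements that fall in each exclusive region of the diagram, then repeats the computation after two sets have been joined by union.

```python
from itertools import combinations


def venn_regions(sets):
    """Map each combination of set names to the elements that belong to
    exactly those sets and to no other set (empty regions are omitted)."""
    names = sorted(sets)
    regions = {}
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            # Elements shared by every set in the combination ...
            inside = set.intersection(*(sets[n] for n in combo))
            # ... minus anything that also appears in a set outside it.
            others = [sets[n] for n in names if n not in combo]
            exclusive = inside - set().union(*others)
            if exclusive:
                regions[combo] = exclusive
    return regions


def merge_sets(sets, members, new_name):
    """Return a copy of `sets` in which the sets listed in `members`
    are replaced by their union, stored under `new_name`."""
    merged = {n: s for n, s in sets.items() if n not in members}
    merged[new_name] = set().union(*(sets[n] for n in members))
    return merged


# Hypothetical example data (not taken from the paper).
example = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}

regions = venn_regions(example)
merged_regions = venn_regions(merge_sets(example, ["A", "B"], "A+B"))
```

Comparing `regions` with `merged_regions` shows how joining A and B changes which elements appear as shared with C, which is the kind of observation the abstract attributes to exploring set unions.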
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembly. This indicated that possibly 33,620 unique genes had been identified and that >90% of the sugarcane expressed genes were tagged.
Different regions of oral squamous cell carcinoma (OSCC) have particular histopathological and molecular characteristics limiting the standard tumor−node−metastasis prognosis classification. Therefore, defining biological signatures that allow assessing the prognostic outcomes for OSCC patients would be of great clinical significance. Using histopathology-guided discovery proteomics, we analyze neoplastic islands and stroma from the invasive tumor front (ITF) and inner tumor to identify differentially expressed proteins. Potential signature proteins are prioritized and further investigated by immunohistochemistry (IHC) and targeted proteomics. IHC indicates low expression of cystatin-B in neoplastic islands from the ITF as an independent marker for local recurrence. Targeted proteomics analysis of the prioritized proteins in saliva, combined with machine-learning methods, highlights a peptide-based signature as the most powerful predictor to distinguish patients with and without lymph node metastasis. In summary, we identify a robust signature, which may enhance prognostic decisions in OSCC and better guide treatment to reduce tumor recurrence or lymph node metastasis.
Composting operations are a rich source for prospection of biomass degradation enzymes. We have analyzed the microbiomes of two composting samples collected in a facility inside the São Paulo Zoo Park, in Brazil. All organic waste produced in the park is processed in this facility, at a rate of four tons/day. Total DNA was extracted and sequenced with Roche/454 technology, generating about 3 million reads per sample. To our knowledge this work is the first report of a composting whole-microbial community using high-throughput sequencing and analysis. The phylogenetic profiles of the two microbiomes analyzed are quite different, with a clear dominance of members of the Lactobacillus genus in one of them. We found a general agreement of the distribution of functional categories in the Zoo compost metagenomes compared with seven selected public metagenomes of biomass deconstruction environments, indicating the potential for different bacterial communities to provide alternative mechanisms for the same functional purposes. Our results indicate that biomass degradation in this composting process, including deconstruction of recalcitrant lignocellulose, is fully performed by bacterial enzymes, most likely by members of the Clostridiales and Actinomycetales orders.
The task of building effective representations to visualize and explore collections with a moderate to large number of documents is hard. It depends on the evaluation of some distance measure among texts and also on the representation of such relationships in two-dimensional spaces. In this paper we introduce an alternative approach for building visual maps of documents based on their content similarity, through reconstruction of phylogenetic trees. The tree is capable of representing relationships that allow the user to quickly recover information detected by the similarity metric. For a variety of text collections of different natures we show that we can achieve improved exploration capability and clearer visualization of relationships amongst documents.
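As a rough illustration of the "distance measure among texts" mentioned above, the following Python sketch builds a pairwise distance matrix from bag-of-words cosine distance. The abstract does not say which metric the authors use, so cosine distance, whitespace tokenization, and the function names `cosine_distance` and `distance_matrix` are assumptions made here for illustration only. A matrix of this kind is the usual input to distance-based tree reconstruction methods.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Cosine distance between two texts using raw term counts.
    Assumption: lowercase whitespace tokenization, no stemming or weighting."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ta[w] * tb[w] for w in ta.keys() & tb.keys())
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    # Treat an empty document as maximally distant from everything.
    return 1.0 - dot / norm if norm else 1.0


def distance_matrix(docs):
    """Symmetric matrix of pairwise cosine distances between documents."""
    n = len(docs)
    return [[cosine_distance(docs[i], docs[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical mini-collection.
docs = ["venn diagram sets", "venn diagram tool", "compost microbiome"]
matrix = distance_matrix(docs)
```

In this toy collection the first two documents share two of three terms and so end up closer to each other than either is to the third, which shares none.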