The genome of the marine alga Ulva compressa was assembled using long and short reads. The genome assembly was 80.8 Mb in size and encoded 19,207 protein-coding genes. Several genes encoding antioxidant enzymes and a few genes encoding enzymes that synthesize ascorbate and glutathione were identified, showing similarity to plant and bacterial enzymes. Several genes encoding signal transduction protein kinases, such as MAPKs, CDPKs, CBLPKs, and CaMKs, were also detected, showing similarity to plant, green microalgal, and bacterial proteins. Regulatory transcription factors, such as ethylene- and ABA-responsive factors, MYB, WRKY, and HSTF, were present and showed similarity to plant and green microalgal transcription factors. Genes encoding enzymes that synthesize ACC and ABA-aldehyde were identified, but oxidases that synthesize ethylene and ABA, as well as enzymes that synthesize other plant hormones, were absent. Interestingly, genes involved in plant cell wall synthesis and genes encoding proteins related to the animal extracellular matrix were also detected. Genes encoding cyclins and CDKs were found, and the CDKs showed similarity to animal and fungal CDKs. A few genes encoding voltage-dependent calcium channels and ionotropic glutamate receptors were identified, showing similarity to animal channels. Genes encoding Transient Receptor Potential (TRP) channels were not identified, even though TRPs have been experimentally detected, indicating that the genome is not yet complete. Thus, protein-coding genes present in the genome of U. compressa showed similarity to plant and green microalgal genes, but also to animal, bacterial, and fungal genes.
Background Although insects represent the largest fraction of animal life, the number of insect species whose genomes have been sequenced is barely in the hundreds. The order Dermaptera (the earwigs) suffers from a lack of genomic information despite its unique position as one of the basally derived insect groups and its importance in agroecosystems. As part of a national educational and outreach program in genomics, a plan was formulated to engage high school students in a genome sequencing project. Students from twelve schools across Chile were instructed to capture earwig specimens in their geographical area, to identify them, and to provide material for genome sequencing to be carried out by the students themselves at their schools.
Results The school students collected specimens from two cosmopolitan earwig species: Euborellia annulipes (Fam. Anisolabididae) and Forficula auricularia (Fam. Forficulidae). Genomic DNA was extracted and, with the help of scientific teams that traveled to the schools, was sequenced using nanopore sequencers. The sequence data obtained for both species was assembled and annotated. We obtained genome sizes of 1.18 Gb (F. auricularia) and 0.94 Gb (E. annulipes), with the number of predicted protein-coding genes being 31,800 and 40,000, respectively. Our analysis showed that we were able to capture a high percentage (≥ 93%) of conserved proteins, indicating that the genomes are useful for comparative and functional analysis. We were also able to characterize structural elements such as repetitive sequences and non-coding RNA genes. Finally, functional categories of genes that are overrepresented in each species suggest important differences between them in the processes underlying the formation of germ cells and in modes of reproduction, features that are among the distinguishing biological properties of these two distant families of Dermaptera.
Conclusions This work represents an unprecedented instance where the scientific and lay communities have come together to collaborate in a genome sequencing project. The versatility and accessibility of nanopore sequencers were key to the success of the initiative. We were able to obtain full genome sequences of two important and widely distributed species of insects which had not been analyzed at this level previously. The data made available by the project should inform future studies on the Dermaptera.
Metagenomics is an area of microbiology that deals with the taxonomic classification of genomic samples taken directly from the environment. These samples are sequences of variable length, and they may correspond to different species, some of which may be unknown or not previously stored in a genomic database. One of the main steps in metagenomic classification is binning the sequence fragments into groups, each of which may correspond to one species. Many approaches are used to perform binning, mainly machine learning algorithms for classification or clustering. This paper presents the results of an empirical evaluation of two well-known unsupervised algorithms for the metagenomic binning task: EM and K-means. Both algorithms are tested on short and long reads from synthetic datasets with different proportions and numbers of species. The empirical results show that K-means in general outperforms the EM algorithm, but EM is competitive on several of the short-read datasets used for evaluation.
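As a rough illustration of the binning task described above, the following minimal sketch clusters reads with K-means. The abstract does not say how reads were turned into feature vectors, so the normalized 2-mer frequency profile used here is an assumption, as are all function names (`kmer_profile`, `kmeans`, `random_read`) and the toy reads themselves. It is not the evaluated implementation.

```python
import random
from itertools import product


def kmer_profile(seq, k=2):
    """Normalized k-mer frequency vector of a DNA sequence (assumed feature)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short reads
    return [counts[w] / total for w in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    # Initialize centroids from randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(n_clusters), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def random_read(rng, alphabet, length=200):
    """Hypothetical synthetic read drawn uniformly from the given alphabet."""
    return "".join(rng.choice(alphabet) for _ in range(length))


# Toy data: ten AT-only reads and ten GC-only reads stand in for two species.
rng = random.Random(1)
reads = [random_read(rng, "AT") for _ in range(10)]
reads += [random_read(rng, "GC") for _ in range(10)]
profiles = [kmer_profile(r) for r in reads]
labels = kmeans(profiles, n_clusters=2)
print(labels)
```

The two toy groups are deliberately extreme in base composition, so they should fall into separate bins; real metagenomic reads overlap far more, and the number of species is usually unknown. An EM-based alternative would fit a mixture model to the same profiles and assign each read to its most probable component.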