In this article we introduce RoCoLe, a dataset of robusta coffee leaf images. The dataset contains 1560 leaf images, with visible red mites and spots (denoting the presence of coffee leaf rust) for infection cases and images without such structures for healthy cases. In addition, the dataset includes annotations of the objects (leaves), their state (healthy or unhealthy), and the severity of disease (leaf area with spots). All images were obtained under real-world conditions in the same coffee field using a smartphone camera. The RoCoLe dataset facilitates the performance evaluation of machine learning algorithms used in image segmentation and classification problems related to plant disease recognition. The dataset is freely and publicly available at https://doi.org/10.17632/c5yvn32dzg.2.
Background: Biologists aim to understand the genetic background of diseases, metabolic disorders, or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information under different conditions. Clustering is one of the main techniques used to analyse this data; it aims to find groups of genes that share some criterion, such as a similar expression profile. However, the problem of finding groups is normally multidimensional, making it necessary to approach clustering as a multi-objective problem in which various cluster validity indexes are optimised simultaneously. These indexes are usually based on criteria like compactness and separation, which may not be sufficient, since they cannot guarantee the generation of clusters that have both similar expression patterns and biological coherence.
Method: We propose a Multi-Objective Clustering algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) to find clusters of genes with high levels of co-expression, biological coherence, and also good compactness and separation. Cluster quality indexes are used to simultaneously optimise gene relationships at the expression level and in terms of biological functionality. Our proposal also includes intensification and diversification strategies to improve the search process.
Results: The effectiveness of the proposed algorithm is demonstrated on four publicly available datasets. Comparative studies of the use of different objective functions and of other widely used microarray clustering techniques are reported. Statistical, visual and biological significance tests are carried out to show the superiority of the proposed algorithm.
Conclusions: Integrating a-priori biological knowledge into a multi-objective approach and using intensification and diversification strategies allow the proposed algorithm to find solutions of higher quality than other microarray clustering techniques available in the literature in terms of co-expression, biological coherence, compactness and separation.
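To make the compactness and separation criteria concrete, the following minimal Python sketch scores two candidate clusterings and checks whether one Pareto-dominates the other. It is an illustration only, not the MOC-GaPBK algorithm: the abstract does not say which validity indexes are used, so the two definitions here (mean point-to-centroid distance and minimum centroid-to-centroid distance) are assumptions, the function names are hypothetical, and the biological coherence objective is omitted.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def compactness(clusters):
    """Mean distance from each point to its cluster centroid (lower is better).

    Assumed definition for illustration; the abstract does not specify one.
    """
    total, count = 0.0, 0
    for pts in clusters:
        c = centroid(pts)
        total += sum(dist(p, c) for p in pts)
        count += len(pts)
    return total / count


def separation(clusters):
    """Smallest distance between any two cluster centroids (higher is better).

    Assumed definition for illustration; requires at least two clusters.
    """
    cents = [centroid(pts) for pts in clusters]
    return min(dist(a, b) for i, a in enumerate(cents) for b in cents[i + 1:])


def dominates(a, b):
    """True if clustering a is no worse than b on both objectives and
    strictly better on at least one (Pareto dominance)."""
    ca, sa = compactness(a), separation(a)
    cb, sb = compactness(b), separation(b)
    return ca <= cb and sa >= sb and (ca < cb or sa > sb)


# Toy expression profiles: four genes measured under two conditions.
tight = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
loose = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
print(dominates(tight, loose))
```

A multi-objective search keeps the set of clusterings that no other candidate dominates, rather than collapsing the objectives into a single score.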
There is no consensus as to how a precursor lesion, germ cell neoplasia in situ (GCNIS), develops into the histologic types of testicular germ cell tumor type II (TGCT). The present meta-analysis examined RNA expression of 24 candidate genes in three datasets. The datasets included 203 samples of normal testis (NT) and of the histologic types of TGCT. Fisher’s test for combined p values was used for the meta-analysis of RNA expression across the three datasets. The histologic types differed in RNA expression of PRAME, KIT, SOX17, NANOG, KLF4, POU5F1, RB1, DNMT3B, and LIN28A (p < 0.01). The differences in RNA expression of these genes between the histologic types were concordant across the three datasets. Eight genes overlapped, with high RNA expression in at least two histologic types. In contrast, only seminoma (SE) had high RNA expression of KLF4 and only embryonal carcinoma (EC) had high RNA expression of DNMT3B. In conclusion, the meta-analysis showed that the development of the histologic types of TGCT was driven by changes in RNA expression of candidate genes. According to the RNA expressions of the ten genes, TGCT develops from NT over GCNIS, SE, EC, to the differentiated types of TGCT.
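Fisher's test for combined p values can be sketched in a few lines of Python. The statistic is X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis when the k p values are independent. The function name below is hypothetical and the example p values are made up for illustration; this is not the authors' code, and it assumes independence between the datasets.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    Uses the closed-form chi-square survival function for even degrees
    of freedom, so no third-party library is needed.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # X / 2, with X = -2 * sum(ln p)
    # P(chi2 with 2k df > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, tail = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        tail += term
    return exp(-half_stat) * tail


# Hypothetical p values for one gene in three datasets.
print(fisher_combined_p([0.04, 0.10, 0.03]))
```

With a single input the function returns that p value unchanged, which is a quick sanity check on the formula.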