Sézary syndrome (SS) is an aggressive cutaneous T cell lymphoma with pruritic skin inflammation and immune dysfunction, driven by neoplastic, clonal memory T cells in both peripheral blood and skin. To gain insight into abnormal gene expression promoting T cell dysfunction, lymphoproliferation and transformation in SS, we first compared functional transcriptomic profiles of both resting and activated CD4+CD45RO+ T cells from SS patients and normal donors to identify differentially expressed genes. Next, a meta-analysis was performed to compare our SS data to public microarray data from a novel benign disease control, lymphocytic-variant hypereosinophilic syndrome (L-HES). L-HES is a rare, clonal lymphoproliferation of abnormal memory T cells that produces clinical symptoms similar to those of SS, including severe pruritus and eosinophilia. The comparison revealed gene sets specific for either SS (370 genes) or L-HES (519 genes), and a subset of 163 genes that were dysregulated in both SS and L-HES T cells compared to normal donor T cells. RT-qPCR confirmed elevated expression of PLS3, TWIST1, and TOX only in SS, whereas IL17RB mRNA was increased only in L-HES; CDCA7 was increased in both diseases. In an L-HES patient who progressed to peripheral T cell lymphoma, malignant transformation was accompanied by increased expression of CDCA7, TIGIT, and TOX, genes that are highly expressed in SS, suggesting that these genes contribute to neoplastic transformation. In summary, we have identified gene expression biomarkers that implicate a common transformative mechanism, as well as others that distinguish SS from L-HES.
Genomic DNA is the best "unique identifier" for organisms. Alignment-free phylogenomic analysis, a simple, fast, and efficient method for comparing genome sequences, relies on the distribution of short DNA sequences of a particular length, referred to as k-mers. The k-mer approach has been explored as a basis for sequence analysis applications, including assembly, phylogenetic tree inference, and classification. Although this approach is not novel, the k-mer length is often selected rather arbitrarily, even though it is a critical parameter for achieving the resolution of genome/sequence distances needed to infer biologically meaningful phylogenetic relationships. Thus, there is a need for a systematic approach to identify the appropriate k-mer length from whole-genome sequences. We present K-mer-length Iterative Selection for UNbiased Ecophylogenomics (KITSUNE), a tool for assessing the empirically optimal k-mer length of any given set of genomes of interest for phylogenomic analysis via a three-step approach based on (1) cumulative relative entropy (CRE), (2) average number of common features (ACF), and (3) observed common features (OCF). Using KITSUNE, we demonstrated the feasibility and reliability of these measurements to obtain empirically optimal k-mer lengths of 11, 17, and ∼34 from large genome datasets of viruses, bacteria, and fungi, respectively. Moreover, we demonstrated the use of KITSUNE for accurate species identification of two de novo assembled bacterial genomes derived from error-prone long-read sequences, and of a published yeast genome. In addition, KITSUNE was used to identify the shortest species-specific k-mer length that accurately identifies viruses. KITSUNE is freely available at https://github.com/natapol/kitsune.
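To make the role of the k-mer length concrete, the following is a minimal sketch of k-mer counting and of how the overlap between two sequences changes with k. It is not KITSUNE's code or API, and it does not implement CRE, ACF, or OCF as the tool defines them; the function names `count_kmers` and `shared_fraction` are hypothetical, the two input strings are toy sequences, and a plain Jaccard index stands in for a "common features" style measure purely for illustration.

```python
from collections import Counter


def count_kmers(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (hypothetical helper, not KITSUNE's API).

    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def shared_fraction(seq_a: str, seq_b: str, k: int) -> float:
    """Jaccard index of the distinct k-mer sets of two sequences.

    Assumption: used here only as a simple stand-in for a
    'common features' measure; KITSUNE's ACF/OCF may be defined differently.
    """
    kmers_a = set(count_kmers(seq_a, k))
    kmers_b = set(count_kmers(seq_b, k))
    union = kmers_a | kmers_b
    return len(kmers_a & kmers_b) / len(union) if union else 0.0


if __name__ == "__main__":
    a = "ACGTACGT"  # toy sequences, not real genomes
    b = "ACGTTTTT"
    for k in (2, 3, 4):
        print(k, round(shared_fraction(a, b, k), 3))
```

On these toy inputs the shared fraction falls as k grows, which is the trade-off the abstract describes: short k-mers are shared by nearly everything and carry little signal, while long k-mers become too specific to be shared at all.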