Leaf traits are often strongly correlated with yield, which poses a major challenge in rice breeding. In the present study, using a panel of Vietnamese rice landraces genotyped with 21,623 single-nucleotide polymorphism markers, a genome-wide association study (GWAS) was conducted for several leaf traits during the vegetative stage. Vietnamese landraces are often poorly represented in panels used for GWAS, even though they are adapted to contrasting agrosystems and can contain original, valuable genetic determinants. A panel of 180 rice varieties was grown in pots for four weeks with three replicates under nethouse conditions. Different leaf traits were measured on the second fully expanded leaf of the main tiller, which often plays a major role in determining the photosynthetic capacity of the plant. The leaf fresh weight, turgid weight and dry weight were measured; from these measurements, the relative tissue weight and leaf dry matter percentage were then computed. The leaf dry matter percentage can be considered a proxy for the photosynthetic efficiency per unit leaf area, which contributes to yield. The GWAS identified thirteen QTLs associated with these leaf traits: eleven for fresh weight, eleven for turgid weight, one for dry weight, one for relative tissue weight and one for leaf dry matter percentage. Eleven QTLs presented associations with several traits, suggesting that these traits share common genetic determinants, while one QTL was specific to leaf dry matter percentage and one QTL was specific to relative tissue weight. Interestingly, some of these QTLs colocalize with leaf- or yield-related QTLs previously identified using other material. Several genes within these QTLs with a known function in leaf development or physiology are reviewed.
We propose coding techniques that limit the length of homopolymer runs, ensure the GC-content constraint, and are capable of correcting a single edit error in strands of nucleotides in DNA-based data storage systems. In particular, for given ℓ, ε > 0, we propose simple and efficient encoders/decoders that transform binary sequences into DNA base sequences (codewords), namely sequences of the symbols A, T, C and G, that satisfy the following properties:
• Runlength constraint: the maximum homopolymer run in each codeword is at most ℓ,
• GC-content constraint: the GC-content of each codeword is within [0.5 − ε, 0.5 + ε],
• Error-correction: each codeword is capable of correcting a single deletion, or single insertion, or single substitution error.
For practical values of ℓ and ε, we show that our encoders achieve much higher rates than existing results in the literature and approach the capacity. Our methods have low encoding/decoding complexity and limited error propagation.
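The runlength and GC-content constraints stated above can be checked directly on a candidate strand. The following is a minimal sketch of such a checker only; it is not the encoder or decoder proposed in the abstract, and it does not address error correction. The function names and the parameter names `max_run` (the run-length bound) and `eps` (the GC-content tolerance) are assumptions introduced for illustration.

```python
from itertools import groupby


def max_homopolymer_run(strand):
    """Length of the longest run of identical consecutive symbols."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def gc_content(strand):
    """Fraction of symbols in the strand that are G or C."""
    if not strand:
        raise ValueError("strand must be non-empty")
    return sum(base in "GC" for base in strand) / len(strand)


def satisfies_constraints(strand, max_run, eps):
    """True if the strand meets both constraints.

    max_run: largest homopolymer run allowed (hypothetical parameter name).
    eps: allowed deviation of the GC-content from 0.5 (hypothetical name).
    """
    run_ok = max_homopolymer_run(strand) <= max_run
    gc_ok = abs(gc_content(strand) - 0.5) <= eps
    return run_ok and gc_ok


if __name__ == "__main__":
    for strand in ("ACGTACGT", "AAAACGCG", "ATATATAT"):
        print(strand, satisfies_constraints(strand, max_run=3, eps=0.1))
```

A checker of this kind is useful mainly for validating the output of an encoder; a constrained code guarantees these properties by construction rather than by filtering.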
Tandem duplication in DNA is the process of inserting a copy of a segment of DNA adjacent to the original position. Motivated by applications that store data in living organisms, Jain et al. (2016) proposed the study of codes that correct tandem duplications to improve the reliability of data storage. We investigate algorithms associated with the study of these codes. Two words are said to be k-confusable if there exist two sequences of tandem duplications of lengths at most k such that the resulting words are equal. We demonstrate that the problem of deciding whether two words are k-confusable is linear-time solvable through a characterisation that can be checked efficiently for k = 3. Combining with previous results, the decision problem is linear-time solvable for k ≤ 3. We conjecture that this problem is undecidable for k > 3. Using insights gained from the algorithm, we study the size of tandem-duplication codes. We improve the previously known upper bound and then construct codes with larger sizes as compared to the previous constructions. We determine the sizes of optimal tandem-duplication codes for lengths up to twenty, develop recursive methods to construct tandem-duplication codes for all word lengths, and compute explicit lower bounds for the size of optimal tandem-duplication codes for lengths from 21 to 30.
arXiv:1707.03956v2 [math.CO] 17 Nov 2017
[...] ⇒*_k y. Therefore, to determine if a set of words is a tandem-duplication code, we need to verify that all pairs of distinct words are not confusable. Hence, we state our problem of interest.
CONFUSABILITY PROBLEM
Instance: Two words x and y over Σ_q, and an integer k
Question: Are x and y k-confusable?
While the confusability problem is a natural question, efficient algorithms are only known for the case where k ∈ {1, 2}. We review these results in the next subsection.
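The definition of confusability can be illustrated with a bounded brute-force search. The sketch below is not the linear-time algorithm discussed in the text. It enumerates every word reachable from x and from y by tandem duplications of length at most k, up to an assumed length cap `max_len`, and tests whether the two sets intersect. All function names and the `max_len` parameter are hypothetical. A result of True is conclusive, whereas False only means that no common descendant exists within the cap.

```python
def tandem_duplications(word, k):
    """All words obtained from `word` by one tandem duplication of length <= k."""
    results = set()
    for length in range(1, k + 1):
        for start in range(len(word) - length + 1):
            end = start + length
            # Insert a copy of word[start:end] immediately after the original.
            results.add(word[:end] + word[start:end] + word[end:])
    return results


def descendants(word, k, max_len):
    """Words reachable from `word` by zero or more duplications, capped at max_len."""
    seen = {word}
    frontier = [word]
    while frontier:
        next_frontier = []
        for current in frontier:
            for child in tandem_duplications(current, k):
                if len(child) <= max_len and child not in seen:
                    seen.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return seen


def confusable_within(x, y, k, max_len):
    """True if x and y share a descendant of length <= max_len (assumed cap)."""
    return bool(descendants(x, k, max_len) & descendants(y, k, max_len))


if __name__ == "__main__":
    print(confusable_within("01", "0101", k=2, max_len=8))
    print(confusable_within("01", "0101", k=1, max_len=8))
```

The search is exponential in `max_len`, so it is only practical for short words; it serves to make the problem statement concrete rather than to solve it efficiently.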