Single-cell genomic technologies are rapidly improving our understanding of cellular heterogeneity in biological systems. In recent years, technological and computational improvements have continuously increased the scale of single-cell experiments, and now allow for millions of cells to be analyzed in a single experiment. However, existing software tools for single-cell analysis do not scale well to such large datasets. RAPIDS is an open-source suite of Python libraries that use GPU computing to accelerate data science workflows. Here, we report the use of RAPIDS and GPU computing to accelerate single-cell genomic analysis workflows and present open-source examples that can be reused by the community.
Recent work on the lottery ticket hypothesis has produced highly sparse Transformers for NMT while maintaining BLEU. However, it is unclear how such pruning techniques affect a model's learned representations. By probing sparse Transformers, we find that complex semantic information is first to be degraded. Analysis of internal activations reveals that higher layers diverge most over the course of pruning, gradually becoming less complex than their dense counterparts. Meanwhile, early layers of sparse models begin to perform more encoding. Attention mechanisms remain remarkably consistent as sparsity increases.
Data ethics and fairness have emerged as important areas of research in recent years. However, much work in this area focuses on retroactively auditing and "mitigating bias" in existing, potentially flawed systems, without interrogating the deeper structural inequalities underlying them. There are not yet examples of how to apply feminist and participatory methodologies from the start, to conceptualize and design machine learning-based tools that center and aim to challenge power inequalities. Our work targets this more prospective goal. Guided by the framework of data feminism, we co-design datasets and machine learning models to support the efforts of activists who collect and monitor data about feminicide -gender-based killings of women and girls. We describe how intersectional feminist goals and participatory processes shaped each stage of our approach, from problem conceptualization to data collection to model evaluation. We highlight several methodological contributions, including 1) an iterative data collection and annotation process that targets model weaknesses and interrogates framing concepts (such as who is included/excluded in "feminicide"), 2) models that explicitly focus on intersectional identities rather than statistical majorities, and 3) a multi-step evaluation process -with quantitative, qualitative and participatory stepsfocused on context-specific relevance. We also distill insights and tensions that arise from bridging intersectional feminist goals with ML. These include reflections on how ML may challenge power, embrace pluralism, rethink binaries and consider context, as well as the inherent limitations of any technology-based solution to address durable structural inequalities.
Recent work on the lottery ticket hypothesis has produced highly sparse Transformers for NMT while maintaining BLEU. However, it is unclear how such pruning techniques affect a model's learned representations. By probing Transformers with more and more lowmagnitude weights pruned away, we find that complex semantic information is first to be degraded. Analysis of internal activations reveals that higher layers diverge most over the course of pruning, gradually becoming less complex than their dense counterparts. Meanwhile, early layers of sparse models begin to perform more encoding. Attention mechanisms remain remarkably consistent as sparsity increases.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.