This book takes a simple step-by-step approach to give a good grounding in the use of R for undergraduate/beginning postgraduate biology students. R is a freely available, open-source statistical programming environment which provides powerful statistical analysis tools and graphics outputs. This chapter provides some steps on how to use the book, from setting up the computer to running the code as you go along. The chapter structure is also introduced.
This chapter uses generalized linear modelling, via the function glm, for cases where the variance is not constant across the values of one or more explanatory variables and/or the errors cannot be normally distributed: for example, binary data, count data where negative values are impossible, or proportions constrained between 0 and 1. A glm seeks to determine how much of the variation in the response variable can be explained by each explanatory variable, and whether such relationships are statistically significant. The data for generalized linear models take the form of a response variable (which may be binary, a count, a proportion or continuous) and a combination of continuous and discrete explanatory variables.
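As a minimal sketch of the kind of model described here, the following fits a Poisson glm to count data. The data are simulated and the variable names (counts, temperature, habitat) are hypothetical, chosen only for illustration; they are not taken from the book.

```r
# Simulated data (an assumption for illustration, not the book's data)
set.seed(1)
n <- 100
temperature <- runif(n, min = 10, max = 30)                  # continuous explanatory variable
habitat <- factor(rep(c("forest", "meadow"), each = n / 2))  # discrete explanatory variable

# Counts cannot be negative, so they are generated on the log scale
mu <- exp(0.5 + 0.08 * temperature + 0.4 * (habitat == "meadow"))
counts <- rpois(n, lambda = mu)

# Poisson errors with the default log link
fit <- glm(counts ~ temperature + habitat, family = poisson)

# Coefficient estimates and their significance
print(summary(fit))

# Deviance explained by each explanatory variable, added in sequence
print(anova(fit, test = "Chisq"))
```

For binary data or proportions, family = binomial would be the usual choice in place of family = poisson.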
This chapter presents the basics of handling text, numbers and simple data files. It covers basic R features: commas, brackets and concatenation with c, the colon operator (:), the exponent operator (^), exiting from R, and the help pages.
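The features listed above can be sketched in a few lines. The object names (lengths_mm, species and so on) are hypothetical and used only for illustration.

```r
# Concatenation: c() combines values into a vector
lengths_mm <- c(2, 4, 6)
species <- c("oak", "ash", "elm")

# Colon operator: the integer sequence from 1 to 5
ids <- 1:5

# Raise to the power with ^ (applied element by element)
squares <- lengths_mm^2

# Square brackets select elements; round brackets call functions
second <- lengths_mm[2]
first_two <- species[1:2]

# Commas separate the arguments of a function
labels <- paste(species, collapse = ", ")

print(squares)
print(labels)

# Help pages: type ?mean or help("mean") at the prompt
# Exiting from R: type q() at the prompt
```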
R is an open-source statistical environment modelled after the previously widely used commercial programs S and S-Plus. Alongside powerful statistical analysis tools it provides powerful graphics outputs, and it is also a programming language suitable for medium-sized projects. This book presents a set of studies that collectively represent almost all the R operations that beginners need when analysing their own data, up to perhaps the early years of a PhD. The chapters are organized around topics such as graphing, classical statistical tests, statistical modelling, mapping and text parsing, but the examples have been chosen largely from real scientific studies at the appropriate level, and each example nearly always covers more R functions than are strictly necessary to get a p-value or a graph. R comes with around a thousand base functions which are automatically installed when R is downloaded. This book covers the use of those of most relevance to biological data analysis, modelling and graphics. Throughout each chapter, the functions introduced and used in that chapter are summarized in Tool Boxes. The book also shows the user how to adapt and write their own code and functions. A selection of base functions relevant to graphics that are not necessarily covered in the main text is described in Appendix 1, and additional housekeeping functions are described in Appendix 2.
Analysis of variance (ANOVA) is used to analyze the differences between group means when the response variable is numeric (real numbers) and the explanatory variable(s) are all categorical. Each explanatory variable may have two or more factor levels; if there is only one explanatory variable and it has only two factor levels, Student's t-test can be used instead and, assuming equal variances, gives an identical result. In essence, an ANOVA fits an intercept and a set of coefficients (differences between factor-level means) for one or more categorical explanatory variables. ANOVA is usually performed using the linear model function lm, or the more specific function aov, but there is a special function oneway.test for the case of a single explanatory variable. For a one-way ANOVA the non-parametric equivalent (if the parametric assumptions are not met) is kruskal.test.
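The functions named above can be compared on a small example. This is a sketch on simulated data; the names growth and treatment are hypothetical and not taken from the book. Note that oneway.test does not assume equal variances by default, so var.equal = TRUE is set here to match the classical ANOVA.

```r
# Simulated data (an assumption for illustration): three groups of ten
set.seed(42)
growth <- c(rnorm(10, mean = 10), rnorm(10, mean = 12), rnorm(10, mean = 15))
treatment <- factor(rep(c("control", "low", "high"), each = 10))

# One-way ANOVA through the linear model function
fit_lm <- lm(growth ~ treatment)
tab_lm <- anova(fit_lm)
print(tab_lm)

# The same model through aov
fit_aov <- aov(growth ~ treatment)
print(summary(fit_aov))

# Special function for a single explanatory variable
res_oneway <- oneway.test(growth ~ treatment, var.equal = TRUE)
print(res_oneway)

# Non-parametric equivalent
res_kw <- kruskal.test(growth ~ treatment)
print(res_kw)

# With only two factor levels, the equal-variance t-test agrees with ANOVA:
# the squared t statistic equals the F statistic
two <- droplevels(subset(data.frame(growth, treatment), treatment != "high"))
res_t <- t.test(growth ~ treatment, data = two, var.equal = TRUE)
tab_two <- anova(lm(growth ~ treatment, data = two))
print(c(t_squared = unname(res_t$statistic)^2, F = tab_two[["F value"]][1]))
```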