Before we Start
- Use RStudio to write and run R programs.
- Use install.packages() to install packages (libraries).
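As a minimal sketch of the install step, the snippet below installs a package only if it is missing and then loads it. The package name tidyverse is an assumption chosen for illustration (it bundles the dplyr, tidyr and ggplot2 packages named later); any CRAN package name works the same way.

```r
# Install a package once per machine (needs an internet connection).
# "tidyverse" is only an example package name.
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Load the package in every new R session before using its functions.
library(tidyverse)
```

Installing happens once, while library() has to be called again in each new session.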
Introduction to R
- Access individual values by location using [].
- Access arbitrary sets of data using [c(...)].
- Use logical operations and logical vectors to access subsets of data.
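The three ways of subsetting listed above can be sketched on a small vector. The vector name and its values are made up for illustration.

```r
# A small example vector (values are invented).
weights <- c(21, 34, 39, 54, 55)

weights[2]            # single value by position: 34
weights[c(1, 3, 5)]   # arbitrary set of positions: 21 39 55

# A logical vector of the same length keeps the positions that are TRUE.
heavy <- weights > 35
weights[heavy]        # 39 54 55

# Logical operators combine conditions: & (and), | (or), ! (not).
weights[weights > 30 & weights < 55]   # 34 39 54
```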
Starting with Data
- Use read_csv() to read tabular data in R.
- Use factors to represent categorical data in R.
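A short sketch of both points, assuming the readr package is installed. To stay self-contained it first writes a tiny CSV to a temporary file; the column names and values are hypothetical.

```r
library(readr)

# Write a tiny CSV to a temporary file so the example runs on its own.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("village,wall_type,members",
             "A,brick,3",
             "B,mud,7",
             "C,brick,10"), csv_path)

# read_csv() returns a tibble; text columns are read as character, not factor.
survey <- read_csv(csv_path, show_col_types = FALSE)

# Convert a categorical column to a factor, stating the level order explicitly.
survey$wall_type <- factor(survey$wall_type, levels = c("mud", "brick"))

levels(survey$wall_type)    # "mud" "brick"
nlevels(survey$wall_type)   # 2
```

Giving levels explicitly fixes their order; without it, factor() sorts the levels alphabetically.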
Data Wrangling with dplyr
- Use the dplyr package to manipulate dataframes.
- Use select() to choose variables from a dataframe.
- Use filter() to choose data based on values.
- Use group_by() and summarize() to work with subsets of data.
- Use mutate() to create new variables.
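The verbs above can be sketched together on a small table. The households data frame and its columns are hypothetical, invented only to have something to operate on.

```r
library(dplyr)

# Hypothetical data for illustration.
households <- tibble(
  village = c("A", "A", "B", "B", "C"),
  members = c(3, 7, 10, 4, 6),
  rooms   = c(1, 2, 5, 2, 3)
)

# select(): keep only some columns.
select(households, village, members)

# filter(): keep rows that satisfy a condition.
filter(households, members > 5)

# mutate() adds a column computed from existing ones;
# group_by() + summarize() then produce one summary row per group.
village_summary <- households %>%
  mutate(people_per_room = members / rooms) %>%
  group_by(village) %>%
  summarize(mean_people_per_room = mean(people_per_room),
            n_households = n())
```

Chaining the steps with the pipe (%>%) avoids naming an intermediate data frame for each step.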
Data Wrangling with tidyr
- Use the tidyr package to change the layout of data frames.
- Use pivot_wider() to go from long to wide format.
- Use pivot_longer() to go from wide to long format.
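A minimal round trip between the two layouts might look like this. The data frame and its column names are hypothetical.

```r
library(tidyr)

# Hypothetical long-format data: one row per village-year pair.
long <- data.frame(
  village = c("A", "A", "B", "B"),
  year    = c(2016, 2017, 2016, 2017),
  yield   = c(10, 12, 8, 9)
)

# Long -> wide: one column per year, filled with the yield values.
wide <- pivot_wider(long, names_from = year, values_from = yield)

# Wide -> long: gather every column except village back into name/value pairs.
long_again <- pivot_longer(wide, cols = -village,
                           names_to = "year", values_to = "yield")
```

Note that year comes back as character in long_again, because column names are always text.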
Data Visualisation with ggplot2
- ggplot2 is a flexible and useful tool for creating plots in R.
- The data set and coordinate system can be defined using the ggplot function.
- Additional layers, including geoms, are added using the + operator.
- Boxplots are useful for visualizing the distribution of a continuous variable.
- Barplots are useful for visualizing categorical data.
- Faceting allows you to generate multiple plots based on a categorical variable.
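The plot types mentioned above can be sketched as follows. The data frame is hypothetical, and the plots are stored in variables so that each layer added with + is easy to see.

```r
library(ggplot2)

# Hypothetical data for illustration.
households <- data.frame(
  village   = rep(c("A", "B", "C"), each = 4),
  wall_type = rep(c("brick", "mud"), times = 6),
  members   = c(3, 7, 10, 4, 6, 5, 8, 2, 9, 4, 6, 7)
)

# ggplot() sets the data and aesthetic mappings; geoms are added with +.
box <- ggplot(households, aes(x = village, y = members)) +
  geom_boxplot()

# geom_bar() counts the rows in each category of a categorical variable.
bars <- ggplot(households, aes(x = wall_type)) +
  geom_bar()

# facet_wrap() draws one panel per level of a categorical variable.
faceted <- bars + facet_wrap(~ village)

print(faceted)   # draws the plot
```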
Writing Good Software
- Keep your project folder structured, organized and tidy.
- Document what and why, not how.
- Break programs into short single-purpose functions.
- Write re-runnable tests.
- Don’t repeat yourself.
- Be consistent in naming, indentation, and other aspects of style.
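As one possible illustration of short single-purpose functions, not repeating yourself, and re-runnable tests, consider the sketch below. The function names and the temperature example are hypothetical.

```r
# Convert a temperature from Fahrenheit to Celsius.
# The formula lives in one place, so it is never repeated elsewhere.
fahrenheit_to_celsius <- function(temp_f) {
  (temp_f - 32) * 5 / 9
}

# Mean of a vector of Fahrenheit readings, expressed in Celsius.
mean_celsius <- function(temps_f) {
  mean(fahrenheit_to_celsius(temps_f))
}

# Re-runnable checks: stopifnot() raises an error if any condition is FALSE.
test_conversions <- function() {
  stopifnot(
    fahrenheit_to_celsius(32) == 0,
    fahrenheit_to_celsius(212) == 100,
    isTRUE(all.equal(mean_celsius(c(32, 212)), 50))
  )
  invisible(TRUE)
}

test_conversions()
```

The comments say what each function is for; the code itself already shows how.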
Getting started with R Markdown (optional)
- R Markdown is a useful language for creating reproducible documents combining text and executable R code.
- Specify chunk options to control formatting of the output document.
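One way to set chunk options is through knitr, typically from a setup chunk near the top of the document. This is a sketch assuming the knitr package is installed; the option values are illustrative only.

```r
library(knitr)

# Set default options for every chunk in the document.
opts_chunk$set(
  echo = FALSE,      # hide the R source code in the output
  message = FALSE,   # hide package start-up messages
  fig.width = 6,     # figure width in inches
  fig.height = 4     # figure height in inches
)

# A single chunk can override a default in its own header, for example:
#   {r histogram, echo=TRUE}
```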