Chapter 3 Base-R
Congratulations! By this point you should have successfully installed R and RStudio. If not, see the previous chapter.
Specifically, you have installed “Base-R”. R is an extendable language, which means you can expand its vocabulary to solve problems in new and simpler ways. That is why it stays fresh and on the cutting edge.
In this tutorial, we’re going to start with the standard language: “Base-R”. It also provides the foundation for later topics. “Base-R” is by no means basic! There are many people who only use Base-R for the entirety of their projects. It’s Base-R, not Basic-R.
3.1 Objects and functions
Within R there are two fundamental concepts: objects and functions. Objects are a way of holding information. Functions are a way of manipulating objects. This may seem complicated, but the code below demonstrates the concept.
2+2
## [1] 4
QUESTION: In this example there are 2 objects and 1 function. Can you name them?
.
.
.
The objects are 2 and 2, and the function is +.
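One way to see that + really is a function is to call it the way other functions are called, with its inputs in brackets. This is only a small sketch; the backticks are needed because + is not an ordinary name.

```r
# The usual way of writing the sum
2 + 2

# The same calculation, calling + by name with the two objects as inputs
`+`(2, 2)

# Both forms give exactly the same answer
same_answer <- identical(2 + 2, `+`(2, 2))
same_answer
```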
To create custom objects in R we use <-, pronounced “assign”.
x <- 2+2
This would read as “assign the object x the value of 2+2”.
What do you see underneath the chunk this time? Does 2+2 no longer = 4?
When objects are assigned they usually do not return any output directly. Check the Environment pane of RStudio on the right-hand side of the screen.
Can you see something called x? What does the environment say about x?
You could also print this out more explicitly by submitting the name of the object in an R chunk.
x
## [1] 4
If you skipped the previous chunk (because it seemed obvious - you know what 2+2 is!) you might have got an error. You need to create an object x before you can look at it. If you missed running a previous chunk, or like to skip to the end, you can press the “run all chunks above” button to the left of the play button (a grey triangle above a green line).
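The same check can be made in code rather than in the Environment pane. As a hedged sketch, the base functions exists() and ls() report what has been created so far; the object name total is made up for this illustration.

```r
# Assign the result of 2+2 to a new object (nothing is printed)
total <- 2 + 2

# Ask R whether an object with that name exists
exists("total")

# List the names of the objects created so far in this session
ls()

# Print the value, either by typing the name or by calling print()
total
print(total)
```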
3.1.1 Naming objects
When giving things names in R it is better to be a bit more informative than using single letters (x, y, z etc.). We can give R objects any name we want to, but there are a few rules:
1. No punctuation except _ and .
2. No spaces
3. Only standard English alphanumeric characters - no accents
4. Names can include numbers but can’t start with numbers (or with _)
This is valid
sjkfjhskjdhsajsfgldsjghajfhljhgsdlk <- 2+2
But it is a terrible name - we want our names to be short, clear and memorable.
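The naming rules above can also be checked in code. As a sketch, base R's make.names() converts any text into a valid name, so a candidate name is acceptable as written when make.names() leaves it unchanged. The candidate names here are invented for illustration.

```r
# Some candidate object names, a mix of valid and invalid
candidates <- c("height_cm", "mean.weight", "my data", "2nd_sample", "sample2")

# make.names() repairs anything that breaks the naming rules
repaired <- make.names(candidates)

# A name is valid as written only if it did not need repairing
is_valid <- candidates == repaired
data.frame(candidate = candidates, valid = is_valid)
```

In this sketch "my data" fails because of the space and "2nd_sample" fails because it starts with a number.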
3.1.2 Errors
R is also case sensitive - try running this:
X
## Error in eval(expr, envir, enclos): object 'X' not found
Get used to that error! We have an object called x but we don’t have anything called X. Capitalisation and spelling are vital.
As a new user, remember that most of your initial errors are likely to be found by checking the B.S.Q.C. (brackets; spelling; quotation marks; case), or to be the result of problems with the order in which data or packages were loaded.
It’s also a good idea to avoid names which are already used elsewhere in R for functions, as this can cause problems with duplication and/or confusion.
R is big, so there are lots of names already in use. Sometimes a clash happens anyway, but try to avoid it as much as possible.
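To illustrate both points, here is a small sketch. tryCatch() is used only so that the error message can be shown without stopping the code, and exists() checks whether a name is already taken. The object names answer and plant_height are hypothetical.

```r
answer <- 2 + 2

# R treats answer and Answer as different names, so this lookup fails
error_text <- tryCatch(
  Answer,
  error = function(e) conditionMessage(e)
)
error_text

# Before choosing a name, check whether R already uses it for something
exists("mean")          # mean is already a function, so avoid this name
exists("plant_height")  # FALSE unless you have created this object yourself
```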
Speaking of functions…
3.2 Functions
A function is something that takes in a thing (input) and returns a thing (output). Nearly all functions get called like this
functionname(input)
Most functions also work like this
functionname(input_1, input_2, ...)
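For instance, round() is a base function that accepts either one input or two. A short sketch:

```r
# One input: round to the nearest whole number
round(3.14159)

# Two inputs: the number, then how many decimal places to keep
round(3.14159, 2)

# Inputs can also be named, which makes the call easier to read
round(3.14159, digits = 2)
```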
A super useful function is c(), which is everywhere in R but does not have a very informative name. It is short for combine, since it combines a bunch of stuff together.
c(1,2,3,4,5)
## [1] 1 2 3 4 5
Think of c() as taking the inputs and making a list from them (R calls this kind of list a vector). In our example, the first item in the list is 1, the second is 2, etc.
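A brief sketch of working with the result of c(): square brackets pick out items by position, and c() can also join an existing object to new values. The object names are made up for this example.

```r
# Combine five numbers into one object
small_numbers <- c(1, 2, 3, 4, 5)

# Pick out the first and second items by position
small_numbers[1]
small_numbers[2]

# Count how many items there are
length(small_numbers)

# c() can also combine an existing object with new values
more_numbers <- c(small_numbers, 6, 7)
more_numbers
```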
Question: Modify the chunk above so that this is assigned to an object called y.
You could now use another function to get the average of those numbers
mean(y)
## Error in mean(y): object 'y' not found
Remember - this line will only work if you successfully assigned the object y.
y<-c(1,2,3,4,5)
mean(y)
## [1] 3
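As a quick check of what mean() is doing, the same answer can be built from two other base functions, sum() and length(). This sketch defines its own copy of the numbers, under the made-up name values, so that it runs on its own.

```r
values <- c(1, 2, 3, 4, 5)

# The mean is the total divided by how many values there are
by_hand <- sum(values) / length(values)
by_hand

# mean() gives the same result in one step
mean(values)
```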
3.2.1 Help
Throughout this tutorial you will be introduced to a lot of new functions with even more options. We don’t expect you to memorise them. You are learning where to look.
You can get help within R by using a question mark followed by the name of the function.
?mean
The help files will list the options and a brief description. Once you know what you are looking for, it’s easy to read.
Most help files also include worked examples for each function, which are often more useful than the descriptions themselves. These can be run using example()
example(mean)
##
## mean> x <- c(0:10, 50)
##
## mean> xm <- mean(x)
##
## mean> c(xm, mean(x, trim = 0.10))
## [1] 8.75 5.50
But ? and example() only work if you know the name of the function!
There are other ways of checking within R for how to do things, but at this point you are much better off heading to Google.
https://www.google.co.uk/search?q=How+do+I+calculate+a+mean+in+R
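For reference, two of those other ways are sketched here. apropos() lists objects whose names contain some text, and ?? (short for help.search()) searches the help files of installed packages. The exact results depend on which packages are installed and loaded, so none are shown.

```r
# List everything R can currently see with "mean" in its name
found <- apropos("mean")
found

# Search the help files of all installed packages for a topic.
# These open a results page, so try them in an interactive session:
# ??mean
# help.search("mean")
```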
3.3 Packages
Up to this point, we’ve been using Base-R.
A really nice thing about R is that anyone can write extensions to R called packages. There are lots of really useful packages that make working with R much easier, others that let you make really nice graphics, or fit clever statistical models.
In this course we will use some of these packages, so let’s set up now by making sure we have everything we need. To get them we need to download them from the internet and install them into our version of R. To do so, run the following code:
install.packages(c("ggplot2","dplyr","openxlsx", "tidyr", "plotly"))
We will talk about these packages when we need them later - don’t worry too much about what each of them does for now.
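Installing only needs doing once per computer. As a hedged sketch (not part of the original course code), requireNamespace() can report which of these packages are already available, so that only the missing ones need installing. The object names needed, installed and not_installed are invented here.

```r
# The packages this course uses
needed <- c("ggplot2", "dplyr", "openxlsx", "tidyr", "plotly")

# requireNamespace() gives TRUE if a package is installed, FALSE otherwise
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
installed

# Keep only the names of packages that are not yet installed
not_installed <- needed[!installed]
not_installed

# To install just the missing ones (needs an internet connection), run:
# install.packages(not_installed)
```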
3.4 Quotation Marks
Also note that each of the packages in the previous line is in quotation marks. This is because they are not things that exist in our current session of R (yet).
Look at the difference between these two lines
x
## [1] 0 1 2 3 4 5 6 7 8 9 10 50
"x"
## [1] "x"
Something in quotation marks is treated literally as what is there. Something not in quotation marks is evaluated.
If we use the name of something not in quotation marks and it is not the name of something already in our R session, we get an error.
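A short sketch of that error, and of the quoted alternative. tryCatch() is used here only so that the error message can be displayed without halting the code, and the name not_made_yet is hypothetical: it is assumed not to exist in the session until the final lines create it.

```r
# In quotation marks: treated literally as text, so this always works
"not_made_yet"

# Without quotation marks: R looks for an object with this name and fails
lookup_result <- tryCatch(
  not_made_yet,
  error = function(e) conditionMessage(e)
)
lookup_result

# Once the object exists, the unquoted name is evaluated to its value
not_made_yet <- 2 + 2
not_made_yet
```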