2.5 Call functions - Video Tutorials & Practice Problems
Video duration: 4m
Using functions is an important part of R, and they can act differently depending on the type of input. Let's go ahead and create the vector x out of one through 10 and take the mean of this vector. So type in mean and just give it the vector, and it automatically treats it as one unit and computes the mean. The same could be said for sum. This will just add up all the elements. This is different than, say, nchar, which gives us an element-by-element operation. So it's important to understand which functions take in a vector as a whole for their first argument and which functions operate individually on each element of a vector. They are very different operations.

A function in R can have multiple arguments, and they are not always needed. So, for instance, let's look at mean. Type in mean and, using RStudio, once we have the parenthesis open we can hit the Tab key. That shows us the four different arguments available. The first one, x, is a vector, for instance one through 10. The second argument, ..., is for further arguments; it's a catch-all. We won't concern ourselves with that. You also have trim, which, as you can see, is another argument saying, "Hey, what percentage of observations do we want to take off from each end?" And na.rm, which means: if we have NAs, what should we do? We will illustrate that in a moment.

So for now, let's just say x gets, in our case, x and trim gets .1. That means we'll get rid of 10% of the observations on either side. And we can see that for this vector it doesn't make a difference: the values are evenly spaced, so trimming the same amount from both ends leaves the mean unchanged. It still gives us back 5.5. But it's just another option. If we had written it like this, (x, .1), R would have inferred that .1 was for trim and not for na.rm, because when R matches arguments, now let me pull this up again, it can infer them positionally: the first value I enter goes to the first argument and the second value I enter goes to the next one.
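The steps so far can be sketched in R as follows. This is only an illustration of what the video types at the console: the variable names on the left of each assignment and the extra vector y (which contains one outlier) are assumptions added here to make the effect of trim visible; they are not part of the original walkthrough.

```r
# Build the vector from the video: the integers 1 through 10.
x <- 1:10

# mean() and sum() treat the whole vector as one unit and return one value.
whole_mean <- mean(x)   # 5.5
whole_sum  <- sum(x)    # 55

# nchar() works element by element and returns one value per element.
per_element <- nchar(x) # 1 1 1 1 1 1 1 1 1 2

# Naming the argument and passing it positionally match the same parameter.
named_trim      <- mean(x, trim = 0.1)
positional_trim <- mean(x, 0.1)

# Hypothetical vector (not in the video) where trimming does change the result.
y <- c(1:9, 100)
plain_y   <- mean(y)             # 14.5
trimmed_y <- mean(y, trim = 0.1) # 5.5, the 1 and the 100 are dropped
```

On x the trimmed and untrimmed means agree, which is why the video sees 5.5 both times; on y the trimmed mean ignores the outlier.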
And yes, I see that there's this ... in between. That's a catch-all argument, and R still matches the rest positionally. If, however, I enter something like this, (x, na.rm=TRUE), it still gives us the same result, just due to the nature of the data we looked at, but R figured out that TRUE goes with the fourth argument, na.rm, because I named it. So you can specify arguments either positionally, which can lead to some ambiguity, or by explicitly naming them, as in na.rm=TRUE and so on.

Now, as long as we're talking about na.rm=TRUE, that's used when there are missing values. So let's get rid of this line and insert two missing values into x. We'll say the second element and the sixth element of x get NA. Looking at that, we have 1, NA, 3, 4, 5, NA, 7, 8, 9, 10. Calling mean on x now returns NA, because if there's even one NA in the vector it returns NA. Now, if we call mean(x, na.rm=TRUE), it removes those NAs and then computes the mean. So that's a helpful little trick to have.

So it's important to remember, when looking at functions, what type of arguments they take. Do they operate on a vector one element at a time? Or do they operate on a vector as a whole? And when you specify arguments to a function, it's important to specify them correctly. That could be positionally, but to remove any ambiguity it's also good to specify the name of the argument.
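A minimal sketch of the missing-value steps described above, assuming x starts out as 1 through 10 as in the video. The result variable names are illustrative, and the last call, with the arguments in reversed order, is an addition here to show that named arguments do not depend on position.

```r
# Start from the same vector as before.
x <- 1:10

# Set the second and sixth elements to NA (missing).
x[c(2, 6)] <- NA

# A single NA is enough to make the plain mean NA.
with_na <- mean(x)

# na.rm = TRUE drops the NAs first, then averages the remaining 8 values.
without_na <- mean(x, na.rm = TRUE)   # 47 / 8 = 5.875

# Named arguments can be given in any order; the result is the same.
reordered <- mean(na.rm = TRUE, x = x)
```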