3.3 Create and access information in matrices
Video duration: 8m
<v Voiceover>Matrices form the foundation</v> for much of modern math, including statistics, and are a very powerful tool for handling lots of data. They are similar to data frames in that they're rectangular and have rows and columns. They differ in that every element of a matrix has to be of the same type: for example all character or, more often, all numeric.
So let's get started by creating a few matrices. The first one we will create is a five row by two column matrix. We do that by saying, A gets matrix, which is the function for creating them. The first argument is all the data that's going in, so for us that'll be one through 10. Then we specify another argument, nrow equals five, and that gives us a matrix that looks like this. The numbers got put in column-wise: one, two, three, four, five. When it reached the number of rows, it started a new column: six, seven, eight, nine, 10. Let's create another matrix, B. This'll be 21 through 30, and again have five rows. And let's create one called C, which'll be 21 through 40 and this time have two rows. We can look at these and see that B has the same shape as A, and C is two rows by 10 columns. It's important to keep track of the number of rows and columns when working with matrices.
Similar to a data frame, we can check the number of rows and number of columns in a matrix. So let's do nrow of A, there's five. ncol of A, there's two. And dim of A gives us both.
Cleaning the screen to give us some room: just like with vectors, where you can perform operations element by element, the same can be done with matrices. So let's look at two of our matrices again, A and B. We add the first row, first column of A to the first row, first column of B, the second row, first column of A to the second row, first column of B, and so on. We do this simply by saying, A plus B. And now we see the numbers, where 22 is the sum of one plus 21, and so forth.
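The steps described so far can be sketched in R roughly as follows. This is a minimal sketch of what the voiceover describes, not a copy of the code on screen; the names S and mixed are hypothetical and do not appear in the video.

```r
# Build the three matrices; values are filled column-wise by default.
A <- matrix(1:10, nrow = 5)    # 5 rows, so 2 columns are inferred
B <- matrix(21:30, nrow = 5)   # same shape as A
C <- matrix(21:40, nrow = 2)   # 2 rows, so 10 columns are inferred

A         # first column holds 1..5, second column holds 6..10

# Inspect the shape of a matrix.
nrow(A)   # 5
ncol(A)   # 2
dim(A)    # 5 2
dim(C)    # 2 10

# Element-by-element addition (S is a hypothetical name).
S <- A + B
S[1, 1]   # 1 + 21 = 22

# Mixing types coerces every element to one common type
# (mixed is a hypothetical example, not from the video).
mixed <- matrix(c(1, "a", 2, "b"), nrow = 2)
typeof(mixed)   # "character"
```

Only nrow is given in each call, so R works out the number of columns from the length of the data.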
The same can be done with multiplication. This is the product of the first row, first column elements. This is the product of the second row, second column elements. All these element by element operations require matrices with the same dimensions: five rows and two columns here, or any other set of dimensions, as long as they are identical. We can even test element by element equality between matrices. So we can do, A double equals B, and it tells us that in this case none of the elements equals its corresponding element in the other matrix.
An important part of matrix algebra is taking the dot product of two matrices (matrix multiplication), or of two vectors as it may be. While the math is a little beyond the purpose of this course, taking a dot product is incredibly important in statistics and is used very often; it can actually speed up computations. The way it works is, you have a matrix on the left, and you're doing a special type of multiplication and summation with a matrix on the right. In order for this to work, however, the number of columns in the matrix on the left must be the same as the number of rows in the matrix on the right. Let's illustrate that with A and B. Let's look at the number of columns in A, which is two, and the number of rows in B, which is five. They are not equal. So what we will do is take the transpose of B. We do that using the t function, and it looks like this. Transposing flips the matrix so that rows become columns. By transposing B, we see that we have a two row by five column matrix.
So now we will take the dot product between A and the transpose of B. That is done using the special multiplication symbol, which is the percent sign, the asterisk, and the percent sign. So we write A, then that symbol, and then, remember, the transpose of B. That gives us a five by five matrix. The dimensions of the resulting matrix will be the number of rows of the left side matrix and the number of columns of the right side matrix. And so we see that in this case it happens to return a square matrix.
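As a rough illustration of the operations just described, the sketch below rebuilds A and B so it runs on its own. The names P, E, tB, and AB are hypothetical and chosen only for this example.

```r
A <- matrix(1:10, nrow = 5)
B <- matrix(21:30, nrow = 5)

# Element-by-element multiplication needs identical dimensions.
P <- A * B
P[1, 1]    # 1 * 21 = 21
P[2, 2]    # 7 * 27 = 189

# Element-by-element equality returns a logical matrix.
E <- A == B
any(E)     # FALSE: no element of A equals its counterpart in B

# Matrix multiplication needs ncol(left) == nrow(right).
ncol(A)    # 2
nrow(B)    # 5, so A %*% B would fail with a non-conformable error

# Transposing B makes the dimensions line up.
tB <- t(B)
dim(tB)    # 2 5

# The result has nrow(left) rows and ncol(right) columns.
AB <- A %*% tB
dim(AB)    # 5 5
AB[1, 1]   # 1 * 21 + 6 * 26 = 177
```

Note that the plain asterisk multiplies matching elements, while %*% performs the row-by-column multiplication and summation.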
We can see, if we do a dot product between A and C, that we are left with a five by 10 matrix, because, remember, C was two by 10.
Much like data frames, matrices can have column names and row names. To access these, use the functions colnames of A, which is NULL because we never assigned names, and rownames of A, which is also NULL because we never assigned anything. If we want to, we can assign names. Such as, colnames of A gets, and remember this takes a vector with as many elements as A has columns, so we'll call them left and right. And we will give it rownames, and we will call them 1st, 2nd, 3rd, 4th, and 5th. We can confirm this by looking at A, and we see it now has nicely named columns and rows. In a similar fashion, let's give some names to matrix B. So let's say, colnames of B gets first and second. And rownames of B gets, and this time we'll spell out the names, one, two, three, four, and five. And lastly, let's give C some names. There's a special variable in R called LETTERS, in all capitals, and that gives the whole uppercase alphabet. Likewise, lowercase letters gives you the lowercase alphabet. We will use this to give column names to C. So we will say, colnames of C gets LETTERS, and we take the first through 10th elements of that. And we'll say rownames of C gets top and bottom. If we look at C, we see A through J for the column names, and top and bottom for the row names.
Now looking back at A, we see left and right and 1st through 5th. If we take the transpose of A, the column names become row names, and the row names become column names. It gets interesting when we do the dot product between A and C. We'll do A percent star percent C. As you can see here, the row names came from the row names of A, and the column names came from the column names of C.
Matrices are great for numerical computations and can speed up many processes in R, either by doing element by element operations or by using matrix algebra such as dot products.
They take on a rectangular form much like data frames, but it is important to remember that each element in a matrix must be the same type.
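The naming steps from this lesson can be sketched as below. The exact spelling and capitalization of the names is an assumption, since the transcript only records how they were spoken; the names tA and AC are hypothetical.

```r
A <- matrix(1:10, nrow = 5)
B <- matrix(21:30, nrow = 5)
C <- matrix(21:40, nrow = 2)

# Names are NULL until they are assigned.
colnames(A)   # NULL
rownames(A)   # NULL

# Each vector needs one entry per column (or per row).
# The capitalization of these strings is assumed.
colnames(A) <- c("left", "right")
rownames(A) <- c("1st", "2nd", "3rd", "4th", "5th")

colnames(B) <- c("first", "second")
rownames(B) <- c("one", "two", "three", "four", "five")

# LETTERS is the built-in uppercase alphabet; letters is the lowercase one.
colnames(C) <- LETTERS[1:10]
rownames(C) <- c("top", "bottom")

# Transposing swaps row names and column names.
tA <- t(A)
rownames(tA)   # "left" "right"

# Row names come from the left matrix, column names from the right one.
AC <- A %*% C
dim(AC)        # 5 10
rownames(AC)   # "1st" "2nd" "3rd" "4th" "5th"
colnames(AC)   # "A" through "J"
```

Names can also be used for indexing, for example AC["1st", "A"], which can be easier to read than numeric positions.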