8.5 Use mutate to change or create columns
When you need to alter a data frame, dplyr provides the mutate function to do just that. Let's say we want to calculate price divided by carat. We say dia pipe mutate price divided by carat. We now have this new column: price divided by carat. But perhaps we don't want it to be called "price divided by carat." We want it to be called "ratio," so we can specify the name in the mutate function. We say dia pipe mutate ratio equals price divided by carat. And now we have this column nicely labeled "ratio."

Of course, we were just printing this. If we run dia, we see that the column's not there. Mutate doesn't actually change the data frame or the tbl unless you write back to it. We could do this the base-R way and say dia gets dia pipe mutate. Or we could take advantage of one of the other magrittr pipes. To do so we load magrittr and we say dia, then we do percent, left angle, right angle, percent. This is like an ordinary pipe, but when it's done, it assigns the new value back to whatever's on the left-hand side. So now we say mutate ratio equals price divided by carat. It didn't print anything out because it assigned the result, but if we look at dia, it now has that ratio stored in it.

A nice feature of mutate is that once you create a column with it, that column can be used in a subsequent expression in the same call. For instance, dia pipe mutate, we will say TotalSize equals x+y+z, comma, TwiceSize equals TotalSize times 2. So even though we had just created TotalSize right here, it was usable in the TwiceSize expression. Mutate allows you to easily make changes to a data frame, entire columns at a time.
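
The steps spoken in the transcript can be sketched in R roughly as follows. This is a minimal illustration rather than the exact code from the video: the transcript never shows how dia was built, so the small tibble below is a hypothetical stand-in that only carries the columns mentioned (price, carat, x, y, z), and the names dia_assigned and sized are introduced here purely for illustration.

```r
library(dplyr)     # mutate(), tibble(), and the ordinary %>% pipe
library(magrittr)  # the %<>% assignment pipe

# Hypothetical stand-in for dia (an assumption, not the video's data).
dia <- tibble(
  carat = c(0.5, 1.0, 2.0),
  price = c(1000, 5000, 16000),
  x = c(5.0, 6.5, 8.0),
  y = c(5.0, 6.5, 8.0),
  z = c(3.0, 4.0, 5.0)
)

# Unnamed expression: the new column is labeled with the expression itself.
dia %>% mutate(price / carat)

# Named expression: the new column is called ratio.
dia %>% mutate(ratio = price / carat)

# Neither call above changed dia; the results were only printed.
"ratio" %in% names(dia)  # FALSE

# Base-R style: assign the piped result to a name.
# (Writing to dia itself would be `dia <- dia %>% mutate(...)`.)
dia_assigned <- dia %>% mutate(ratio = price / carat)

# Assignment pipe: pipe dia into mutate() and write the result back to dia.
dia %<>% mutate(ratio = price / carat)

# A column created earlier in the same mutate() call can be reused later in it.
sized <- dia %>% mutate(TotalSize = x + y + z, TwiceSize = TotalSize * 2)
```

The explicit `<-` form and `%<>%` should leave dia in the same state; the assignment pipe is shorter, while the explicit form makes the overwrite easier to spot when reading code.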