13.1 Understand the folder structure and files in a package
13: Package Building
The package ecosystem is one of the great things about R. It allows anyone to extend the capabilities of R and reach a wide audience. Building your own package and contributing it back to CRAN is a really rewarding feeling. And today, thanks to packages such as devtools, it's never been easier.
There are a few things you need to know about a package, the first of which is the folder structure. It's very important that we get the structure right. At minimum there need to be two folders in this structure: an R folder, where you keep the code, and a man folder, where you keep documentation files. Essentially, a package is just a collection of folders, so we'll create a new one.
We will let RStudio do the work for us. So we come here: File, New Project, New Directory, and it lets us choose the type of project we want. Let's choose Package. There is an option for a Package with integrated C++ code, but let's start with a simple Package. We're going to call it Simple, just to be honest. We will put it under Consulting and let it make its own new folder. We won't create a git repository for this just yet. Let's go ahead and click Create.
It does a bunch of work and then it opens us up inside the new package. You can see it's very familiar, except that there's a new tab called Build. The Git tab is gone because we're not using git for this project, but Build is here to help you build the package.
Let's take a look at what folders were created. There are a number of things in here. First off there is a Read-and-delete-me file. You don't need to worry about that, it just gives you instructions. So let's follow its advice and delete it without reading. RStudio created the bare minimum of two folders: the R folder for all the R code and the man folder for all the documentation, and documentation is very important.
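As a rough sketch of the minimal layout described above, the same two folders can be created by hand with base R. The package name Simple comes from the video; placing it in a temporary directory is an assumption made here so the example does not touch a real project folder.

```r
# Sketch: create the minimal package skeleton by hand.
# "Simple" is the package name used in the video; tempdir() is an
# assumption so that nothing is written into a real project.
pkg_root <- file.path(tempdir(), "Simple")

# The two folders described as the bare minimum.
required_dirs <- c("R", "man")

for (d in required_dirs) {
  dir.create(file.path(pkg_root, d), recursive = TRUE, showWarnings = FALSE)
}

# List the folders that now exist directly under the package root.
created <- list.dirs(pkg_root, full.names = FALSE, recursive = FALSE)
print(created)
```

RStudio's New Project dialog does this and more, so the snippet is only meant to show that a package really is just a set of folders on disk.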
Other optional folders include the src folder, which is for compiled code such as C++ or Fortran. We're not going to worry about that for now. There's also the data folder, where you would store any data you want to include, and the inst folder, which is for files that you want to be included in the final installation. And lastly there's the tests folder. This is for code that you use to test your functions. It's always a good idea to test your functions, and this has again been made very easy by the testthat package.
Both packages I've mentioned so far that make package development easy, devtools and testthat, were written by Hadley Wickham, the author of plyr, ggplot2, lubridate and reshape2. He's very prolific.
But at the minimum you need the man folder and the R folder. This is the basic structure of an R package. Once you get these folders set up, you're ready to start working on the package files. At the bare minimum you need a DESCRIPTION file and a NAMESPACE file. Let's open those up in RStudio to see what they look like.
The DESCRIPTION file has a bunch of information about the package, and for this we will adjust our screen because we're not going to be using the Console so much. This file contains a number of tags. Some of them are required, some are not. The first tag is the name of the Package, in our case Simple, and it is case-sensitive. The Type is Package. There are really only two types, and the other type is for a GUI front-end to R. Don't worry about that, you're building a package. The Title is a brief description of what it does, and we'll say Simple Examples. The Version is the version number that you're at. This should actually be a three-part version: a major, a minor, then a patch. So you could iterate it 1.0.1, 1.0.2, 1.1.0, whatever you'd like. The Date is the last day it was modified. And the Author is who wrote it. Now the Maintainer has to be a person you can email and literally complain to.
CRAN is very specific about what the Maintainer field needs to be: a name, followed by an email address in angle brackets. I put mine under packages@jaredlander.com. You could write a Description, saying, for example, this is a simple example of writing packages. And lastly, it's important to declare a License. You need to decide how you're going to license your material. Common ones are MIT, GPL and BSD. We'll just go with BSD. It needs to be one of the valid license names or a note saying to check a license file. Let's go ahead and save this. This is the bare minimum of what we need. There are other fields we can use, and we will learn more about them as we go forward.
The NAMESPACE file lists the functions you are exporting. Right now it's exporting everything, which isn't necessarily what we really want to do, but it will work for us for now.
Other possible files you could have are a NEWS file, which details what has changed from one version of the package to the next, a LICENSE file, where you could have detailed information about the license, and a README file, which is a basic description of the package.
Now that you have your folder structure and files all set, the next step is to begin coding. The code all goes in .R files stored in the R folder, and that's what we will learn next.
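To recap the two required files, here is a minimal sketch that writes a DESCRIPTION and a NAMESPACE with base R and then reads the DESCRIPTION back. The author name, email address, date and version are placeholders, not values from the video. The license string `BSD_3_clause + file LICENSE` is one assumed way of spelling out "BSD" plus a license file, and the `exportPattern` line is an assumed example of a NAMESPACE that exports everything starting with a letter.

```r
# Sketch: write minimal DESCRIPTION and NAMESPACE files, then read them back.
# Name, email, date and version below are placeholders (assumptions).
pkg_root <- file.path(tempdir(), "Simple")
dir.create(pkg_root, recursive = TRUE, showWarnings = FALSE)

description <- c(
  "Package: Simple",
  "Type: Package",
  "Title: Simple Examples",
  "Version: 1.0.0",
  "Date: 2024-01-01",
  "Author: Jane Doe",
  "Maintainer: Jane Doe <jane.doe@example.com>",
  "Description: This is a simple example of writing packages.",
  "License: BSD_3_clause + file LICENSE"
)
writeLines(description, file.path(pkg_root, "DESCRIPTION"))

# Assumed "export everything" rule: export all objects whose names
# start with a letter.
writeLines('exportPattern("^[[:alpha:]]+")', file.path(pkg_root, "NAMESPACE"))

# DESCRIPTION uses the Debian control file format, which read.dcf() parses
# into a character matrix with one column per field.
fields <- read.dcf(file.path(pkg_root, "DESCRIPTION"))
print(colnames(fields))

# package_version() parses the major.minor.patch string so that versions
# can be compared properly.
version <- package_version(fields[[1, "Version"]])
print(version < package_version("1.0.1"))
```

Reading the file back with `read.dcf()` is a quick way to catch a malformed field before trying to build the package.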