21.1 Get started with igraph - Video Tutorials & Practice Problems
<v Voiceover>Social network analysis</v> is quickly becoming a very hot part of data science. Fortunately, R has many ways of working with networks. Perhaps one of the most popular is the igraph package, so go ahead and install it if you don't already have it, and once you do, let's load it up by saying require(igraph). We're going to create a simple little graph just to see how it works. We'll say g gets graph, and the graph function can create graphs in many different ways. We'll just start by entering a simple vector. The way this works is the first element points to the second, the third points to the fourth, the fifth points to the sixth, etc. So we'll say 1, 2, because the first node points to the second, and then I'm going to put a space here. I'm grouping these with spaces and no spaces just so it is easier to see. So now you can see that the first node goes to the second node, the first node goes to the third node, the second node goes to the third node also, and lastly the third node goes to the fifth node. Close off the vector and say n=5 to say that there are 5 nodes. If I just print g we get some text-based information, but it's much easier to understand if we plot it. So I will say plot(g). And now we can see we have this nice network graph where one goes to two, one goes to three, two goes to three, three goes to five, and four is disconnected from the graph because no node pointed to it and it didn't point to any node.
This was a very simple, trivial example. So let's go ahead and generate a larger graph using built-in functionality in igraph. We'll say g gets graph.tree and I'll give it the arguments 40 and 4. This specifies that we want 40 nodes and that we can have 4 children per node. Let's run this and let's go ahead and plot this graph.
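The two graphs built so far can be sketched as follows. This is a minimal sketch that assumes a recent igraph release, where make_graph() and make_tree() are the current names for the graph() and graph.tree() calls used in the video; the variable name tree is only illustrative.

```r
library(igraph)

# The edge vector is read in pairs: 1->2, 1->3, 2->3, 3->5.
# n = 5 makes sure vertex 4 exists even though no edge mentions it.
g <- make_graph(c(1,2, 1,3, 2,3, 3,5), n = 5)

vcount(g)        # 5 vertices, including the isolated vertex 4
ecount(g)        # 4 edges
degree(g)[4]     # 0, because vertex 4 has no edges
as_edgelist(g)   # one row per edge: from, to

# A regular tree with 40 vertices where each node gets up to 4 children.
tree <- make_tree(40, children = 4)
ecount(tree)     # 39, since a tree on 40 vertices has 39 edges

plot(g)
plot(tree)
```

Checking vcount(), ecount(), and as_edgelist() before plotting is a quick way to confirm that the vector was paired up the way you intended.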
And I will go ahead and expand this using the zoom feature in RStudio, and I'll adjust the plot a little, and we can see we have this nice network diagram showing the structure of the graph. The layout of these graphs makes a big difference to the interpretation. So fortunately, R offers many different ways to plot them. We can say plot(g, layout=layout.circle) and we get the circle-based layout. There are many others, such as fruchterman.reingold. We also have graphopt, which is plot(g, layout=layout.graphopt). And you can see these are just different ways of plotting a graph, and the layout can actually make a big difference in interpretation. So it is important that you use the correct layout. There's also the kamada.kawai layout, which again is plot(g, layout=layout.kamada.kawai). Just many different ways to plot the graph.
When working with network graphs it's often very helpful to be able to move the points around. So you can use tkplot to give us an interactive plotting pane. So we say tkplot(g, layout=layout.kamada.kawai). You see this plotting function takes similar arguments to the regular plotting function. If we run that, it loads the tcltk package and it opens up another window which now allows us to move points around. This can make a big difference in figuring out what's happening in your graph. It's a great feature that really helps network researchers quite a bit.
Going back to R, there is yet another function called rglplot. Let's take a look at that. First, let's go ahead and save the layout to the variable l. This way it doesn't have to be recomputed every time. So we say l gets layout.kamada.kawai, and again I use the autocomplete feature, and I pass it the graph. We now have the layout. We can now run rglplot. We pass in the graph and we say the layout is the layout we previously saved. Now you can see here, we get an error message. That's because we don't have the package rgl.
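The layout steps above can be sketched as follows. This assumes a recent igraph release, where layout_in_circle(), layout_with_fr(), layout_with_graphopt(), and layout_with_kk() are the current names for the layout.circle, layout.fruchterman.reingold, layout.graphopt, and layout.kamada.kawai functions mentioned in the video. The requireNamespace() guard is an addition for illustration, not something shown in the video; it avoids the missing-package error by checking for rgl first.

```r
library(igraph)

g <- make_tree(40, children = 4)

# Each layout function returns a matrix of coordinates, one row per vertex.
l_circle   <- layout_in_circle(g)
l_fr       <- layout_with_fr(g)
l_graphopt <- layout_with_graphopt(g)

# Save the Kamada-Kawai layout once so it is not recomputed for every plot.
l <- layout_with_kk(g)
dim(l)   # 40 rows (vertices) by 2 columns (x, y)

plot(g, layout = l_circle)
plot(g, layout = l)

# tkplot() and rglplot() open interactive windows, so only try them in an
# interactive session, and only call rglplot() if rgl is installed.
if (interactive()) {
  tkplot(g, layout = l)
  if (requireNamespace("rgl", quietly = TRUE)) {
    rglplot(g, layout = l)
  } else {
    message("Package 'rgl' is not installed; skipping rglplot().")
  }
}
```

Because the force-directed layouts start from random positions, saving the result in l also keeps the vertex positions the same from one plot to the next.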
Even though rglplot is included in the igraph package, we never installed the package it relies upon. So let's go ahead and do that now. Come over to packages, install rgl. Run that; the RStudio interface makes this very easy. We install that and then we should be good to go. Okay, now we come back and run this. It loads the package and we see that rglplot runs. It creates this external window which allows us to fly through the graph in a pseudo-3D manner. And flying through the graph really can give us much greater insight into the structure and what we're seeing. It could really help expand our understanding.
Back to R. Let's get back to the basic plotting function, where we can change the color of the points. So let's say plot(g, layout=l, vertex.color="cyan") and that changes the color of our plot. There are all sorts of options you can specify: the edge color, the vertex color, the thickness of the lines. Lots and lots of ways you can fully customize this.
And igraph is incredibly powerful. It can even do edges that loop back on the same vertex. Let's create a simple graph to see this. So again g gets graph. Let's say one goes to one, one goes to two, one goes to three, two goes to three, four goes to five, and we'll say that there are five nodes. We say print.igraph of g, and we'll say full=TRUE. This way we can see the full mapping. We can now see that one goes to one, one goes to two. Instead of just seeing this little summary here, we can see all the edges. Now let's plot it, plot(g). We now have a simple graph that I'll zoom in on, and we can see that four goes to five. We can see here that this is a directed graph. That's why the arrows point in a certain direction. And one actually loops back on itself. You can see the loop right here. One points to two and one also points to itself. It's very important for a graph to be able to loop back on itself.
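The self-loop example and the plot styling can be sketched together as follows. This assumes a recent igraph release (make_graph() in place of graph()); the particular colors and edge width are illustrative choices, not values from the video, and a circle layout is used here only to keep the small graph easy to read.

```r
library(igraph)

# Pairs: 1->1 (a self-loop), 1->2, 1->3, 2->3, 4->5.
g <- make_graph(c(1,1, 1,2, 1,3, 2,3, 4,5), n = 5)

# full = TRUE lists every edge instead of only the short summary.
print(g, full = TRUE)

is_directed(g)   # TRUE, which is why the plot draws arrows
which_loop(g)    # TRUE only for the first edge, the 1->1 loop
as_edgelist(g)   # one row per edge: from, to

# Vertex color, edge color, and line thickness are all plot() arguments.
l <- layout_in_circle(g)
plot(g, layout = l,
     vertex.color = "cyan",
     edge.color = "gray40",
     edge.width = 2)
```

which_loop() returns one logical value per edge, so it is a simple way to confirm that a loop was created on purpose rather than by a typo in the edge vector.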
Working with graphs in R can be easy if you use the right tools, and igraph is a great tool to get started with.