21.4 Use centrality measures - Video Tutorials & Practice Problems
Video duration: 5m
<v Voiveover>Centrality</v> is a very important measure for network analysis. Two particular types are eigenvector centrality and betweenness. Using the two in combination in a linear regression can help identify key players in a network. Let's take a look at this.

To calculate eigenvector centrality, we can use the function evCent. We run it on our flights network, and we are only really interested in the eigenvector itself. Here we can see the eigenvector centrality for each airport. We can also do the betweenness.

To see them together, let's go ahead and build a data frame of the two of them. We will say centrality gets data.frame, Betweenness equals betweenness of flights, Eigen equals evCent of flights, because again we only care about the eigenvector. We can then fit a regression of Eigen on Betweenness and store the residuals in the centrality data frame. So we will say centrality dollar resid gets lm of Eigen on Betweenness, data equals centrality, dollar residuals. Let's look at the data frame now. We have three columns: Betweenness, Eigen and resid.

To see how these help point out key players, let's plot them. First load ggplot2, then let's build up the plot: p gets ggplot, the data is centrality, the aesthetic mappings are x equals Betweenness, y equals Eigen, label equals rownames of centrality, color equals resid and size equals the absolute value of resid. Now we will just do p plus geom_text, and we see here this nice plot that I will zoom in on to reveal the different players.

It's nice seeing it displayed this way, but sometimes you still want the same information displayed as a graph. So, let's go ahead and do that. Before proceeding, though, we're going to save the layout of the graph to a variable, because the layout can be the most computationally intensive part of this process, so we don't want to do it again and again. We will say l gets layout.fruchterman.reingold, pass it flights and say niter equals 500.
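The centrality, regression and plotting steps described so far might look roughly like the following sketch. It is not the code from the video: the flights network is not available here, so a small made-up set of routes stands in for it, and eigen_centrality is used on the assumption that it is the current igraph name for what the narration calls evCent.

```r
library(igraph)
library(ggplot2)

# Hypothetical stand-in for the flights network: a few made-up routes.
routes <- data.frame(
  from = c("JFK", "JFK", "JFK", "ORD", "ORD", "LAX", "ATL", "ATL", "DEN"),
  to   = c("ORD", "LAX", "ATL", "LAX", "DEN", "SFO", "MIA", "DEN", "SFO")
)
flights <- graph_from_data_frame(routes, directed = FALSE)

# One row per airport; row names come from the vertex names.
# Only the $vector element of the eigenvector centrality result is kept.
centrality <- data.frame(
  Betweenness = betweenness(flights),
  Eigen       = eigen_centrality(flights)$vector
)

# Regress Eigen on Betweenness and store the residuals alongside.
centrality$resid <- lm(Eigen ~ Betweenness, data = centrality)$residuals
print(centrality)

# Text plot: airports far from the regression line get larger labels.
p <- ggplot(centrality,
            aes(x = Betweenness, y = Eigen,
                label = rownames(centrality),
                colour = resid, size = abs(resid)))
print(p + geom_text())
```

Airports with a large positive or negative residual have an eigenvector centrality that their betweenness alone does not explain, which is why they stand out in the plot.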
Just in case it doesn't converge, we stop it at a certain point. Next we'll make the size of the nodes the absolute value of the residuals times 10 so you can actually see them. So we say V of flights dollar size gets abs of centrality dollar resid times 10.

We'll also go through and pick out the nodes for which the absolute value of the residual is less than .10. We start off with nodes gets as.vector of V of flights. Then we say nodes, indexed by which abs of centrality dollar resid is less than .10, and set those to NA. We now have just the key players pointed out.

Let's go ahead and plot the graph now. We say plot flights, layout equals l, vertex.label equals nodes, so we'll only label the nodes that we have specified, vertex.label.dist equals .1, vertex.label.color equals red and edge.width equals 1. And of course it helps to have the correct spelling. So I plot this, and we can see our familiar network graph, but we're only labeling the nodes whose absolute residual is at least .10.
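The graph version of the plot could be sketched as below, again on a made-up stand-in for the flights network. Two deviations from the narration are assumptions: layout_with_fr is used as the current igraph name for layout.fruchterman.reingold, and the labels are taken from the vertex names rather than as.vector of V of flights, so that airport codes are shown instead of numeric ids.

```r
library(igraph)

# Hypothetical stand-in for the flights network: a few made-up routes.
routes <- data.frame(
  from = c("JFK", "JFK", "JFK", "ORD", "ORD", "LAX", "ATL", "ATL", "DEN"),
  to   = c("ORD", "LAX", "ATL", "LAX", "DEN", "SFO", "MIA", "DEN", "SFO")
)
flights <- graph_from_data_frame(routes, directed = FALSE)

# Same centrality table and residuals as described in the narration.
centrality <- data.frame(
  Betweenness = betweenness(flights),
  Eigen       = eigen_centrality(flights)$vector
)
centrality$resid <- lm(Eigen ~ Betweenness, data = centrality)$residuals

# Compute the layout once and reuse it; niter caps the iterations.
set.seed(1)  # the layout starts from random positions
l <- layout_with_fr(flights, niter = 500)

# Node size reflects how far each airport sits from the regression line.
V(flights)$size <- abs(centrality$resid) * 10

# Blank out labels for nodes with small residuals; NA labels are omitted.
nodes <- V(flights)$name
nodes[which(abs(centrality$resid) < 0.10)] <- NA

plot(flights, layout = l,
     vertex.label = nodes,
     vertex.label.dist = 0.1,
     vertex.label.color = "red",
     edge.width = 1)
```

Saving the layout in l means repeated calls to plot reuse the same node positions instead of recomputing them each time.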