2. This question will provide some insight into network analysis using a subset

Question

2. This question will provide some insight into network analysis using a subset of the Internet infrastructure from July 2006. You will use the R igraph package for this question. Aside from the included R documentation, a good additional web reference/tutorial is kateto.net/networks-r-igraph.

(a) (2 points) Read in the edgelist provided on Blackboard and plot the undirected network such that there are no vertex labels (vertex.label=NA) and each vertex is of size 1 (vertex.size=1). Use the plot layout layout.drl. This should require no more than 2 lines of R code.

(b) (4 points) Plot the cumulative degree distribution in a log-log scale plot such that the x-axis is ordered in increasing node degree (the number of edges connected to the node). This should require no more than 2 lines of R code.

(c) One important societal concern, especially for critical infrastructure such as the Internet, is the ability to sustain functionality under random node failure or targeted node attacks. In both of these cases the impacted nodes are essentially deleted from the network. Assume that network functionality is related to the size of the largest connected component (the largest set of nodes connected by edges such that a sequence of edges exists that connects any two nodes in the subgraph) of the residual network (i.e., the network remaining after deleting the nodes that failed or were attacked). The set of nodes to be removed will be denoted as R, and the following questions will examine different strategies for constructing it.

(2 points) Use the clusters function to confirm that the original network has one connected component.

Explanation / Answer
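As a rough orientation for the steps the question names (read an edge list, plot it, plot the cumulative degree distribution, count components), here is a minimal sketch. The Blackboard file is not available here, so a small random graph written to a temporary file stands in for it; edge_file and the 200-node size are assumptions made only for illustration. The sketch uses layout_with_drl and components(), which are the current igraph names for the layout.drl and clusters functions mentioned in the question, and it does not try to meet the two-line limits.

```r
library(igraph)

# Assumption: a small random graph written to a temporary file stands in for
# the Blackboard edge list, which is not available here.
set.seed(1)
edge_file <- tempfile(fileext = ".txt")
write_graph(sample_pa(200, directed = FALSE), edge_file, format = "edgelist")

# (a) Read the edge list as an undirected graph and plot it with no labels,
#     vertex size 1, and the DrL layout.
g <- read_graph(edge_file, format = "edgelist", directed = FALSE)
plot(g, vertex.label = NA, vertex.size = 1, layout = layout_with_drl)

# (b) Cumulative degree distribution; element i corresponds to degree i - 1.
#     Degree 0 is dropped because it cannot be drawn on a log axis.
cum_dd <- degree_distribution(g, cumulative = TRUE)
deg <- seq_along(cum_dd) - 1
plot(deg[deg > 0], cum_dd[deg > 0], log = "xy",
     xlab = "Degree", ylab = "Cumulative frequency")

# (c) Number of connected components of the original network.
n_comp <- components(g)$no
```

With the real data, edge_file would be the path to the downloaded edge list, and n_comp equal to 1 would confirm a single connected component.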

While I generate many (and often very creative) errors in R, there are three simple things that will most often go wrong for me. Those include:

Capitalization. R is case sensitive - a graph vertex named “Jack” is not the same as one named “jack”. The function rowSums won’t work if spelled as rowsums or RowSums.

Object class. While many functions are willing to take anything you throw at them, some will still surprisingly require a character vector or a factor instead of a numeric vector, or a matrix instead of a data frame. Functions will also occasionally return results in an unexpected format.

Package namespaces. Occasionally problems will arise when different packages contain functions with the same name. R may warn you about this by saying something like “The following object(s) are masked from ‘package:igraph’” as you load a package. One way to deal with this is to call functions from a package explicitly using ::. For instance, if function blah() is present in packages A and B, you can call A::blah and B::blah. In other cases the problem is more complicated, and you may have to load packages in a certain order, or not use them together at all. For example (and pertinent to this workshop), igraph and Statnet packages cause some problems when loaded at the same time. It is best to detach one before loading the other.
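The three pitfalls above can be checked quickly at the console. The following is a small illustrative sketch; the objects m, x, df and g are made up for the example and are not part of the original material.

```r
library(igraph)

# 1. Capitalization: function names are case sensitive.
m <- matrix(1:6, nrow = 2)
row_totals <- rowSums(m)         # row sums: 9 and 12
has_lower <- exists("rowsums")   # FALSE unless some loaded package defines it

# 2. Object class: check the class and convert before calling a picky function.
x <- c(3, 1, 2)
x_chr <- as.character(x)         # numeric vector -> character vector
df <- data.frame(a = 1:2, b = 3:4)
df_mat <- as.matrix(df)          # data frame -> matrix
class(x)
class(x_chr)
is.matrix(df_mat)

# 3. Namespaces: call a function from a specific package with ::
g <- igraph::make_ring(5)
deg <- igraph::degree(g)         # every vertex of a ring has degree 2

# Detach igraph before attaching a package that clashes with it.
detach("package:igraph")
```

Calls written with :: keep working after detach(), because :: looks the function up in the package namespace rather than on the search path.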

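The description of an igraph object starts with up to four letters: D or U for a directed or undirected graph; N for a named graph (the nodes have a name attribute); W for a weighted graph (the edges have a weight attribute); and B for a bipartite (two-mode) graph (the nodes have a type attribute).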

The two numbers that follow (7 5) refer to the number of nodes and edges in the graph. The description also lists node & edge attributes, for example:
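To see such a description, one can print a small graph. The sketch below builds a hypothetical undirected, named, weighted graph with 7 nodes and 5 edges; the node names and weights are invented for illustration and are not the tutorial's own example. Its printed header should contain the letters UNW- followed by 7 5, although the exact header layout can vary between igraph versions.

```r
library(igraph)

# Hypothetical data: 7 named nodes (two of them isolated) and 5 weighted edges.
nodes <- data.frame(name = LETTERS[1:7])
edges <- data.frame(from   = c("A", "A", "B", "C", "D"),
                    to     = c("B", "C", "C", "D", "E"),
                    weight = c(1, 2, 1, 3, 1))
g <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

print(g)   # header letters, node and edge counts, then attributes and edges

# The same information, queried piece by piece.
is_directed(g)        # D or U
is_named(g)           # N
is_weighted(g)        # W
is_bipartite(g)       # B
vcount(g)             # number of nodes
ecount(g)             # number of edges
vertex_attr_names(g)  # node attributes
edge_attr_names(g)    # edge attributes
```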

In the following sections of the tutorial, we will work primarily with two small example data sets. Both contain data about media organizations. One involves a network of hyperlinks and mentions among news sources. The second is a network of links between media venues and consumers. While the example data used here is small, many of the ideas behind the analyses and visualizations we will generate apply to medium and large-scale networks.

The first data set we are going to work with consists of two files, “Media-Example-NODES.csv” and “Media-Example-EDGES.csv”.

Examine the data:

Notice that there are more links than unique from-to combinations. That means we have cases in the data where there are multiple links between the same two nodes. We will collapse all links of the same type between the same two nodes by summing their weights, using aggregate() by “from”, “to”, & “type”. We don’t use simplify() here so as not to collapse different link types.
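The check and the collapse described above might look like the following sketch. Because the CSV files are not included here, a tiny hypothetical links data frame stands in for “Media-Example-EDGES.csv”; only the column names from, to, type and weight are taken from the text, and the rows are invented. The formula form of aggregate() is one way to sum the weights and is not necessarily the call used in the original tutorial.

```r
# Hypothetical stand-in for the edges file: one duplicated link (s01 -> s02,
# hyperlink) and one node pair joined by two different link types.
links <- data.frame(
  from   = c("s01", "s01", "s01", "s02", "s02"),
  to     = c("s02", "s02", "s03", "s03", "s03"),
  type   = c("hyperlink", "hyperlink", "mention", "hyperlink", "mention"),
  weight = c(10, 5, 2, 1, 4),
  stringsAsFactors = FALSE
)

# More rows than unique from-to pairs means some pairs have multiple links.
n_links <- nrow(links)
n_pairs <- nrow(unique(links[, c("from", "to")]))

# Sum the weights of links that share the same from, to, and type.
links_agg <- aggregate(weight ~ from + to + type, data = links, FUN = sum)
links_agg <- links_agg[order(links_agg$from, links_agg$to), ]
links_agg
```

Here the two s01 to s02 hyperlinks merge into one row with weight 15, while the hyperlink and the mention between s02 and s03 remain separate rows. That separation is the reason given above for not using simplify().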
