Section 7.1
Introduction

Angelina Jolie and Brad Pitt, Ben Affleck and Jennifer Garner, Harrison Ford and Calista Flockhart, Michael Douglas and Catherine Zeta-Jones, Tom Cruise and Katie Holmes, Richard Gere and Cindy Crawford (Image 7.1). An odd list, yet instantly recognizable to those immersed in the headline-driven world of celebrity couples. They are Hollywood stars that are or were married. Their weddings (and breakups) has drawn countless hours of media coverage and sold millions of gossip magazines. Thanks to them we take for granted that celebrities marry each other. We rarely pause to ask: Is this normal? In other words, what is the true chance that a celebrity marries another celebrity?

Hubs Dating Hubs.
Image 7.1
Hubs Dating Hubs
Celebrity couples, representing a highly visible proof that in social networks hubs tend to know, date and marry each other (Images from http://www.whosdatedwho.com).

Assuming that a celebrity could date anyone from a pool of about a hundred million (108) eligible individuals worldwide, the chances that their mate would be another celebrity from a generous list of 1,000 other celebrities is only 10-5. Therefore, if dating were driven by random encounters, celebrities would never marry each other.

Even if we do not care about the dating habits of celebrities, we must pause and explore what this phenomenon tells us about the structure of the social network. Celebrities, political leaders, and CEOs of major corporations tend to know an exceptionally large number of individuals and are known by even more. They are hubs. Hence celebrity dating (Image 7.1) and joint board memberships are manifestations of an interesting property of social network: hubs tend to have ties to other hubs.

As obvious this may sound, this property is not present in all networks. Consider for example the protein-interaction network of yeast, shown in Image 7.2. A quick inspection of the network reveals its scale-free nature: numerous one- and two-degree proteins coexist with a few highly connected hubs. These hubs, however, tend to avoid linking to each other. They link instead to many small-degree nodes, generating a hub-and-spoke pattern. This is particularly obvious for the two hubs highlighted in Image 7.2: they almost exclusively interact with small-degree proteins.

Hubs Avoiding Hubs.
Image 7.2
Hubs Avoiding Hubs

The protein interaction map of yeast. Each node corresponds to a protein and two proteins are linked if there is experimental evidence that they can bind to each other in the cell. We highlighted the two largest hubs, with degrees k = 56 and k′ = 13. They both connect to many small degree nodes and avoid linking to each other.

The network has N = 1,870 proteins and L = 2,277 links, representing one of the earliest protein interaction maps [1, 2]. Only the largest component is shown. Note that the protein interaction network of yeast in table 4.1 represents a later map, hence it contains more nodes and links than the network shown in this figure. Node color corresponds to the essentiality of each protein: the removal of the red nodes kills the organism, hence they are called lethal or essential proteins. In contrast the organism can survive without one of its green nodes. After [3].

A brief calculation illustrates how unusual this pattern is. Let us assume that each node chooses randomly the nodes it connects to. Therefore the probability that nodes with degrees k and k′ link to each other is

Equation (7.1) tells us that hubs, by the virtue of the many links they have, are much more likely to connect to each other than to small degree nodes. Indeed, if k and k′ are large, so is pk,k’ . Consequently, the likelihood that hubs with degrees k=56 and k’ = 13 have a direct link between them is pk,k’ = 0.16, which is 400 times larger than p1,2 = 0.0004, the likelihood that a degree-two node links to a degree-one node. Yet, there are no direct links between the hubs in Image 7.2, but we observe numerous direct links between small degree nodes.

Instead of linking to each other, the hubs highlighted in Image 7.2 almost exclusively connect to degree one nodes. By itself this is not unexpected: We expect that a hub with degree k = 56 should link to N1p1, 56 ≈ 12 nodes with k = 1. The problem is that this hub connects to 46 degree one neighbors, i.e. four times the expected number.

In summary, while in social networks hubs tend to “date” each other, in the protein interaction network the opposite is true: The hubs avoid linking to other hubs, connecting instead to many small degree nodes. While it is dangerous to derive generic principles from two examples, the purpose of this chapter is to show that these patterns are manifestations of a general property of real networks: they exhibit a phenomena called degree correlations. We discuss how to measure degree correlations and explore their impact on the network topology.

Section 7.2
Assortativity and Disassortativity

Just by the virtue of the many links they have, hubs are expected to link to each other. In some networks they do, in others they don’t. This is illustrated in Image 7.3, that shows three networks with identical degree sequences but different topologies:

In general a network displays degree correlations if the number of links between the high and low-degree nodes is systematically different from what is expected by chance. In other words, the number of links between nodes of degrees k and k′ deviates from (7.1).

Degree Correlation Matrix.
Image 7.3
Degree Correlation Matrix

(a,b,c) Three networks that have precisely the same degree distribution (Poisson pk), but display different degree correlations. We show only the largest component and we highlight in orange the five highest degree nodes and the direct links between them.

(d,e,f) The degree correlation matrix eij for an assortative (d), a neutral (e) and a disassortative network (f) with Poisson degree distribution, N=1,000, and 〈k〉=10. The colors correspond to the probability that a randomly selected link connects nodes with degrees k1 and k2.

(a,d) Assortative Networks
For assortative networks eij is high along the main diagonal. This indicates that nodes of comparable degree tend to link to each other: small-degree nodes to small-degree nodes and hubs to hubs. Indeed, the network in (a) has numerous links between its hubs as well as between its small degree nodes.

(b,e) Neutral Networks
In neutral networks nodes link to each other randomly. Hence the density of links is symmetric around the average degree, indicating the lack of correlations in the linking pattern.

(c,f) Disassortative Networks
In disassortative networks eij is higher along the secondary diagonal, indicating that hubs tend to connect to small-degree nodes and small-degree nodes to hubs. Consequently these networks have a hub and spoke character, as seen in (c).

The information about potential degree correlations is captured by the degree correlation matrix, eij, which is the probability of finding a node with degrees i and j at the two ends of a randomly selected link. As eij is a probability, it is normalized, i.e.

In (5.27) we derived the probability qk that there is a degree-k node at the end of the randomly selected link, obtaining

We can connect qk to eij via

In neutral networks, we expect

A network displays degree correlations if eij deviates from the random expectation (7.5). Note that (7.2) - (7.5) are valid for networks with an arbitrary degree distribution, hence they apply to both random and scale-free networks.

Perfect Assortativity.
Image 7.4
Perfect Assortativity
In a perfectly assortative network each node links only to nodes with the same degree. Hence ejk = δjkqk, where δjk is the Kronecker delta. In this case all non-diagonal elements of the ejk matrix are zero. The figure shows such a perfectly assortative network, consisting of complete k-cliques.

Given that eij encodes all information about potential degree correlations, we start with its visual inspection. Figures 7.3d,e,f show eij for an assortative, a neutral and a disassortative network. In a neutral network small and high-degree nodes connect to each other randomly, hence eij lacks any trend (Image 7.3e). In contrast, assortative networks show high correlations along the main diagonal, indicating that nodes predominantly connect to nodes with comparable degree. Therefore low-degree nodes tend to link to other low-degree nodes and hubs to hubs (Image 7.3d). In disassortative networks eij displays the opposite trend: it has high correlations along the secondary diagonal. Therefore high-degree nodes tend to connect to low-degree nodes (Image 7.3f).

In summary information about degree correlations is carried by the degree correlation matrix eij. Yet, the study of degree correlations through the inspection of eij has numerous disadvantages:

We therefore need to develop a more compact way to detect degree correlations. This is the goal of the subsequent sections.

Section 7.3
Measuring Degree Correlations

While eij contains the complete information about the degree correlations characterizing a particular network, it is difficult to interpret its content. In this section is to introduce the degree correlation function that offers a simpler way to quantify degree correlations.

Degree correlations capture the relationship between the degrees of nodes that link to each other. One way to quantify their magnitude is to measure for each node i the average degree of its neighbors (Image 7.5)

The degree correlation function calculates (7.6) for all nodes with degree k [4, 5]

where P(k’|k) is the conditional probability that following a link of a k-degree node we reach a degree-k' node. Therefore knn(k) is the average degree of the neighbors of all degree-k nodes.To quantify degree correlations we inspect the dependence of knn(k) on k.

Degree Correlation Function.
Image 7.6
Degree Correlation Function
The degree correlation function knn(k) for three real networks. The panels show knn(k) on a loglog plot to test the validity of the scaling law (7.10).
  1. Collaboration Network
    The increasing knn(k) with k indicates that the network is assortative.
  2. Power Grid
    The horizontal knn(k) indicates the lack of degree correlations, in line with (7.9) for neutral networks.
  3. Metabolic Network
    The decreasing knn(k) documents the network’s disassortative nature.

On each panel the horizontal line corresponds to the prediction (7.9) and the green dashed line is a fit to (7.10).

The behavior observed in Image 7.6 prompts us to approximate the degree correlation function with [4]

If the scaling (7.10) holds, then the nature of degree correlations is determined by the sign of the correlation exponent μ:

In summary, the degree correlation function helps us capture the presence or absence of correlations in real networks. The knn(k) function also plays an important role in analytical calculations, allowing us to predict the impact of degree correlations on various network characteristics (SECTION 7.6). Yet, it is often convenient to use a single number to capture the magnitude of correlations present in a network. This can be achieved either through the correlation exponent μ defined in (7.10), or using the degree correlation coefficient introduced in BOX 7.2.

Section 7.4
Structural Cutoffs

Throughout this book we assumed that networks are simple, meaning that there is at most one link between two nodes (Figure 2.17). For example, in the email network we place a single link between two individuals that are in email contact, despite the fact that they may have exchanged multiple messages. Similarly, in the actor network we connect two actors with a single link if they acted in the same movie, independent of the number of joint movies. All datasets discussed in Table 4.1 are simple networks.

In simple networks there is a puzzling conflict between the scale-free property and degree correlations [10, 11]. Consider for example the scalefree network of Image 7.7a, whose two largest hubs have degrees k = 55 and k' = 46. In a network with degree correlations ekk' the expected number of links between k and k' is

For a neutral network ekk' is given by (7.5), which, using (7.3), predicts

Therefore, given the size of these two hubs, they should be connected to each other by two to three links to comply with the network’s neutral nature. Yet, in a simple network we can have only one link between them, causing a conflict between degree correlations and the scale-free property. The goal of this section is to understand the origin and the consequences of this conflict.

For small k and k' (7.14) predicts that Ekk’ is also small, i.e. we expect less than one link between the two nodes. Only for nodes whose degree exceeds some threshold ks does (7.14) predict multiple links. As we show in ADVANCED TOPICS 7.B, ks, called structural cutoff, scales as

In other words, nodes whose degree exceeds (7.15) have Ekk’ > 1, a conflict that as we show below gives rise to degree correlations.

Structural Disassortativity.
Image 7.7
Structural Disassortativity
  1. A scale-free network with N=300, L=450, and γ=2.2, generated by the configuration model (Figure 4.15). By forbidding self-loops and multi-links, we made the network simple. We highlight the two largest nodes in the network. As (7.14) predicts, to maintain the network’s neutral nature, we need approximately three links between these two nodes. The fact that we do not allow multilinks (simple network representation) makes the network disassortative, a phenomena called structural disassortativity.
  2. To illustrate the origins of structural correlations we start from a fixed degree sequence, shown as individual stubs on the left. Next we randomly connect the stubs (configuration model). In this case the expected number of links between the nodes with degree 8 and 7 is 8x7/28 ≈ 2. Yet, if we do not allow multi-links, there can only be one link between these two nodes, making the network structurally disassortative.

In other words, nodes whose degree exceeds (7.15) have Ekk’ > 1, a conflict that as we show below gives rise to degree correlations.

To understand the consequences of the structural cutoff we must first ask if a network has nodes whose degrees exceeds (7.15). For this we compare the structural cutoff, ks, with the natural cutoff, kmax, which is the expected largest degree in a network. According to (4.18), for a scale-free network kmax ~ N1/γ-1 . Comparing kmax to ks allows us to distinguish two regimes:

Natural and Structural Cutoffs.
Image 7.8
Natural and Structural Cutoffs

The figure illustrates the tension between the scale-free property and degree correlations. We show the degree distribution (left panels) and the degree correlation function knn(k) (right panels) of a scale-free network with N = 10,000 and γ = 2.5, generated by the configuration model (Image 4.15).

(a,b) If we generate a scale-free network with the power-law degree distribution shown in (a), and we forbid self-loops and multilinks, the network displays structural disassortativity, as indicated by knn(k) in (b). In this case, we lack a sufficient number of links between the high-degree nodes to maintain the neutral nature of the network, hence for high k the knn(k) function must decay.

(c,d) We can eliminate structural disassortativity by relaxing the simple network requirement, i.e. allowing multiple links between two nodes. As shown in (c,d), in this case we obtain a neutral scale-free network.

(e,f) If we impose an upper cutoff by removing all nodes with kks ≃ 100, as predicted by (7.15), the network becomes neutral, as seen in (f).

We have two avenues to generate networks that are free of structural disassortativity:

  1. We can relax the simple network requirement, allowing multiple links between the nodes. The conflict disappears and the network will be neutral (Image 7.8c,d).
  2. If we insist having a simple scale-free network that is neutral or assortative, we must remove all hubs with degrees larger than ks. This is illustrated in Image 7.8e,f: a network that lacks nodes with k ≥ 100 is neutral.

Finally, how can we decide whether the correlations observed in a particular network are a consequence of structural disassortativity, or are generated by some unknown process that leads to degree correlations? Degree-preserving randomization (Image 4.17) helps us distinguish these two possibilities:

  1. Degree Preserving Randomization with Simple Links (R-S)
    We apply degree-preserving randomization to the original network and at each step we make sure that we do not permit more than one link between a pair of nodes. On the algorithmic side this means that each rewiring that generates multi-links is discarded. If the real knn(k) and the randomized knnR−S(k) are indistinguishable, then the correlations observed in a real system are all structural, fully explained by the degree distribution. If the randomized knn knnR−S(k) does not show degree correlations while knn(k) does, there is some unknown process that generates the observed degree correlations.
  2. Degree Preserving Randomization with Multiple Links (R-M)
    For a self-consistency check it is sometimes useful to perform degree-preserving randomization that allows for multiple links between the nodes. On the algorithmic side this means that we allow each random rewiring, even if it leads to multi-links. This process eliminates all degree correlations.

We performed the randomizations discussed above for three real networks. As Image 7.9a shows, the assortative nature of the scientific collaboration network disappears under both randomizations. This indicates that the assortative correlations of the collaboration network is not linked to its scale-free nature. In contrast, for the metabolic network the observed disassortativity remains unchanged under R-S (Image 7.9c). Consequently the disassortativity of the metabolic network is structural, being induced by its degree distribution.

Randomization and Degree Correlations.
Image 7.9
Randomization and Degree Correlations

To uncover the origin of the observed degree correlations, we must compare knn(k) (grey symbols), with knnR-S(k) and knnR-M(k) obtained after degree-preserving randomization. Two degree- preserving randomizations are informative in this context:

Randomization with Simple Links (R-S):
At each step of the randomization process we check that we do not have more than one link between any node pairs.

Randomization with Multiple Links (R-M):
We allow multi-links during the randomization processes.

We performed these two randomizations for the networks of Image 7.6. The R-M procedure always generates a neutral network, consequently knnR-M(k) is always horizontal. The true insight is obtained when we compare knn(k) with knnR-S(k), helping us to decide if the observed correlations are structural:

  1. Scientific Collaboration Network
    The increasing knn(k) differs from the horizontal knnR-S(k), indicating that the network’s assortativity is not structural. Consequently the assortativity is generated by some process that governs the network’s evolution. This is not unexpected: structural effects can generate only disassortativity, not assortativity.
  2. Power Grid
    The horizontal knn(k), knnR-S(k) and knnR-M(k) all support the lack of degree correlations (neutral network).
  3. Metabolic Network
    As both knn(k) and knnR-S(k) decrease, we conclude that the network’s disassortativity is induced by its scale-free property. Hence the observed degree correlations are structural.

In summary, the scale-free property can induce disassortativity in simple networks. Indeed, in neutral or assortative networks we expect multiple links between the hubs. If multiple links are forbidden (simple graph), the network will display disassortative tendencies. This conflict vanishes for scale-free networks with γ ≥ 3 and for random networks. It also vanishes if we allow multiple links between the nodes.

Section 7.5
Correlations in Real Networks

To understand the prevalence of degree correlations we need to inspect the correlations characterizing real networks. In Image 7.10 we show the knn(k) function for the ten reference networks, observing several patterns:

Randomization and Degree Correlations.
Image 7.10
Randomization and Degree Correlations
The degree correlation function knn(k) for the ten reference networks (Table 4.1). The grey symbols show the knn(k) function using linear binning; purple circles represent the same data using log-binning (SECTION 4.11). The green dotted line corresponds to the best fit to (7.10) within the fitting interval marked by the arrows at the bottom. Orange squares represent knnR-S(k) obtained for 100 independent degree-preserving randomizations, while maintaining the simple character of these networks. Note that we made directed networks undirected when we measured knn(k). To fully characterize the correlations emerging in directed networks we must use the directed correlation function (BOX 7.3).

In summary, Image 7.10 indicates that to understand degree correlations, we must always compare knn(k) to the degree randomized knnR-S(k). It also allows us to draw some interesting conclusions:

  1. Of the ten reference networks the power grid is the only truly neutral network. Hence most real networks display degree correlations.
  2. All networks that display disassortative tendencies (email, protein, metabolic) do so thanks to their scale-free property. Hence, these are all structurally disassortative. Only the WWW shows disassortative correlations that are only partially explained by its degree distribution.
  3. The degree correlations characterizing assortative networks are not explained by their degree distribution. Most social networks (mobile phone calls, scientific collaboration, actor network) are in this class and so is the Internet and the citation network

A number of mechanisms have been proposed to explain the origin of the observed assortativity. For example, the tendency of individuals to form communities, the topic of CHAPTER 9, can induce assortative correlations [12]. Similarly, the society has endless mechanisms, from professional committees to TV shows, to bring hubs together, enhancing the assortative nature of social and professional networks. Finally, homophily, a well documented social phenomena [13], indicates that individuals tend to associate with other individuals of similar background and characteristics, hence individuals with comparable degree tend to know each other. This degree-homophily may be responsible for the celebrity marriages as well (Image 7.1).

Section 7.6
Generating Correlated Networks

To explore the impact of degree correlations on various network characteristics we must first understand the correlations characterizing the network models discussed thus far. It is equally important to develop algorithm that can generate networks with tunable correlations. As we show in this section, given the conflict between the scale-free property and degree correlations, this is not a trivial task.

Degree Correlations in Static Models

Erdős-Rényi Model
The random network model is neutral by definition. As it lacks hubs, it does not develop structural correlations either. Hence for the Erdős-Rényi network knn(k) is given by (7.9), predicting μ = 0 for any 〈k〉 and N.

Configuration Model
The configuration model (Image 4.15) is also neutral, independent of the choice of the degree distribution pk. This is because the model allows for both multi-links and self-loops. Consequently, any conflicts caused by the hubs are resolved by the multiple links between them. If, however, we force the network to be simple, then the generated network will develop structural disassortativity (Image 7.8).

Hidden Parameter Model
In the model ejk is proportional to the product of the randomly chosen hidden variables ηj and ηk (Image 4.18). Consequently the network is technically uncorrelated. However, if we do not allow multi-links, for scalefree networks we again observe structural disassortativity. Analytical calculations indicate that in this case [18]

i.e. the degree correlation function follows (7.10) with μ = − 1.

Taken together, the static models explored so far generate either neutral networks, or networks characterized by structural disassortativity following (7.16).

Degree Correlations in Evolving Networks

To understand the emergence (or the absence) of degree correlations in growing networks, we start with the initial attractiveness model (SECTION 6.5), which includes as a special case the Barabási-Albert model

Initial Attractiveness Model
Consider a growing network in which preferential attachment follows (6.23), i.e. Π(k) ~ A + k, where A is the initial attractiveness. Depending on the value of A, we observe three distinct scaling regimes [15]:

  1. Disassortative Regime: γ < 3
    If − m < A < 0 we have Hence the resulting network is disassortative, knn(k) decaying following the power-law [15, 16]
  2. Neutral Regime: γ = 3
    If A = 0 the initial attractiveness model reduces to the Barabási-Albert model. In this case Consequently knn(k) is independent of k, hence the network is neutral.
  3. Weak Assortativity: γ > 3
    If A > 0 the calculations predict As knn(k) increases logarithmically with k, the resulting network displays a weak assortative tendency, but does not follow (7.10).

In summary, (7.17) - (7.20) indicate that the initial attractiveness model generates rather complex degree correlations, from disassortativity to weak assortativity. Equation (7.19) also shows that the network generated by the Barabási-Albert model is neutral. Finally, (7.17) predicts a power law k-dependence for knn(k), offering analytical support for the empirical scaling (7.10).

Bianconi-Barabási Model
With a uniform fitness distribution the Bianconi-Barabási model generates a disassortative network [5] (Image 7.12). The fact that the randomized version of the network is also disassortative indicates that the model's disassortativity is structural. Note, however, that the real knn(k) and the randomized knnR-S(k) do not overlap, indicating that the disassortativity of the model is not fully explained by its scale-free nature.

Correlations in the Bianconi-Barabási Model.
Image 7.12
Correlations in the Bianconi-Barabási Model
The degree correlation function of the Bianconi-Barabási model for N = 10,000, m = 3 and uniform fitness distribution (SECTION 6.2). As the green dotted line indicates, follwing (7.10) indicates, the network is disassortative, consistent with μ ≃ -0.5. The orange symbols correspond to knnR-S(k). As knnR-S(k) also decreases, the bulk of the observed disassortativity is structural. But the difference between knnR-S(k) and knn(k) suggests that structural effects cannot fully account for the observed degree correlation.

Tuning Degree Correlations

Several algorithms can generate networks with desired degree correlations [8, 17, 18]. Next we discuss a simplified version of the algorithm proposed by Xalvi-Brunet and Sokolov that aims to generate maximally correlated networks with a predefined degree sequence [19, 20, 21]. It consists of the following steps (Image 7.13a):

Xulvi-Brunet & Sokolov Algorithm.
Image 7.13
Xulvi-Brunet & Sokolov Algorithm

The algorithm generates networks with maximal degree correlations.

(a) The basic steps of the algorithm.

(b) knn(k) for networks generated by the algorithm for a scale-free network with N = 1,000, L = 2,500, γ = 3.0.

(c, d) A typical network configuration and the corresponding Aij matrix for the maximally assortative network generated by the algorithm, where the rows and columns of Aij were ordered according to increasing node degrees k.

(e,f) Same as in (c,d) for a maximally disassortative network.

The Aij matrices (d) and (f) capture the inner regularity of networks with maximal correlations, consisting of blocks of nodes that connect to nodes with similar degree in (d) and of blocks of nodes that connect to nodes with rather different degrees in (f).

By iterating these steps we gradually enhance the network’s assortative (Step 2A) or disassortative (Step 2B) features. If we aim to generate a simple network (free of multi-links), after Step 2 we check whether the particular rewiring leads to multi-links. If it does, we reject it, returning to Step 1.

The correlations characterizing the networks generated by this algorithm converge to the maximal (assortative) or minimal (disassortative) value that we can reach for the given degree sequence (Image 7.13b). The model has no difficulty creating disassortative correlations (Image 7.13e,f). In the assortative limit simple networks display a mixed knn(k): assortative for small k and disassortative for high k (Image 7.13b). This is a consequence of structural cutoffs: For scale-free networks the system is unable to sustain assortativity for high k. The observed behavior is reminiscent of the knn(k) function of citation networks (Image 7.10j).

The version of the Xalvi-Brunet & Sokolov algorithm introduced in Image 7.13 generates maximally assortative or disassortative networks. We can tune the magnitude of the generated degree correlations if we use the algorithm discussed in Image 7.14.

In summary, static models, like the configuration or hidden parameter model, are neutral if we allow multi-links, and develop structural disassortativity if we force them to generate simple networks. To generate networks with tunable correlations, we can use for example the Xalve-Brunet & Sokolov algorithm. An important result of this section is (7.16) and (7.18), offering the analytical form of the degree correlation function for the hidden paramenter model and for a growing network, in both case predicting a power-law k-dependence. These results offer analytical backing for the scaling hypothesis (7.10), indicating that both structural and dynamical effects can result in a degree correlation function that follows a power law.

Tuning Degree Correlations.
Image 7.14
Tuning Degree Correlations

We can use the Xalvi-Brunet & Sokolov algorithm to tune the magnitude of degree correlations.

  1. We execute the deterministic rewiring step with probability p, and with probability 1 − p we randomly pair the a, b, c, d nodes with each other. For p = 1 we are back to the algorithm of Image 7.13, generating maximal degree correlations; for p < 1 the induced noise tunes the magnitude of the effect.
  2. Typical network configurations generated for p = 0.5.
  3. The knn(k) functions for various p values for a network with N = 10,000, 〈k〉 = 1, and γ = 3.0.

Note that the correlation exponent μ depends on the fitting region, especially in the assortative case.

Section 7.7
The Impact of Degree Correlations

As we have seen in Image 7.10, most real networks are characterized by some degree correlations. Social networks are assortative; biological networks display structural disassortativity. These correlations raise an important question: Why do we care? In other words, do degree correlations alter the properties of a network? And which network properties do they influence? This section addresses these important questions.

An important property of a random network is the emergence of a phase transition at 〈k〉 = 1, marking the appearance of the giant component (SECTION 3.6). Image 7.15 shows the relative size of the giant component for networks with different degree correlations, documenting several patterns [8, 19, 20]:

Degree Correlations and the Phase Transition Point.
Image 7.15
Degree Correlations and the Phase Transition Point
Relative size of the giant component for an Erdős-Rényi network of size N=10,000 (green curve), which is then rewired using the Xalvi-Brunet & Sokolov algorithm with p = 0.5 to induce degree correlations (Image 7.14). The figure indicates that as we move from assortative to disassortative networks, the phase transition point is delayed and the size of the giant component increases for large 〈k〉. Each point represents an average over 10 independent runs.

These changes in the size and the structure of the giant component have implications to the spread of diseases [22, 23, 24], the topic of CHAPTER 10. Indeed, as we have seen in Image 7.10, social networks tend to be assortative. The high degree nodes therefore form a giant component that acts as a “reservoir” for the disease, sustaining an epidemic even when on average the network is not sufficiently dense for the virus to persist.

The altered giant component has implications for network robustness as well [25]. As we discuss in CHAPTER 8, the removal of a network's hubs fragments a network. In assortative networks hub removal makes less damage because the hubs form a core group, hence many of them are redundant. Hub removal is more damaging in disassortative networks, as in these the hubs connect to many small-degree nodes, which fall off the network once a hub is deleted.

Let us mention a few additional consequences of degree correlations:

Degree Correlations and Path Lengths.
Image 7.16
Degree Correlations and Path Lengths
Distance distribution for a random network with size N = 10, 000 and 〈k〉 = 3. Correlations are induced using the Xalvi-Brunet & Sokolov algorithm with p = 0.5 (Image 7.14). The plots show that as we move from disassortative to assortative networks, the average path length decreases, indicated by the gradual move of the peaks to the left. At the same time the diameter, dmax, grows. Each curve represents an average over 10 independent networks.

In summary, degree correlations are not only of academic interest, but they influence numerous network characteristics and have a discernable impact on many processes that take place on a network.

Section 7.8
Summary

Degree correlations were first discovered in 2001 in the context of the Internet by Romualdo Pastor-Satorras, Alexei Vazquez, and Alessandro Vespignani [4, 5], who also introduced the degree correlation function knn(k) and the scaling (7.10). A year later Kim Sneppen and Sergey Maslov used the full p(ki,kj), related to the eij matrix, to characterize the degree correlations of protein interaction networks [32]. In 2003 Mark Newman introduced the degree correlation coefficient [8, 9] together with the assortative, neutral, and disassortative distinction. These terms have their roots in social sciences [13]:

Assortative mating reflects the tendency of individuals to date or marry individuals that are similar to them. For example, low-income individuals marry low-income individuals and college graduates marry college graduates. Network theory uses assortativity in the same spirit, capturing the degree-based similarities between nodes: In assortative networks hubs tend to connect to other hubs and small-degree nodes to other small-degree nodes. In a network environment we can also encounter the traditional assortativity, when nodes of similar properties link to each other (Image 7.18).

Disassortative mixing, when individuals link to individuals wo are unlike them, is also common in some social and economic systems. Sexual networks are perhaps the best example, as most sexual relationships are between individuals of different gender. In economic settings trade typically takes place between individuals of different skills: the baker does not sell bread to other bakers, and the shoemaker rarely fixes other shoemaker's shoes.

Taken together, there are several reasons why we care about degree correlations in networks (BOX 7.5):

Politics is Never Neutralr.
Image 7.18
Politics is Never Neutral
The network behind the US political blogosphere illustrates the presence of assortative mixing, as used in sociology, meaning that nodes of similar characteristics tend to link to each other. In the map each blue node corresponds to liberal blog and red nodes are conservative. Blue links connect liberal blogs, red links connect conservative blogs, yellow links go from liberal to conservative, and purple from conservative to liberal. As the image indicates, very few blogs link across the political divide, demonstrating the strong assortativity of the political blogosphere.
After [33].

Despite the considerable effort devoted to characterizing degree correlations, our understanding of the phenomena remains incomplete. For example, while in SECTION 7.6 we offered an algorithm to tune degree correlations, the problem is far from being fully resolved. Indeed, the most accurate description of a network's degree correlations is contained in the eij matrix. Generating networks with an arbitrary eij remains a difficult task.

Finally, in this chapter we focused on the knn(k) function, which captures two-point correlations. In principle higher order correlations are also present in some networks (BOX 7.6). The impact of such three or four point correlations remains to be understood.

Section 7.9
Homework

  1. Detailed Balance for Degree Correlations

    Express the joint probability ekk', the conditional probability P(k'|k) and the probability qk, discussed in this chapter, in terms of number of nodes N, average degree 〈k〉, number of nodes with degree k, Nk, and the number of links connecting nodes of degree k and k', Ekk' (note that Ekk' is twice the number of links when k = k'). Based on these expressions, show that for any network we have

  2. Star Network

    Consider a star network, where a single node is connected to N – 1 degree one nodes. Assume that N≫1.

    1. What is the degree distribution pk of this network?
    2. What is the probability qk that moving along a randomly chosen link we find at its end a node with degree k?
    3. Calculate the degree correlation coefficient r for this network. Use the expressions of ekk' and P(k'|k) calculated in HOMEWORK 7.1.
    4. Is this network assortative or disassortative? Explain why.

  3. Structural Cutoffs

    Calculate the structural cutoff ks for the undirected networks listed in Table 4.1. Based on the plots in Image 7.10, predict for each network whether ks is larger or smaller than the maximum expected degree kmax. Confirm your prediction by calculating kmax.

  4. Degree Correlations in Erdős-Rényi Networks

    Consider the Erdős-Rényi G(N,L) model of random networks, introduced in CHAPTER 2 (BOX 3.1 and SECTION 3.2), where N labeled nodes are connected with L randomly placed links. In this model, the probability that there is a link connecting nodes i and j depends on the existence of a link between nodes l and s.

    1. Write the probability that there is a link between i and j, eij and the probability that there is a link between i and j conditional on the existence of a link between l and s.
    2. What is the ratio of such two probabilities for small networks? And for large networks?
    3. What do you obtain for the quantities discussed in (a) and (b) if you use the Erdős-Rényi G(N,p) model?

    Based on the results found for (a)-(c) discuss the implications of using the G(N,L) model instead of the G(N,p) model for generating random networks with small number of nodes.

Section 7.10
Advanced Topic 7.A
Degree Correlation Coefficient

In BOX 7.2 we defined the degree correlation coefficient r as an alternative measure of degree correlations [8, 9]. The use of a single number to characterize degree correlations is attractive, as it offers a way to compare the correlations observed in networks of different nature and size. Yet, to effectively use r we must be aware of its origin.

The hypothesis behind the correlation coefficient r implies that the knn(k) function can be approximated by the linear function

This is different from the scaling (7.10), which assumes a power law dependence on k. Equation (7.21) raises several issues:

Network N r μ
Internet 192,244 0.02 0.56
WWW 325,729 -0.05 -1.11
Power Grid 4,941 0.003 0.0
Mobile Phone Calls 36,595 0.21 0.33
Email 57,194 -0.08 -0.74
Science Collaboration 23,133 0.13 0.16
Actor Network 702,388 0.31 0.34
Citation Network 449,673 -0.02 -0.18
E. Coli Metabolism 1,039 -0.25 -0.76
Protein Interactions 2,018 0.04 -0.1
Table 7.1
Degree Correlations in Reference Networks
The table shows the estimated r and μ for the ten reference networks. Directed networks were made undirected to measure r and μ. Alternatively, we can use the directed correlation coefficient to characterize such directed networks (BOX 7.8).ures.
Degree Correlation Function.
Image 7.19
Degree Correlation Function
The degree correlation function knn(k) for three real networks. The left panels show the cumulative function knn(k) on a log-log plot to test the validity of (7.10). The right panels show knn(k) on a lin−lin plot to test the validity of (7.21), i.e. the assumption that knn(k) depends linearly on k. This is the hypothesis behind the correlation coefficient r. The slope of the dotted line corresponds to the correlation coefficient r. As the lin-lin plots on the right illustrate, (7.21) offers a poor fit for both assortative and disassortative networks.

Relationship Between μ and r

On the positive side, r and μ are not independent of each other. To show this we calculated r and μ for the ten reference networks (TABLE 7.1). The results are plotted in Image 7.20, indicating that μ and r correlate for positive r. Note, however, that this correlation breaks down for negative r. To understand the origin of this behavior, next we derive a direct relationship between μ and r. To be specific we assume the validity of (7.10) and determine the value of r for a network with correlation exponent μ.

We start by determining a from (7.10). We can write the second moment of the degree distribution as

which leads to

We now calculate r for a network with a given μ:

For μ = 0 the term in the last parenthesis vanishes, obtaining r = 0. Hence if μ = 0 (neutral network), the network will be neutral based on r as well. For k > 1 (7.22) suggests that for μ > 0 the parenthesis is positive, hence r > 0, and for μ < 0 the parenthesis is negative, hence r < 0. Therefore r and μ predict degree correlations of similar kind.

Correlation Between r and N.
Image 7.20
Correlation Between r and N
To illustrate the relationship between r and μ, we estimated μ by fitting the knn(k) function to (7.10), whether or not the power law scaling was statistically significant.

In summary, if the degree correlation function follows (7.10), then the sign of the degree correlation exponent μ will determine the sign of the coefficient r:

Directed Networks

To measure correlations in directed networks we must take into account that each node i is characterized by an incoming kiin and an outgoing em>kiout degree. We therefore define four degree correlation coefficients, rin,in, rin,out, rout,in, rout,out, capturing all possible combinations between the incoming and outgoing degrees of two connected nodes (Image 7.21a-d). Formally we have [14]

where α and β refer to the in and out indices and q←jα in the probability of finding a node with α-degree j by following a random link backward and q→kβ in the probability of finding a β-link with degree k by following a random link forward. σα and σβ are the corresponding standard deviations. To illustrate the use of (7.23), in Image 7.21e we show the four correlation coefficients for the five directed reference networks (TABLE 7.1). Note, however, that for a complete characterization of degree correlations it is desirable to measure the four knn(k) functions as well (BOX 7.3)

In summary, the degree correlation coefficient assumes that knn(k) scales linearly with k, a hypothesis that lacks numerical and analytical support. Analytical calculations predict the power-law form (7.10) or the weaker logarithmic dependence (7.20). Yet, in general the sign of r and μ do agree. Consequently, we can use r to get a quick sense of the nature of the potential correlations present in a network. Yet, the accurate characterization of the underlying degree correlations requires us to measure knn(k).

Directed Correlation.
Image 7.21
Directed Correlation

(a)-(d) The purple and green links indicate the α, β indices that define the appropriate correlation coefficient for a directed network.

(e) The correlation profile of the five directed networks. While citation networks have negligible correlations, all four correlation coefficients document strong assortative behavior for mobile phone calls and strong disassortative behavior for metabolic networks. The case of the WWW is interesting: while three of its correlation coefficients are close to zero, there is a strong assortative tendency for the in-out degree combination.

Section 7.11
Advanced Topic 7.B
Structural Cutoffs

As discussed in SECTION 7.4, the fundamental conflict between the scalefree property and degree correlations leads to a structural cutoff in simple networks. In this section we derive (7.15), calculating how the structural cutoff depends on the system size N [11].

We start by defining

where Ekk′ is the number of links between nodes of degrees k and k’ for kk’ and twice the number of connecting links for k=k’, and

is the largest possible value of Ekk′. The origin of (7.25) is explained in Image 7.22. Consequently, we can write rkk’ as

As mkk’ is the maximum of Ekk’, we must have rkk’ ≤ 1 for any k and k’. Strictly speaking, in simple networks degree pairs for which rkk’ > 1 cannot exist. Yet, for some networks and for some k, k’ pairs rkk’ is larger than one. This is clearly non-physical and signals some conflict in the network configuration. Hence, we define the structural cutoff ks as the solution of the equation

Note that as soon as k > Npk’ and k’ > Npk , the effects of the restriction on the multiple links are felt, turning the expression for rkk′ into

For scale-free networks these conditions are fulfilled in the region k, k’ > (aN)1/(γ+1), where a is a constant that depends on pk. Note that this value is below the natural cutoff. Consequently this scaling provides a lower bound for the structural cutoff, in the sense that whenever the cutoff of the degree distribution falls below this limit, the condition rkk’ < 1 is always satisfied.

For neutral networks the joint distribution factorizes as

Hence, the ratio (7.28) becomes

Therefore, the structural cutoff needed to preserve the condition rkk’ ≤ 1 has the form [11, 34, 35, 36]

which is (7.15). Note that (7.31) is independent of the degree distribution of the underlying network. Consequently, for a scale-free network ks(N) is independent of the degree exponent γ.

Calculating mkk'.
Image 7.22
Calculating mkk'

The maximum number of links one can have between two groups. The figure shows two groups of nodes, with degree k=3 and k’=2. The total number of links between these two groups must not exceed:

  1. The total number of links available in k=3 group, which is kNk=9.
  2. The total number of links available in k’=2 group, which is k’Nk’=8.
  3. The total number of links one can potentially place between the two groups, which is NkNk’.

In the example shown above the smallest of the three is k’Nk'= 8 of (b).

Section 7.12
Bibliography

[1] P. Uetz, L. Giot, G. Cagney, T. A. Mansfield, RS Judson, JR Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart, A. Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch, G. Vijayadamodar, M. Yang, M. Johnston, S. Fields, J. M. Rothberg. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 403: 623–627, 2000.

[2] I. Xenarios, D. W. Rice, L. Salwinski, M. K. Baron, E. M. Marcotte, D. Eisenberg. DIP: the database of interacting proteins. Nucleic Acids Res., 28: 289–29, 2000.

[3] H. Jeong, S.P. Mason, A.-L. Barabási, and Z.N. Oltvai. Lethality and centrality in protein networks. Nature, 411: 41-42, 2001.

[4] R. Pastor-Satorras, A. Vázquez, and A. Vespignani. Dynamical and correlation properties of the Internet. Phys. Rev. Lett., 87: 258701, 2001.

[5] A. Vazquez, R. Pastor-Satorras, and A. Vespignani. Large-scale topological and dynamical properties of Internet. Phys. Rev., E 65: 066130, 2002.

[6] S.L. Feld. Why your friends have more friends than you do. American Journal of Sociology, 96: 1464–1477, 1991.

[7] E.W. Zuckerman and J.T. Jost. What makes you think you’re so popular? Self evaluation maintenance and the subjective side of the “friendship paradox”. Social Psychology Quarterly, 64: 207–223, 2001.

[8] M. E. J. Newman. Assortative mixing in networks. Phys. Rev. Lett., 89: 208701, 2002.

[9] M. E. J. Newman. Mixing patterns in networks. Phys. Rev. E, 67: 026126, 2003.

[10] S. Maslov, K. Sneppen, and A. Zaliznyak. Detection of topological pattern in complex networks: Correlation profile of the Internet. Physica A, 333: 529-540, 2004.

[11] M. Boguna, R. Pastor-Satorras, and A. Vespignani. Cut-offs and finite size effects in scale-free networks. Eur. Phys. J. B, 38: 205, 2004.

[12] M. E. J. Newman and Juyong Park. Why social networks are different from other types of networks. Phys. Rev. E, 68: 036122, 2003.

[13] M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: homophily in social networks. Annual Review of Sociology, 27:415-444, 2001.

[14] J. G. Foster, D. V. Foster, P. Grassberger, and M. Paczuski. Edge direction and the structure of networks. PNAS, 107: 10815, 2010.

[15] A. Barrat and R. Pastor-Satorras. Rate equation approach for correlations in growing network models. Phys. Rev. E, 71: 036127, 2005.

[16] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks. Adv. Phys., 51: 1079, 2002.

[17] J. Berg and M. Lässig. Correlated random networks. Phys. Rev. Lett., 89: 228701, 2002.

[18] M. Boguñá and R. Pastor-Satorras. Class of correlated random networks with hidden variables. Phys. Rev. E, 68: 036112, 2003.

[19] R. Xulvi-Brunet and I. M. Sokolov. Reshuffling scale-free networks: From random to assortative. Phys. Rev. E, 70: 066102, 2004.

[20] R. Xulvi-Brunet and I. M. Sokolov. Changing correlations in networks: assortativity and dissortativity. Acta Phys. Pol. B, 36: 1431, 2005.

[21] J. Menche, A. Valleriani, and R. Lipowsky. Asymptotic properties of degree-correlated scale-free networks. Phys. Rev. E, 81: 046103, 2010.

[22] V. M. Eguíluz and K. Klemm. Epidemic threshold in structured scale-free networks. Phys. Rev. Lett., 89:108701, 2002.

[23] M. Boguñá and R. Pastor-Satorras. Epidemic spreading in correlate complex networks. Phys. Rev. E, 66: 047104, 2002.

[24] M. Boguñá, R. Pastor-Satorras, and A. Vespignani. Absence of epidemic threshold in scale-free networks with degree correlations. Phys. Rev. Lett., 90: 028701, 2003.

[25] A. Vázquez and Y. Moreno. Resilience to damage of graphs with degree correlations. Phys. Rev. E, 67: 015101R, 2003.

[26] S.J. Wang, A.C. Wu, Z.X. Wu, X.J. Xu, and Y.H. Wang. Response of degree-correlated scale-free networks to stimuli. Phys. Rev. E, 75: 046113, 2007.

[27] F. Sorrentino, M. Di Bernardo, G. Cuellar, and S. Boccaletti. Synchronization in weighted scale-free networks with degree–degree correlation. Physica D, 224: 123, 2006.

[28] M. Di Bernardo, F. Garofalo, and F. Sorrentino. Effects of degree correlation on the synchronization of networks of oscillators. Int. J. Bifurcation Chaos Appl. Sci. Eng., 17: 3499, 2007.

[29] A. Vazquez and M. Weigt. Computational complexity arising from degree correlations in networks. Phys. Rev. E, 67: 027101, 2003.

[30] M. Posfai, Y Y. Liu, J-J Slotine, and A.-L. Barabási. Effect of correlations on network controllability. Scientific Reports, 3: 1067, 2013.

[31] M. Weigt and A. K. Hartmann. The number of guards needed by a museum: A phase transition in vertex covering of random graphs. Phys. Rev. Lett., 84: 6118, 2000.

[32] S. Maslov and K. Sneppen. Specificity and stability in topology of protein networks. Science, 296: 910–913, 2002.

[33] L. Adamic and N. Glance. The political blogosphere and the 2004 U.S. election: Divided they blog (2005).

[34] J. Park and M. E. J. Newman. The origin of degree correlations in the Internet and other networks. Phys. Rev. E, 66: 026112, 2003.

[35] F. Chung and L. Lu. Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 6: 125, 2002.

[36] Z. Burda and Z. Krzywicki. Uncorrelated random networks. Phys. Rev. E, 67: 046118, 2003.