Section 7.1
Introduction

Angelina Jolie and Brad Pitt, Ben Affleck and Jennifer Garner, Harrison Ford and Calista Flockhart, Michael Douglas and Catherine Zeta-Jones, Tom Cruise and Katie Holmes, Richard Gere and Cindy Crawford (Image 7.1). An odd list, yet instantly recognizable to those immersed in the headline-driven world of celebrity couples. They are Hollywood stars who are or were married. Their weddings (and breakups) have drawn countless hours of media coverage and sold millions of gossip magazines. Thanks to them we take for granted that celebrities marry each other. We rarely pause to ask: Is this normal? In other words, what is the true chance that a celebrity marries another celebrity?

Hubs Dating Hubs. — Image 7.1
Celebrity couples, representing a highly visible proof that in social networks hubs tend to know, date and marry each other (Images from http://www.whosdatedwho.com).

Assuming that a celebrity could date anyone from a pool of about a hundred million (10⁸) eligible individuals worldwide, the chance that their mate would be another celebrity from a generous list of 1,000 other celebrities is only 10⁻⁵. Therefore, if dating were driven by random encounters, celebrities would almost never marry each other.

Even if we do not care about the dating habits of celebrities, we must pause and explore what this phenomenon tells us about the structure of the social network. Celebrities, political leaders, and CEOs of major corporations tend to know an exceptionally large number of individuals and are known by even more. They are hubs. Hence celebrity dating (Image 7.1) and joint board memberships are manifestations of an interesting property of social networks: hubs tend to have ties to other hubs.

As obvious as this may sound, this property is not present in all networks. Consider for example the protein-interaction network of yeast, shown in Image 7.2. A quick inspection of the network reveals its scale-free nature: numerous one- and two-degree proteins coexist with a few highly connected hubs. These hubs, however, tend to avoid linking to each other. They link instead to many small-degree nodes, generating a hub-and-spoke pattern. This is particularly obvious for the two hubs highlighted in Image 7.2: they almost exclusively interact with small-degree proteins.

Hubs Avoiding Hubs. — Image 7.2

The protein interaction map of yeast. Each node corresponds to a protein and two proteins are linked if there is experimental evidence that they can bind to each other in the cell. We highlighted the two largest hubs, with degrees k = 56 and k′ = 13. They both connect to many small degree nodes and avoid linking to each other.

The network has N = 1,870 proteins and L = 2,277 links, representing one of the earliest protein interaction maps [1, 2]. Only the largest component is shown. Note that the protein interaction network of yeast in Table 4.1 represents a later map, hence it contains more nodes and links than the network shown in this figure. Node color corresponds to the essentiality of each protein: the removal of the red nodes kills the organism, hence they are called lethal or essential proteins. In contrast, the organism can survive without one of its green nodes. After [3].

A brief calculation illustrates how unusual this pattern is. Let us assume that each node chooses the nodes it connects to at random. The probability that nodes with degrees k and k′ link to each other is then

$p_{k,k'} = \frac{{kk'}}{{2L}} \hspace{20 mm} (7 . 1)$

Equation (7.1) tells us that hubs, by virtue of the many links they have, are much more likely to connect to each other than to small-degree nodes. Indeed, if k and k′ are large, so is p_{k,k′}. Consequently, the likelihood that the hubs with degrees k = 56 and k′ = 13 have a direct link between them is p_{56,13} = 0.16, which is about 400 times larger than p_{1,2} = 0.0004, the likelihood that a degree-two node links to a degree-one node. Yet, there are no direct links between the hubs in Image 7.2, while we observe numerous direct links between small-degree nodes.
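The numbers quoted for the yeast network can be reproduced directly from (7.1). The following is a minimal sketch; the function name `link_probability` is our own and is not part of the original text.

```python
def link_probability(k, k_prime, num_links):
    """Probability (7.1) that nodes with degrees k and k' are linked under random wiring."""
    return k * k_prime / (2 * num_links)

L = 2277  # number of links in the yeast protein interaction map of Image 7.2

p_hubs = link_probability(56, 13, L)   # the two highlighted hubs
p_small = link_probability(1, 2, L)    # a degree-one and a degree-two node

print(f"p_56,13 = {p_hubs:.2f}")    # 0.16
print(f"p_1,2   = {p_small:.4f}")   # 0.0004
print(f"ratio   = {p_hubs / p_small:.0f}")
```

The unrounded ratio is simply kk′/2 = 364; the factor of roughly 400 follows when the two probabilities are first rounded to 0.16 and 0.0004.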

Instead of linking to each other, the hubs highlighted in Image 7.2 almost exclusively connect to degree one nodes. By itself this is not unexpected: We expect that a hub with degree k = 56 should link to N₁p_{1, 56} ≈ 12 nodes with k = 1. The problem is that this hub connects to 46 degree one neighbors, i.e. four times the expected number.

In summary, while in social networks hubs tend to “date” each other, in the protein interaction network the opposite is true: The hubs avoid linking to other hubs, connecting instead to many small-degree nodes. While it is dangerous to derive generic principles from two examples, the purpose of this chapter is to show that these patterns are manifestations of a general property of real networks: they exhibit a phenomenon called degree correlations. We discuss how to measure degree correlations and explore their impact on the network topology.

Section 7.2
Assortativity and Disassortativity

By virtue of the many links they have, hubs are expected to link to each other. In some networks they do, in others they don’t. This is illustrated in Image 7.3, which shows three networks with identical degree sequences but different topologies:

Neutral Network
Image 7.3b shows a network whose wiring is random. We call this network neutral, meaning that the number of links between the hubs coincides with what we expect by chance, as predicted by (7.1).
Assortative Network
The network of Image 7.3a has precisely the same degree sequence as the one in Image 7.3b. Yet, the hubs in Image 7.3a tend to link to each other and avoid linking to small-degree nodes. At the same time the small-degree nodes tend to connect to other small-degree nodes. Networks displaying such trends are assortative. An extreme manifestation of this pattern is a perfectly assortative network, in which each degree-k node connects only to other degree-k nodes (Image 7.4).
Disassortative Network
In Image 7.3c the hubs avoid each other, linking instead to small-degree nodes. Consequently the network displays a hub-and-spoke character, making it disassortative.

In general a network displays degree correlations if the number of links between the high and low-degree nodes is systematically different from what is expected by chance. In other words, the number of links between nodes of degrees k and k′ deviates from (7.1).

The information about potential degree correlations is captured by the degree correlation matrix, e_ij, which is the probability of finding nodes with degrees i and j at the two ends of a randomly selected link. As e_ij is a probability, it is normalized, i.e.

$\sum\limits_{i,j} {e_{ij} } = 1 \hspace{20 mm} (7 . 2)$

In (5.27) we derived the probability q_k that there is a degree-k node at the end of the randomly selected link, obtaining

$q_k = \frac{{kp_k }}{{\left\langle k \right\rangle }} \hspace{20 mm} (7 . 3)$

We can connect q_k to e_ij via

$\sum\limits_j {e_{ij} = q_i } \hspace{20 mm} (7 . 4)$

In neutral networks, we expect

$e_{ij} = q_i q_j \hspace{20 mm} (7 . 5)$

A network displays degree correlations if e_ij deviates from the random expectation (7.5). Note that (7.2) - (7.5) are valid for networks with an arbitrary degree distribution, hence they apply to both random and scale-free networks.
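To make (7.2)-(7.4) concrete, the sketch below builds e_ij and q_k from an edge list and checks the normalization and the marginal relation numerically. It assumes an undirected simple network given as a list of node pairs; the function name and the toy network are hypothetical and serve only as an illustration.

```python
from collections import Counter

def degree_correlation_matrix(edges):
    """Return (e, q): e[(i, j)] as in (7.2) and q[k] as in (7.3) for an undirected network."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    weight = 1.0 / (2 * len(edges))  # each link is counted once in each direction
    e = Counter()
    for u, v in edges:
        e[(degree[u], degree[v])] += weight
        e[(degree[v], degree[u])] += weight
    # q_k = k p_k / <k>, computed directly from the degree sequence
    total_degree = sum(degree.values())
    q = Counter()
    for k in degree.values():
        q[k] += k / total_degree
    return dict(e), dict(q)

# Hypothetical toy network: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
e, q = degree_correlation_matrix(edges)

# Check the normalization (7.2) and the marginal relation (7.4).
assert abs(sum(e.values()) - 1.0) < 1e-12
for i in q:
    row_sum = sum(p for (a, _), p in e.items() if a == i)
    assert abs(row_sum - q[i]) < 1e-12
```

Comparing each e[(i, j)] with the product q[i] * q[j] then gives a direct, if verbose, test of the neutral expectation (7.5).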

Perfect Assortativity. — Image 7.4
In a perfectly assortative network each node links only to nodes with the same degree. Hence *e_jk* = *q_k δ_jk*, where *δ_jk* is the Kronecker delta. In this case all non-diagonal elements of the *e_jk* matrix are zero. The figure shows such a perfectly assortative network, consisting of complete k-cliques.

Given that e_ij encodes all information about potential degree correlations, we start with its visual inspection. Images 7.3d,e,f show e_ij for an assortative, a neutral and a disassortative network. In a neutral network small and high-degree nodes connect to each other randomly, hence e_ij lacks any trend (Image 7.3e). In contrast, assortative networks show high correlations along the main diagonal, indicating that nodes predominantly connect to nodes with comparable degree. Therefore low-degree nodes tend to link to other low-degree nodes and hubs to hubs (Image 7.3d). In disassortative networks e_ij displays the opposite trend: it has high correlations along the secondary diagonal. Therefore high-degree nodes tend to connect to low-degree nodes (Image 7.3f).

In summary, information about degree correlations is carried by the degree correlation matrix e_ij. Yet, the study of degree correlations through the inspection of e_ij has numerous disadvantages:

It is difficult to extract information from the visual inspection of a matrix.
The matrix does not provide a single measure of the magnitude of the correlations, making it difficult to compare networks with different correlations.
e_ij contains approximately k_max²/2 independent variables, representing a huge amount of information that is difficult to model in analytical calculations and simulations.

We therefore need to develop a more compact way to detect degree correlations. This is the goal of the subsequent sections.

Section 7.3
Measuring Degree Correlations

While e_ij contains the complete information about the degree correlations characterizing a particular network, it is difficult to interpret its content. In this section we introduce the degree correlation function, which offers a simpler way to quantify degree correlations.

Degree correlations capture the relationship between the degrees of nodes that link to each other. One way to quantify their magnitude is to measure for each node i the average degree of its neighbors (Image 7.5)

$k_{nn} (k_i ) = \frac{1}{{k_i }}\sum\limits_{j = 1}^N {A_{ij} k_j } \hspace{20 mm} (7 . 6)$

The degree correlation function calculates (7.6) for all nodes with degree k [4, 5]

$k_{nn} (k) = \sum\limits_{k'} {k'} P(k'|k) \hspace{20 mm} (7 . 7)$

where P(k’|k) is the conditional probability that following a link of a degree-k node we reach a degree-k' node. Therefore k_nn(k) is the average degree of the neighbors of all degree-k nodes. To quantify degree correlations we inspect the dependence of k_nn(k) on k.

Neutral Network
For a neutral network (7.3)-(7.5) predict $P(k'|k) = \frac{{e_{kk'} }}{{\sum\limits_{k'} {e_{kk'} } }} = \frac{{e_{kk'} }}{{q_k }} = \frac{{q_{k'} q_k }}{{q_k }} = q_{k'} \hspace{20 mm} (7 . 8)$ This allows us to express k_nn(k) as $k_{nn} (k) = \sum\limits_{k'} {k'q_{k'} } = \sum\limits_{k'} {k'\frac{{k'p_{k'} }}{{\left\langle k \right\rangle }}} = \frac{{\left\langle {k^2 } \right\rangle }}{{\left\langle k \right\rangle }} \hspace{20 mm} (7 . 9)$ Therefore, in a neutral network the average degree of a node’s neighbors is independent of the node’s degree k and depends only on the global network characteristics 〈k〉 and 〈k²〉. So plotting k_nn(k) as a function of k should result in a horizontal line at 〈k²〉 / 〈k〉, as observed for the power grid (Image 7.6b). Equation (7.9) also captures an intriguing property of real networks: our friends are more popular than we are, a phenomenon called the friendship paradox (BOX 7.1).
Assortative Network
In assortative networks hubs tend to connect to other hubs, hence the higher the degree k of a node, the higher the average degree of its nearest neighbors. Consequently for assortative networks k_nn(k) increases with k, as observed for scientific collaboration networks (Image 7.6a).
Disassortative Network
In disassortative networks hubs prefer to link to low-degree nodes. Consequently k_nn(k) decreases with k, as observed for the metabolic network (Image 7.6c).
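The degree correlation function can be measured from an edge list by averaging (7.6) over all nodes that share the same degree. The sketch below assumes an undirected simple network without isolated nodes; the function names and the toy star network are our own illustrative choices.

```python
from collections import defaultdict

def degree_correlation_function(edges):
    """Return {k: k_nn(k)}, the average of (7.6) over all nodes with degree k."""
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}

    per_degree = defaultdict(list)
    for node, nbrs in neighbors.items():
        # (7.6): average degree of the neighbors of this node
        knn_node = sum(degree[j] for j in nbrs) / degree[node]
        per_degree[degree[node]].append(knn_node)
    return {k: sum(vals) / len(vals) for k, vals in sorted(per_degree.items())}

def neutral_expectation(edges):
    """Return <k^2>/<k>, the k-independent value that (7.9) predicts for a neutral network."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    ks = list(degree.values())
    return sum(k * k for k in ks) / sum(ks)

# Hypothetical toy network: a star with one hub and five leaves (strongly disassortative).
star = [(0, leaf) for leaf in range(1, 6)]
print(degree_correlation_function(star))  # {1: 5.0, 5: 1.0}
print(neutral_expectation(star))          # 3.0
```

For the star, k_nn(k) drops from 5 at k = 1 to 1 at k = 5, the decreasing trend that signals disassortativity, while a neutral network with the same degrees would sit at the constant value 3.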

The behavior observed in Image 7.6 prompts us to approximate the degree correlation function with [4]

$k_{nn} (k) = ak^\mu \hspace{20 mm} (7 . 10)$

If the scaling (7.10) holds, then the nature of degree correlations is determined by the sign of the correlation exponent μ:

Assortative Networks: μ > 0
A fit to k_nn(k) for the science collaboration network provides μ = 0.37 ± 0.11 (Image 7.6a).
Neutral Networks: μ = 0
According to (7.9) k_nn(k) is independent of k. Indeed, for the power grid we obtain μ = 0.04 ± 0.05, which is indistinguishable from zero (Image 7.6b).
Disassortative Networks: μ < 0
For the metabolic network we obtain μ = − 0.76 ± 0.04 (Image 7.6c).
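One simple way to estimate the correlation exponent μ of (7.10) is a linear least-squares fit in log-log coordinates. The text does not specify how the quoted values and their uncertainties were obtained, so the unweighted fit below is an assumption made for illustration, and the data in the example are synthetic rather than measured.

```python
import math

def fit_correlation_exponent(knn):
    """Least-squares fit of log k_nn(k) = log a + mu * log k; returns (a, mu)."""
    points = [(math.log(k), math.log(v)) for k, v in knn.items() if k > 0 and v > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    mu = sxy / sxx                       # slope in log-log coordinates
    a = math.exp(mean_y - mu * mean_x)   # prefactor of (7.10)
    return a, mu

# Synthetic check (not real data): points that follow k_nn(k) = 20 k^(-0.5) exactly.
synthetic = {k: 20 * k ** -0.5 for k in (1, 2, 4, 8, 16, 32)}
a, mu = fit_correlation_exponent(synthetic)
print(round(a, 6), round(mu, 6))
```

A positive fitted μ points to assortative mixing, a negative one to disassortative mixing, and a value compatible with zero to a neutral network.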

Box 7.1
Friendship Paradox

The friendship paradox makes a surprising statement: On average my friends are more popular than I am [6,7]. This claim is rooted in (7.9), telling us that the average degree of a node’s neighbors is not simply 〈k〉, but depends on 〈k²〉 as well.

Consider a random network, for which 〈k²〉 = 〈k〉(1 + 〈k〉). According to (7.9) k_nn(k) = 1+〈k〉. Therefore the average degree of a node’s neighbors is always higher than the average degree of a randomly chosen node, which is 〈k〉.

The gap between 〈k〉 and our friends’ degree can be particularly large in scale-free networks, for which 〈k²〉 / 〈k〉 significantly exceeds 〈k〉 (Image 4.8). Consider for example the actor network, for which 〈k²〉 / 〈k〉 = 565 (Table 4.1). In this network the average degree of a node's friends is far larger than the degree of a typical node.

The friendship paradox has a simple origin: We are more likely to be friends with hubs than with small-degree nodes, simply because hubs have more friends than the small nodes.
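The paradox can be checked on any degree sequence, because 〈k²〉 / 〈k〉 − 〈k〉 equals the variance of the degrees divided by 〈k〉, which is never negative. The sketch below uses a hypothetical degree sequence chosen only for illustration.

```python
def friendship_paradox(degrees):
    """Return (<k>, <k^2>/<k>): mean degree of a random node vs. of a node reached via a random link."""
    mean_k = sum(degrees) / len(degrees)
    friend_mean = sum(k * k for k in degrees) / sum(degrees)
    return mean_k, friend_mean

# Hypothetical degree sequence: one hub with nine friends, each of whom has a single friend.
mean_k, friend_mean = friendship_paradox([9] + [1] * 9)
print(mean_k, friend_mean)  # 1.8 5.0
```

Only when every node has the same degree do the two averages coincide.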

In summary, the degree correlation function helps us capture the presence or absence of correlations in real networks. The k_nn(k) function also plays an important role in analytical calculations, allowing us to predict the impact of degree correlations on various network characteristics (SECTION 7.6). Yet, it is often convenient to use a single number to capture the magnitude of correlations present in a network. This can be achieved either through the correlation exponent μ defined in (7.10), or using the degree correlation coefficient introduced in BOX 7.2.

Box 7.2
Degree Correlation Coefficient

If we wish to characterize degree correlations using a single number, we can use either μ or the degree correlation coefficient. Proposed by Mark Newman [8,9], the degree correlation coefficient is defined as

$r = \sum\limits_{jk} {\frac{{jk(e_{jk} - q_j q_k )}}{{\sigma ^2 }}} \hspace{20 mm} (7 . 11)$

with

$\sigma ^2 = \sum\limits_k {k^2 q_k - \left[ {\sum\limits_k {kq_k } } \right]} ^2 \hspace{20 mm} (7 . 12)$

Hence r is the Pearson correlation coefficient between the degrees found at the two ends of the same link. It varies between −1 ≤ r ≤ 1: For r > 0 the network is assortative, for r = 0 the network is neutral and for r < 0 the network is disassortative. For example, for the scientific collaboration network we obtain r = 0.13, in line with its assortative nature; for the protein interaction network r = −0.04, supporting its disassortative nature and for the power grid we have r = 0.

The assumption behind the degree correlation coefficient is that k_nn(k) depends linearly on k with slope r. In contrast the correlation exponent μ assumes that k_nn(k) follows the power law (7.10). Naturally, both cannot be valid simultaneously. The analytical models of SECTION 7.7 offer some guidance, supporting the validity of (7.10). As we show in ADVANCED TOPICS 7.A, in general r correlates with μ.
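The coefficient (7.11) can be evaluated without building e_jk explicitly, by treating every link as two ordered pairs of end degrees and computing their Pearson correlation. The following is a minimal sketch for an undirected simple network; the function name and the two toy networks are assumptions made for illustration.

```python
from collections import Counter

def degree_correlation_coefficient(edges):
    """Pearson correlation (7.11) between the degrees at the two ends of a link."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # List every link in both directions so that the (j, k) statistics are symmetric.
    ends = [(degree[u], degree[v]) for u, v in edges]
    ends += [(k, j) for j, k in ends]
    n = len(ends)
    mean_jk = sum(j * k for j, k in ends) / n      # sum_jk j k e_jk
    mean_k = sum(j for j, _ in ends) / n           # sum_k k q_k
    mean_k2 = sum(j * j for j, _ in ends) / n      # sum_k k^2 q_k
    variance = mean_k2 - mean_k ** 2               # sigma^2 of (7.12)
    if variance == 0:
        raise ValueError("r is undefined when all link ends have the same degree")
    return (mean_jk - mean_k ** 2) / variance

star = [(0, leaf) for leaf in range(1, 5)]     # hub linked only to leaves
cliques = [(0, 1), (1, 2), (0, 2), (3, 4)]     # triangle plus a separate pair
print(degree_correlation_coefficient(star))     # -1.0
print(degree_correlation_coefficient(cliques))  # 1.0
```

The star is the extreme disassortative case, while the disjoint cliques reproduce the perfectly assortative pattern in which each node links only to nodes of its own degree.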

Section 7.4
Structural Cutoffs

Throughout this book we assumed that networks are simple, meaning that there is at most one link between two nodes (Figure 2.17). For example, in the email network we place a single link between two individuals that are in email contact, despite the fact that they may have exchanged multiple messages. Similarly, in the actor network we connect two actors with a single link if they acted in the same movie, independent of the number of joint movies. All datasets discussed in Table 4.1 are simple networks.

In simple networks there is a puzzling conflict between the scale-free property and degree correlations [10, 11]. Consider for example the scale-free network of Image 7.7a, whose two largest hubs have degrees k = 55 and k' = 46. In a network with degree correlations e_kk' the expected number of links between k and k' is

$E_{kk'} = e_{kk'} \left\langle k \right\rangle N \hspace{20 mm} (7 . 13)$

For a neutral network e_kk' is given by (7.5), which, using (7.3), predicts

$E_{kk'} = \frac{{kp_k k'p_{k'} }}{{\left\langle k \right\rangle }}N = \frac{{\frac{{55}}{{300}}\frac{{46}}{{300}}}}{3}300 = 2.8 \hspace{20 mm} (7 . 14)$

Therefore, given the size of these two hubs, they should be connected to each other by two to three links to comply with the network’s neutral nature. Yet, in a simple network we can have only one link between them, causing a conflict between degree correlations and the scale-free property. The goal of this section is to understand the origin and the consequences of this conflict.

For small k and k' (7.14) predicts that E_kk’ is also small, i.e. we expect less than one link between the two nodes. Only for nodes whose degree exceeds some threshold k_s does (7.14) predict multiple links. As we show in ADVANCED TOPICS 7.B, k_s, called the structural cutoff, scales as

$k_s (N) \sim \left( {\left\langle k \right\rangle N} \right)^{1/2} \hspace{20 mm} (7 . 15)$

In other words, nodes whose degree exceeds (7.15) have E_kk’ > 1, a conflict that, as we show below, gives rise to degree correlations.
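The arithmetic of (7.14) and the scale set by (7.15) can be reproduced for the network of Image 7.7. In the sketch below the proportionality constant in (7.15) is taken to be 1, which is an assumption since the text only gives the scaling; the function names are ours.

```python
import math

def expected_links_neutral(k, k_prime, mean_degree, n_nodes, count_k=1, count_k_prime=1):
    """E_kk' of (7.14) for a neutral network; count_* are the numbers of nodes with each degree."""
    p_k = count_k / n_nodes
    p_k_prime = count_k_prime / n_nodes
    return k * p_k * k_prime * p_k_prime / mean_degree * n_nodes

def structural_cutoff(mean_degree, n_nodes):
    """k_s of (7.15), with the unknown proportionality constant set to 1 (an assumption)."""
    return math.sqrt(mean_degree * n_nodes)

# Network of Image 7.7: N = 300, L = 450, hence <k> = 2L/N = 3.
N, L = 300, 450
mean_degree = 2 * L / N
print(round(expected_links_neutral(55, 46, mean_degree, N), 1))  # 2.8
print(structural_cutoff(mean_degree, N))                         # 30.0
```

Both hubs, with degrees 55 and 46, lie above this estimate of k_s, consistent with the fact that a neutral wiring would require more than one link between them.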

Structural Disassortativity. — Image 7.7

A scale-free network with N=300, L=450, and γ=2.2, generated by the configuration model (Image 4.15). By forbidding self-loops and multi-links, we made the network *simple*. We highlight the two largest nodes in the network. As (7.14) predicts, to maintain the network’s neutral nature, we need approximately three links between these two nodes. The fact that we do not allow multi-links (simple network representation) makes the network disassortative, a phenomenon called *structural disassortativity*.

To illustrate the origins of structural correlations we start from a fixed degree sequence, shown as individual stubs on the left. Next we randomly connect the stubs (configuration model). In this case the expected number of links between the nodes with degree 8 and 7 is 8 × 7/28 = 2. Yet, if we do not allow multi-links, there can only be one link between these two nodes, making the network structurally disassortative.

To understand the consequences of the structural cutoff we must first ask if a network has nodes whose degrees exceed (7.15). For this we compare the structural cutoff, k_s, with the natural cutoff, k_max, which is the expected largest degree in a network. According to (4.18), for a scale-free network k_max ~ N^{1/(γ−1)}. Comparing k_max to k_s allows us to distinguish two regimes:

No Structural Cutoff
For random networks and scale-free networks with γ ≥ 3 the exponent of k_max is at most 1/2, hence k_max remains smaller than k_s. In other words the node size at which the structural cutoff turns on exceeds the size of the biggest hub. Consequently we have no nodes for which E_kk’ > 1. For these networks we do not have a conflict between degree correlations and the simple network requirement.
Structural Disassortativity
For scale-free networks with γ < 3 we have 1/(γ−1) > 1/2, i.e. k_s can be smaller than k_max. Consequently nodes whose degree is between k_s and k_max can violate the simple-network requirement E_kk’ ≤ 1. In other words the network has fewer links between its hubs than (7.14) would predict. These networks will therefore become disassortative, a phenomenon we call structural disassortativity. This is illustrated in Image 7.8a,b, which show a simple scale-free network generated by the configuration model. The network shows disassortative scaling, despite the fact that we did not impose degree correlations during its construction.
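The two regimes follow from comparing the growth exponent of the natural cutoff, 1/(γ−1), with the exponent 1/2 of the structural cutoff. The sketch below compares only these exponents and ignores all prefactors, which is a simplification; the function names are hypothetical.

```python
def natural_cutoff_exponent(gamma):
    """Exponent of N in k_max ~ N^(1/(gamma-1)) for a scale-free network."""
    return 1.0 / (gamma - 1.0)

def has_structural_disassortativity(gamma):
    """True if k_max grows faster with N than k_s ~ N^(1/2), i.e. if gamma < 3."""
    return natural_cutoff_exponent(gamma) > 0.5

for gamma in (2.2, 2.5, 3.0, 3.5):
    exponent = natural_cutoff_exponent(gamma)
    print(gamma, round(exponent, 3), has_structural_disassortativity(gamma))
```

For finite networks the prefactors matter as well, so this comparison indicates only which regime a given degree exponent belongs to asymptotically.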

Natural and Structural Cutoffs. — Image 7.8
The figure illustrates the tension between the scale-free property and degree correlations. We show the degree distribution (left panels) and the degree correlation function *k_nn(k)* (right panels) of a scale-free network with N = 10,000 and γ = 2.5, generated by the configuration model (Image 4.15).

**(a,b)** If we generate a scale-free network with the power-law degree distribution shown in (a), and we forbid self-loops and multilinks, the network displays structural disassortativity, as indicated by *k_nn(k)* in (b). In this case, we lack a sufficient number of links between the high-degree nodes to maintain the neutral nature of the network, hence for high k the *k_nn(k)* function must decay.

**(c,d)** We can eliminate structural disassortativity by relaxing the simple network requirement, i.e. allowing multiple links between two nodes. As shown in (c,d), in this case we obtain a neutral scale-free network.

**(e,f)** If we impose an upper cutoff by removing all nodes with k ≥ *k_s* ≃ 100, as predicted by (7.15), the network becomes neutral, as seen in (f).

We have two avenues to generate networks that are free of structural disassortativity:

We can relax the simple network requirement, allowing multiple links between the nodes. The conflict disappears and the network will be neutral (Image 7.8c,d).
If we insist on having a simple scale-free network that is neutral or assortative, we must remove all hubs with degrees larger than k_s. This is illustrated in Image 7.8e,f: a network that lacks nodes with k ≥ 100 is neutral.

Finally, how can we decide whether the correlations observed in a particular network are a consequence of structural disassortativity, or are generated by some unknown process that leads to degree correlations? Degree-preserving randomization (Image 4.17) helps us distinguish these two possibilities:

Degree Preserving Randomization with Simple Links (R-S)
We apply degree-preserving randomization to the original network and at each step we make sure that we do not permit more than one link between a pair of nodes. On the algorithmic side this means that each rewiring that generates multi-links is discarded. If the real k_nn(k) and the randomized k_nn^R−S(k) are indistinguishable, then the correlations observed in the real system are all structural, fully explained by the degree distribution. If the randomized k_nn^R−S(k) does not show degree correlations while k_nn(k) does, there is some unknown process that generates the observed degree correlations.
Degree Preserving Randomization with Multiple Links (R-M)
For a self-consistency check it is sometimes useful to perform degree-preserving randomization that allows for multiple links between the nodes. On the algorithmic side this means that we allow each random rewiring, even if it leads to multi-links. This process eliminates all degree correlations.
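Both randomizations can be sketched with the same link-swap routine: pick two links at random, exchange their endpoints, and in the R-S case discard any swap that would create a self-loop or a multi-link. The code below is a minimal illustration under our own assumptions: the function name, the number of swap attempts, and the choice to also permit self-loops in R-M (the text only says that every rewiring is allowed) are not taken from the original.

```python
import random
from collections import Counter

def degree_preserving_randomization(edges, n_swaps, allow_multilinks, seed=None):
    """Rewire an undirected edge list by random link swaps that keep every node's degree.

    allow_multilinks=False corresponds to R-S, allow_multilinks=True to R-M.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = Counter(frozenset(e) for e in edges)  # multiplicity of each node pair
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # pick one of the two possible re-pairings at random
            c, d = d, c
        new_1, new_2 = (a, d), (c, b)
        if not allow_multilinks:
            creates_self_loop = a == d or c == b
            creates_multilink = present[frozenset(new_1)] > 0 or present[frozenset(new_2)] > 0
            if creates_self_loop or creates_multilink:
                continue  # R-S: discard this rewiring
        for old in (edges[i], edges[j]):
            present[frozenset(old)] -= 1
        for new in (new_1, new_2):
            present[frozenset(new)] += 1
        edges[i], edges[j] = new_1, new_2
    return edges

# Hypothetical toy network: a ring of ten nodes with two chords.
ring = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
randomized_simple = degree_preserving_randomization(ring, 200, allow_multilinks=False, seed=1)
randomized_multi = degree_preserving_randomization(ring, 200, allow_multilinks=True, seed=1)
```

Measuring k_nn(k) on the two outputs and comparing it with the original function gives the k_nn^R−S(k) and k_nn^R−M(k) curves discussed here; in practice the number of swap attempts should be several times the number of links.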

We performed the randomizations discussed above for three real networks. As Image 7.9a shows, the assortative nature of the scientific collaboration network disappears under both randomizations. This indicates that the assortative correlations of the collaboration network are not linked to its scale-free nature. In contrast, for the metabolic network the observed disassortativity remains unchanged under R-S (Image 7.9c). Consequently the disassortativity of the metabolic network is structural, being induced by its degree distribution.

Randomization and Degree Correlations. — Image 7.9
To uncover the origin of the observed degree correlations, we must compare *k_nn(k)* (grey symbols) with *k_nn^R-S(k)* and *k_nn^R-M(k)* obtained after degree-preserving randomization. Two degree-preserving randomizations are informative in this context:

**Randomization with Simple Links (R-S):**
At each step of the randomization process we check that we do not have more than one link between any pair of nodes.

**Randomization with Multiple Links (R-M):**
We allow multi-links during the randomization processes.

We performed these two randomizations for the networks of Image 7.6. The *R-M* procedure always generates a neutral network, consequently *k_nn^R-M(k)* is always horizontal. The true insight is obtained when we compare *k_nn(k)* with *k_nn^R-S(k)*, helping us to decide if the observed correlations are structural:

**Scientific Collaboration Network**
The increasing *k_nn(k)* differs from the horizontal *k_nn^R-S(k)*, indicating that the network’s assortativity is not structural. Consequently the assortativity is generated by some process that governs the network’s evolution. This is not unexpected: structural effects can generate only disassortativity, not assortativity.

**Power Grid**
The horizontal *k_nn(k)*, *k_nn^R-S(k)* and *k_nn^R-M(k)* all support the lack of degree correlations (neutral network).

**Metabolic Network**
As both *k_nn(k)* and *k_nn^R-S(k)* decrease, we conclude that the network’s disassortativity is induced by its scale-free property. Hence the observed degree correlations are structural.

In summary, the scale-free property can induce disassortativity in simple networks. Indeed, in neutral or assortative networks we expect multiple links between the hubs. If multiple links are forbidden (simple graph), the network will display disassortative tendencies. This conflict vanishes for scale-free networks with γ ≥ 3 and for random networks. It also vanishes if we allow multiple links between the nodes.

Section 7.5
Correlations in Real Networks

To understand the prevalence of degree correlations we need to inspect the correlations characterizing real networks. In Image 7.10 we show the k_nn(k) function for the ten reference networks, observing several patterns:

Power Grid
For the power grid k_nn(k) is flat and indistinguishable from its randomized version, indicating a lack of degree correlations (Image 7.10a). Hence the power grid is neutral.
Internet
For small degrees (k ≤ 30) k_nn(k) shows a clear assortative trend, an effect that levels off for high degrees (Image 7.10b). The degree correlations vanish in the randomized version of the Internet map. Hence the Internet is assortative, but structural cutoffs eliminate the effect for high k.
Social Networks
The three networks capturing social interactions, the mobile phone network, the science collaboration network and the actor network, all have an increasing k_nn(k), indicating that they are assortative (Image 7.10c-e). Hence in these networks hubs tend to link to other hubs and low-degree nodes tend to link to low-degree nodes. The fact that the observed k_nn(k) differs from k_nn^R-S(k) indicates that the assortative nature of social networks is not due to their scale-free degree distribution.
Email Network
While the email network is often seen as a social network, its k_nn(k) decreases with k, documenting a clear disassortative behavior (Image 7.10f). The randomized k_nn^R-S(k) also decays, indicating that we are observing structural disassortativity, a consequence of the network’s scale-free nature.
Biological Networks
The protein interaction and the metabolic network both have a negative μ, suggesting that these networks are disassortative. Yet, the scaling of k_nn^R-S(k) is indistinguishable from k_nn(k), indicating that we are observing structural disassortativity, rooted in the scale-free nature of these networks (Image 7.10 g,h).
WWW
The decaying k_nn(k) implies disassortative correlations (Image 7.10i). The randomized k_nn^R-S(k) also decays, but not as rapidly as k_nn(k). Hence the disassortative nature of the WWW is not fully explained by its degree distribution.
Citation Network
This network displays a puzzling behavior: for k ≤ 20 the degree correlation function k_nn(k) shows a clear assortative trend; for k > 20, however, we observe disassortative scaling (Image 7.10j). Such mixed behavior can emerge in networks that display extreme assortativity (Image 7.13b). This suggests that the citation network is strongly assortative, but its scale-free nature induces structural disassortativity, changing the slope of k_nn(k) for k ≫ k_s.

In summary, Image 7.10 indicates that to understand degree correlations, we must always compare k_nn(k) to the degree randomized k_nn^R-S(k). It also allows us to draw some interesting conclusions:

Of the ten reference networks the power grid is the only truly neutral network. Hence most real networks display degree correlations.
All networks that display disassortative tendencies (email, protein, metabolic) do so thanks to their scale-free property. Hence, these are all structurally disassortative. Only the WWW shows disassortative correlations that are only partially explained by its degree distribution.
The degree correlations characterizing assortative networks are not explained by their degree distribution. Most social networks (mobile phone calls, scientific collaboration, actor network) are in this class, and so are the Internet and the citation network.

A number of mechanisms have been proposed to explain the origin of the observed assortativity. For example, the tendency of individuals to form communities, the topic of CHAPTER 9, can induce assortative correlations [12]. Similarly, society has endless mechanisms, from professional committees to TV shows, to bring hubs together, enhancing the assortative nature of social and professional networks. Finally, homophily, a well-documented social phenomenon [13], indicates that individuals tend to associate with other individuals of similar background and characteristics, hence individuals with comparable degree tend to know each other. This degree-homophily may be responsible for the celebrity marriages as well (Image 7.1).

Box 7.3
Correlations in Directed Networks

The degree correlation function (7.7) is defined for undirected networks. To measure correlations in directed networks we must take into account that each node i is characterized by an incoming k_i^in and an outgoing k_i^out degree [14]. We therefore define four degree correlation functions, k_nn^α,β(k), where α and β refer to the in and out indices (Image 7.11 a-d). In Image 7.11e we show k_nn^α,β(k) for citation networks, indicating a lack of in-out correlations and the presence of assortativity for small k for the other three correlations (in-in, out-in, out-out).

Image 7.11
Correlations in Directed Networks

(a)-(d) The four possible correlations characterizing a directed network. We show in purple and green the (α, β) indices that define the appropriate correlation function [14]. For example, (a) describes the k_nn^in,in(k) correlations between the in-degrees of two nodes connected by a link.

(e) The k_nn^α,β(k) correlation function for citation networks, a directed network. For example k_nn^in,in(k) is the average indegree of the in-neighbors of nodes with in-degree k_in. These functions show a clear assortative tendency for three of the four functions up to degree k ≃ 100. The empty symbols capture the degree randomized k_nn^α,β(k) for each degree correlation function (R-S randomization).
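The four directed correlation functions can be sketched with one routine that is told which degree type to read at each end of a link. The convention used below is an assumption on our part: α is read at the source of each directed link and β at its target; swapping the two roles in the code gives the opposite convention. The function name and the toy network are hypothetical.

```python
from collections import Counter, defaultdict

def directed_degree_correlation(edges, alpha, beta):
    """Average beta-degree at the target of links whose source has alpha-degree k.

    edges are (source, target) pairs; alpha and beta are "in" or "out".
    Assumed convention: alpha refers to the source, beta to the target of each link.
    """
    deg = {"in": Counter(), "out": Counter()}
    for s, t in edges:
        deg["out"][s] += 1
        deg["in"][t] += 1
    sums, counts = defaultdict(float), Counter()
    for s, t in edges:
        k = deg[alpha][s]
        sums[k] += deg[beta][t]
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}

# Hypothetical toy network: nodes 0 and 1 point to node 2, which points to node 3.
toy = [(0, 2), (1, 2), (2, 3)]
print(directed_degree_correlation(toy, "in", "in"))  # {0: 2.0, 2: 1.0}
```

Calling the function with the four combinations of "in" and "out" yields one curve per panel of the kind shown in Image 7.11e.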

Section 7.6
Generating Correlated Networks

To explore the impact of degree correlations on various network characteristics we must first understand the correlations characterizing the network models discussed thus far. It is equally important to develop algorithms that can generate networks with tunable correlations. As we show in this section, given the conflict between the scale-free property and degree correlations, this is not a trivial task.

Degree Correlations in Static Models

Erdős-Rényi Model
The random network model is neutral by definition. As it lacks hubs, it does not develop structural correlations either. Hence for the Erdős-Rényi network k_nn(k) is given by (7.9), predicting μ = 0 for any 〈k〉 and N.

Configuration Model
The configuration model (Image 4.15) is also neutral, independent of the choice of the degree distribution p_k. This is because the model allows for both multi-links and self-loops. Consequently, any conflicts caused by the hubs are resolved by the multiple links between them. If, however, we force the network to be simple, then the generated network will develop structural disassortativity (Image 7.8).

Hidden Parameter Model
In the hidden parameter model e_jk is proportional to the product of the randomly chosen hidden variables η_j and η_k (Image 4.18). Consequently the network is technically uncorrelated. However, if we do not allow multi-links, for scale-free networks we again observe structural disassortativity. Analytical calculations indicate that in this case [18]

$k_{nn} (k) \sim k^{ - 1} \hspace{20 mm} (7 . 16)$

i.e. the degree correlation function follows (7.10) with μ = − 1.

Taken together, the static models explored so far generate either neutral networks, or networks characterized by structural disassortativity following (7.16).

Degree Correlations in Evolving Networks

To understand the emergence (or the absence) of degree correlations in growing networks, we start with the initial attractiveness model (SECTION 6.5), which includes as a special case the Barabási-Albert model.

Initial Attractiveness Model
Consider a growing network in which preferential attachment follows (6.23), i.e. Π(k) ~ A + k, where A is the initial attractiveness. Depending on the value of A, we observe three distinct scaling regimes [15]:

Disassortative Regime: γ < 3
If − m < A < 0 we have $k_{nn} (k) \sim m\frac{{(m + A)^{1 - \frac{A}{m}} }}{{2m + A}}\zeta \left( {\frac{{2m}}{{2m + A}}} \right)N^{\frac{A}{{2m + A}}} k^{\frac{A}{m}} \hspace{20 mm} (7 . 17)$ Hence the resulting network is disassortative, with k_nn(k) decaying as the power law [15, 16] $k_{nn} (k) \sim k^{ - \frac{{|A|}}{m}} \hspace{20 mm} (7 . 18)$
Neutral Regime: γ = 3
If A = 0 the initial attractiveness model reduces to the Barabási-Albert model. In this case $k_{nn} (k) \sim \frac{m}{2}\ln N \hspace{20 mm} (7 . 19)$ Consequently k_nn(k) is independent of k, hence the network is neutral.
Weak Assortativity: γ > 3
If A > 0 the calculations predict $k_{nn} (k) \approx (m + A)\ln \left( {\frac{k}{{m + A}}} \right) \hspace{20 mm} (7 . 20)$ As k_nn(k) increases logarithmically with k, the resulting network displays a weak assortative tendency, but does not follow (7.10).

In summary, (7.17) - (7.20) indicate that the initial attractiveness model generates rather complex degree correlations, from disassortativity to weak assortativity. Equation (7.19) also shows that the network generated by the Barabási-Albert model is neutral. Finally, (7.17) predicts a power law k-dependence for k_nn(k), offering analytical support for the empirical scaling (7.10).
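As a compact check on the three regimes, the sketch below maps a given A and m to the regime and to the exponent of k in k_nn(k). It is only an illustration: the function name is ours, and it assumes the relation γ = 3 + A/m between the degree exponent and the initial attractiveness (SECTION 6.5).

```python
def attractiveness_regime(A, m):
    """Return (regime, gamma, k_exponent) for the initial attractiveness model.

    Assumes gamma = 3 + A/m. k_exponent is the exponent of k in k_nn(k);
    it is None when k_nn(k) is not a power law in k (the logarithmic case).
    """
    if A <= -m:
        raise ValueError("the model requires A > -m")
    gamma = 3.0 + A / m
    if A < 0:
        # k_nn(k) ~ k^(A/m) = k^(-|A|/m), as in (7.18)
        return "disassortative", gamma, A / m
    if A == 0:
        # Barabasi-Albert limit: k_nn(k) is independent of k, as in (7.19)
        return "neutral", gamma, 0.0
    # k_nn(k) grows like ln k, as in (7.20)
    return "weakly assortative", gamma, None
```

For example, A = −1.5 and m = 3 fall in the disassortative regime with γ = 2.5 and k_nn(k) ~ k^−0.5.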

Bianconi-Barabási Model
With a uniform fitness distribution the Bianconi-Barabási model generates a disassortative network [5] (Image 7.12). The fact that the randomized version of the network is also disassortative indicates that the model's disassortativity is structural. Note, however, that the real k_nn(k) and the randomized k_nn^R-S(k) do not overlap, indicating that the disassortativity of the model is not fully explained by its scale-free nature.

Correlations in the Bianconi-Barabási Model. — Image 7.12
Correlations in the Bianconi-Barabási Model
The degree correlation function of the Bianconi-Barabási model for N = 10,000, m = 3 and uniform fitness distribution (SECTION 6.2). As the green dotted line, following (7.10), indicates, the network is disassortative, consistent with μ ≃ -0.5. The orange symbols correspond to *k_nn^R-S(k)*. As *k_nn^R-S(k)* also decreases, the bulk of the observed disassortativity is structural. But the difference between *k_nn^R-S(k)* and *k_nn(k)* suggests that structural effects cannot fully account for the observed degree correlation.

Tuning Degree Correlations

Several algorithms can generate networks with desired degree correlations [8, 17, 18]. Next we discuss a simplified version of the algorithm proposed by Xulvi-Brunet and Sokolov that aims to generate maximally correlated networks with a predefined degree sequence [19, 20, 21]. It consists of the following steps (Image 7.13a):

Step 1: Link Selection
Choose at random two links. Label the four nodes at the end of these two links with a, b, c, and d such that their degrees are ordered as $k_a \ge k_b \ge k_c \ge k_d$
Step 2: Rewiring
Break the selected links and rewire them to form new pairs. Depending on the desired degree correlations the rewiring is done in two ways:
- Step 2A: Assortative
  By pairing the two highest degree nodes (a with b) and the two lowest degree nodes (c with d), we connect nodes with comparable degrees, enhancing the network’s assortative nature.
- Step 2B: Disassortative
  By pairing the highest and the lowest degree nodes (a with d and b with c), we connect nodes with different degrees, enhancing the network’s disassortative nature.

Xulvi-Brunet & Sokolov Algorithm. — Image 7.13
Xulvi-Brunet & Sokolov Algorithm

The algorithm generates networks with *maximal degree correlations*.

**(a)** The basic steps of the algorithm.

**(b)** *k_nn(k)* for networks generated by the algorithm for a scale-free network with N = 1,000, L = 2,500, γ = 3.0.

**(c, d)** A typical network configuration and the corresponding *A_ij* matrix for the maximally assortative network generated by the algorithm, where the rows and columns of *A_ij* were ordered according to increasing node degrees k.

**(e,f)** Same as in (c,d) for a maximally disassortative network.

The *A_ij* matrices (d) and (f) capture the inner regularity of networks with maximal correlations, consisting of blocks of nodes that connect to nodes with similar degree in (d) and of blocks of nodes that connect to nodes with rather different degrees in (f).

By iterating these steps we gradually enhance the network’s assortative (Step 2A) or disassortative (Step 2B) features. If we aim to generate a simple network (free of multi-links), after Step 2 we check whether the particular rewiring leads to multi-links. If it does, we reject it, returning to Step 1.

The correlations characterizing the networks generated by this algorithm converge to the maximal (assortative) or minimal (disassortative) value that we can reach for the given degree sequence (Image 7.13b). The model has no difficulty creating disassortative correlations (Image 7.13e,f). In the assortative limit simple networks display a mixed k_nn(k): assortative for small k and disassortative for high k (Image 7.13b). This is a consequence of structural cutoffs: For scale-free networks the system is unable to sustain assortativity for high k. The observed behavior is reminiscent of the k_nn(k) function of citation networks (Image 7.10j).

The version of the Xulvi-Brunet & Sokolov algorithm introduced in Image 7.13 generates maximally assortative or disassortative networks. We can tune the magnitude of the generated degree correlations if we use the algorithm discussed in Image 7.14.
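A minimal Python sketch of the rewiring procedure follows. It includes the tuning rule of Image 7.14: with probability p the four nodes are paired deterministically, and otherwise they are paired at random. The function name, the edge-list representation, and the choice to skip link pairs that share a node are assumptions of this sketch rather than part of the published algorithm.

```python
import random

def xbs_rewire(edges, steps, mode="assortative", p=1.0, seed=None):
    """Degree-preserving rewiring of a simple undirected network.

    edges: list of (u, v) tuples without self-loops or multi-links.
    Returns a new edge list; the input list is left unchanged.
    """
    if mode not in ("assortative", "disassortative"):
        raise ValueError("mode must be 'assortative' or 'disassortative'")
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    present = {frozenset(e) for e in edges}

    for _ in range(steps):
        # Step 1: choose two links at random.
        i, j = rng.sample(range(len(edges)), 2)
        nodes = list(edges[i] + edges[j])
        if len(set(nodes)) < 4:
            continue  # simplification: skip pairs of links sharing a node
        if rng.random() < p:
            # Step 2: order the nodes so that k_a >= k_b >= k_c >= k_d.
            a, b, c, d = sorted(nodes, key=degree.get, reverse=True)
            if mode == "assortative":
                new = [(a, b), (c, d)]   # Step 2A
            else:
                new = [(a, d), (b, c)]   # Step 2B
        else:
            # With probability 1 - p pair the four nodes at random.
            rng.shuffle(nodes)
            new = [(nodes[0], nodes[1]), (nodes[2], nodes[3])]
        old = {frozenset(edges[i]), frozenset(edges[j])}
        new_keys = {frozenset(e) for e in new}
        # Reject the move if it would create a multi-link.
        if any(key in present and key not in old for key in new_keys):
            continue
        present -= old
        present |= new_keys
        edges[i], edges[j] = new
    return edges
```

Because each of the four nodes loses one link and gains one link, the degree sequence is unchanged by every accepted move; only the pairing of degrees across links is altered.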

In summary, static models, like the configuration or hidden parameter model, are neutral if we allow multi-links, and develop structural disassortativity if we force them to generate simple networks. To generate networks with tunable correlations, we can use, for example, the Xulvi-Brunet & Sokolov algorithm. Important results of this section are (7.16) and (7.18), which offer the analytical form of the degree correlation function for the hidden parameter model and for a growing network, in both cases predicting a power-law k-dependence. These results offer analytical backing for the scaling hypothesis (7.10), indicating that both structural and dynamical effects can result in a degree correlation function that follows a power law.

Tuning Degree Correlations. — Image 7.14
Tuning Degree Correlations

We can use the Xulvi-Brunet & Sokolov algorithm to tune the magnitude of degree correlations.

We execute the deterministic rewiring step with probability p, and with probability 1 − p we randomly pair the a, b, c, d nodes with each other. For p = 1 we are back to the algorithm of Image 7.13, generating maximal degree correlations; for p < 1 the induced noise tunes the magnitude of the effect.

Typical network configurations generated for p = 0.5.

The *k_nn(k)* functions for various p values for a network with N = 10,000, *〈k〉* = 1, and γ = 3.0.

Note that the correlation exponent μ depends on the fitting region, especially in the assortative case.

Section 7.7
The Impact of Degree Correlations

As we have seen in Image 7.10, most real networks are characterized by some degree correlations. Social networks are assortative; biological networks display structural disassortativity. These correlations raise an important question: Why do we care? In other words, do degree correlations alter the properties of a network? And which network properties do they influence? This section addresses these important questions.

An important property of a random network is the emergence of a phase transition at 〈k〉 = 1, marking the appearance of the giant component (SECTION 3.6). Image 7.15 shows the relative size of the giant component for networks with different degree correlations, documenting several patterns [8, 19, 20]:

Assortative Networks
For assortative networks the phase transition point moves to a lower 〈k〉, hence a giant component emerges for 〈k〉 < 1. The reason is that it is easier to start a giant component if the high-degree nodes seek out each other.
Disassortative Networks
The phase transition is delayed in disassortative networks, as in these the hubs tend to connect to small degree nodes. Consequently, disassortative networks have difficulty forming a giant component.
Giant Component
For large 〈k〉 the giant component is smaller in assortative networks than in neutral or disassortative networks. Indeed, assortativity forces the hubs to link to each other, hence they fail to attract to the giant component the numerous small degree nodes.
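The relative size of the giant component referred to above can be measured directly from a link list. The sketch below uses a union-find structure; the function name and the convention that nodes are labeled 0, ..., N − 1 are assumptions made for illustration.

```python
def giant_component_fraction(n, edges):
    """Fraction of the n nodes (labeled 0..n-1) in the largest connected component."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the components joined by each link.
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    # Count the nodes attached to each root.
    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n
```

Applying this function to networks rewired toward assortative or disassortative mixing, for a range of 〈k〉 values, is one way to reproduce the kind of comparison described in the three patterns above.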

Degree Correlations and the Phase Transition Point. — Image 7.15
Degree Correlations and the Phase Transition Point
Relative size of the giant component for an Erdős-Rényi network of size N=10,000 (green curve), which is then rewired using the Xulvi-Brunet & Sokolov algorithm with p = 0.5 to induce degree correlations (Image 7.14). The figure indicates that as we move from assortative to disassortative networks, the phase transition point is delayed and the size of the giant component increases for large *〈k〉*. Each point represents an average over 10 independent runs.

These changes in the size and the structure of the giant component have implications for the spread of diseases [22, 23, 24], the topic of CHAPTER 10. Indeed, as we have seen in Image 7.10, social networks tend to be assortative. The high degree nodes therefore form a giant component that acts as a “reservoir” for the disease, sustaining an epidemic even when on average the network is not sufficiently dense for the virus to persist.

The altered giant component has implications for network robustness as well [25]. As we discuss in CHAPTER 8, the removal of a network's hubs fragments a network. In assortative networks hub removal causes less damage because the hubs form a core group, hence many of them are redundant. Hub removal is more damaging in disassortative networks, as in these the hubs connect to many small-degree nodes, which fall off the network once a hub is deleted.

Let us mention a few additional consequences of degree correlations:

Image 7.16 shows the path-length distribution of a random network rewired to display different degree correlations. It indicates that in assortative networks the average path length is shorter than in neutral networks. The most dramatic difference is in the network diameter, d_max, which is significantly higher for assortative networks. Indeed, assortativity favors links between nodes with similar degree, resulting in long chains of k = 2 nodes, enhancing d_max (Image 7.13c).
Degree correlations influence a system’s stability against stimuli and perturbations [26] as well as the synchronization of oscillators placed on a network [27, 28].
Degree correlations have a fundamental impact on the vertex cover problem [29], a much-studied problem in graph theory that requires us to find the minimal set of nodes (cover) such that each link is connected to at least one node in the cover (BOX 7.4).
Degree correlations impact our ability to control a network, altering the number of input signals one needs to achieve full control [30].
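The path-length comparison in the first item above rests on two measurable quantities: the distribution of shortest-path distances and the diameter d_max. A minimal sketch of how both can be obtained with breadth-first search is given below; the function name and the 0, ..., N − 1 node labeling are assumptions, and the all-pairs search is practical only for small networks.

```python
from collections import deque

def distance_distribution(n, edges):
    """Return ({d: number of node pairs at distance d}, d_max) for an
    undirected network with nodes labeled 0..n-1. Pairs in different
    components are ignored."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    counts = {}
    for source in range(n):
        # Breadth-first search from the source node.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        # Count each unordered pair once.
        for target, d in dist.items():
            if target > source:
                counts[d] = counts.get(d, 0) + 1

    d_max = max(counts) if counts else 0
    return counts, d_max
```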

Degree Correlations and Path Lengths. — Image 7.16
Degree Correlations and Path Lengths
Distance distribution for a random network with size N = 10,000 and *〈k〉* = 3. Correlations are induced using the Xulvi-Brunet & Sokolov algorithm with p = 0.5 (Image 7.14). The plots show that as we move from disassortative to assortative networks, the average path length decreases, indicated by the gradual move of the peaks to the left. At the same time the diameter, *d_max*, grows. Each curve represents an average over 10 independent networks.

In summary, degree correlations are not only of academic interest, but they influence numerous network characteristics and have a discernable impact on many processes that take place on a network.

Image 7.17
The Minimum Cover

Formally, a vertex cover of a network is a set C of nodes such that each link of the network connects to at least one node in C. A minimum vertex cover is a vertex cover of smallest possible size. The figure above shows examples of minimum vertex covers in two small networks, where the set C is shown in purple. We can check that if we turn any of the purple nodes into green nodes, at least one link will not connect to a purple node.

Box 7.4
Vertex Cover and Museum Guards

Imagine that you are the director of an open-air museum located in a large park. You wish to place guards on the crossroads to observe each path. Yet, to save cost you want to use as few guards as possible. How many guards do you need?

Let N be the number of crossroads and m < N the number of guards you can afford to hire. While there are $\binom{N}{m}$ ways of placing the m guards at N crossroads, most configurations leave some paths unsupervised [31].

The number of trials one needs to place the guards so that they cover all paths grows exponentially with N. Indeed, this is one of the six basic NP-complete problems, called the vertex cover problem. The vertex cover of a network is a set of nodes such that each link is connected to at least one node of the set (Image 7.17). NP-completeness means that there is no known algorithm that can identify a minimal vertex cover substantially faster than an exhaustive search, i.e. checking each possible configuration individually. The number of nodes in the minimal vertex cover depends on the network topology, being affected by the degree distribution and degree correlations [29].
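The exhaustive search mentioned above can be written in a few lines, which also makes its cost visible: the loop visits candidate sets of increasing size until one covers every link. This is a sketch for small networks only, with a function name chosen for illustration.

```python
from itertools import combinations

def minimum_vertex_cover(nodes, edges):
    """Return a smallest set of nodes touching every link, by exhaustive search.

    Assumes every link endpoint appears in nodes. The number of candidate
    sets grows exponentially with the number of nodes.
    """
    for size in range(len(nodes) + 1):
        for candidate in combinations(nodes, size):
            cover = set(candidate)
            # A valid cover contains at least one endpoint of every link.
            if all(u in cover or v in cover for u, v in edges):
                return cover
    return set(nodes)
```

In the museum analogy, nodes are crossroads, links are paths, and the returned set gives one placement that uses the fewest guards.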

Section 7.8
Summary

Degree correlations were first discovered in 2001 in the context of the Internet by Romualdo Pastor-Satorras, Alexei Vazquez, and Alessandro Vespignani [4, 5], who also introduced the degree correlation function k_nn(k) and the scaling (7.10). A year later Kim Sneppen and Sergey Maslov used the full p(k_i,k_j), related to the e_ij matrix, to characterize the degree correlations of protein interaction networks [32]. In 2003 Mark Newman introduced the degree correlation coefficient [8, 9] together with the assortative, neutral, and disassortative distinction. These terms have their roots in social sciences [13]:

Assortative mating reflects the tendency of individuals to date or marry individuals that are similar to them. For example, low-income individuals marry low-income individuals and college graduates marry college graduates. Network theory uses assortativity in the same spirit, capturing the degree-based similarities between nodes: In assortative networks hubs tend to connect to other hubs and small-degree nodes to other small-degree nodes. In a network environment we can also encounter the traditional assortativity, when nodes of similar properties link to each other (Image 7.18).

Disassortative mixing, when individuals link to individuals who are unlike them, is also common in some social and economic systems. Sexual networks are perhaps the best example, as most sexual relationships are between individuals of different gender. In economic settings trade typically takes place between individuals of different skills: the baker does not sell bread to other bakers, and the shoemaker rarely fixes other shoemakers' shoes.

Taken together, there are several reasons why we care about degree correlations in networks (BOX 7.5):

Degree correlations are present in most real networks (SECTION 7.5).
Once present, degree correlations change a network’s behavior (SECTION 7.7).
Degree correlations force us to move beyond the degree distribution, representing quantifiable patterns that govern the way nodes link to each other that are not captured by p_k alone.

Politics is Never Neutral. — Image 7.18
Politics is Never Neutral
The network behind the US political blogosphere illustrates the presence of assortative mixing, as used in sociology, meaning that nodes of similar characteristics tend to link to each other. In the map each blue node corresponds to a liberal blog and each red node to a conservative blog. Blue links connect liberal blogs, red links connect conservative blogs, yellow links go from liberal to conservative, and purple from conservative to liberal. As the image indicates, very few blogs link across the political divide, demonstrating the strong assortativity of the political blogosphere.
After [33].

Despite the considerable effort devoted to characterizing degree correlations, our understanding of the phenomenon remains incomplete. For example, while in SECTION 7.6 we offered an algorithm to tune degree correlations, the problem is far from being fully resolved. Indeed, the most accurate description of a network's degree correlations is contained in the e_ij matrix. Generating networks with an arbitrary e_ij remains a difficult task.

Finally, in this chapter we focused on the k_nn(k) function, which captures two-point correlations. In principle higher order correlations are also present in some networks (BOX 7.6). The impact of such three or four point correlations remains to be understood.

Box 7.6
Two-Point, Three-Point Correlations

The complete degree correlations characterizing a network are determined by the conditional probability P(k⁽¹⁾, k⁽²⁾, ..., k^(k)|k) that a node with degree k connects to nodes with degrees k⁽¹⁾, k⁽²⁾, ..., k^(k).

Two-point Correlations
The simplest of these is the two-point correlation discussed in this chapter, being the conditional probability P(k’|k) that a node with degree k is connected to a node with degree k′. For uncorrelated networks this conditional probability is independent of k, i.e. P(k’| k) = k’p_k’ / 〈k〉 [18]. As the empirical evaluation of P(k′|k) in real networks is cumbersome, it is more practical to analyze the degree correlation function k_nn(k) defined in (7.7).
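Since k_nn(k) is the practical handle on two-point correlations, a short sketch of how it can be measured from a link list may be useful. It averages, over all nodes of degree k, the mean degree of their neighbors. The function name is ours, and nodes without links are simply absent from the result.

```python
def degree_correlation_function(edges):
    """Return {k: k_nn(k)} for an undirected network given as a list of (u, v) links."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    degree = {u: len(nb) for u, nb in neighbors.items()}

    sums, counts = {}, {}
    for u, nb in neighbors.items():
        k = degree[u]
        # Average degree of the neighbors of node u.
        avg_neighbor_degree = sum(degree[v] for v in nb) / k
        sums[k] = sums.get(k, 0.0) + avg_neighbor_degree
        counts[k] = counts.get(k, 0) + 1

    # Average over all nodes that share the same degree k.
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

For a star with four leaves the function returns k_nn(1) = 4 and k_nn(4) = 1, the decreasing trend expected of a disassortative network.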

Three-point Correlations
Correlations involving three nodes are determined by P(k⁽¹⁾,k⁽²⁾|k). This conditional probability is connected to the clustering coefficient. Indeed, the average clustering coefficient C(k) [22, 23] can be formally written as the probability that a degree-k node is connected to nodes with degrees k⁽¹⁾ and k⁽²⁾, and that those two are joined by a link, averaged over all the possible values of k⁽¹⁾ and k⁽²⁾

$C(k) = \sum\limits_{k^{(1)} ,k^{(2)} } {P(k^{(1)} ,k^{(2)} |k)p_{k^{(1)} ,k^{(2)} }^k }$

where p^k_{k⁽¹⁾,k⁽²⁾} is the probability that nodes k⁽¹⁾ and k⁽²⁾ are connected, provided that they have a common neighbor with degree k [18]. For neutral networks C(k) is independent of k, following

$C = \frac{{\left( {\left\langle {k^2 } \right\rangle - \left\langle k \right\rangle } \right)^2 }}{{\left\langle k \right\rangle ^3 N}}$

Section 7.9
Homework

Detailed Balance for Degree Correlations
Express the joint probability e_kk', the conditional probability P(k'|k) and the probability q_k, discussed in this chapter, in terms of number of nodes N, average degree 〈k〉, number of nodes with degree k, N_k, and the number of links connecting nodes of degree k and k', E_kk' (note that E_kk' is twice the number of links when k = k'). Based on these expressions, show that for any network we have
$e_{kk'} = q_k P(k'|k)$
Star Network
Consider a star network, where a single node is connected to N – 1 degree one nodes. Assume that N≫1.
1. What is the degree distribution p_k of this network?
2. What is the probability q_k that moving along a randomly chosen link we find at its end a node with degree k?
3. Calculate the degree correlation coefficient r for this network. Use the expressions of e_kk' and P(k'|k) calculated in HOMEWORK 7.1.
4. Is this network assortative or disassortative? Explain why.
Structural Cutoffs
Calculate the structural cutoff k_s for the undirected networks listed in Table 4.1. Based on the plots in Image 7.10, predict for each network whether k_s is larger or smaller than the maximum expected degree k_max. Confirm your prediction by calculating k_max.
Degree Correlations in Erdős-Rényi Networks
Consider the Erdős-Rényi G(N,L) model of random networks, introduced in CHAPTER 3 (BOX 3.1 and SECTION 3.2), where N labeled nodes are connected with L randomly placed links. In this model, the probability that there is a link connecting nodes i and j depends on the existence of a link between nodes l and s.
1. Write the probability that there is a link between i and j, e_ij and the probability that there is a link between i and j conditional on the existence of a link between l and s.
2. What is the ratio of such two probabilities for small networks? And for large networks?
3. What do you obtain for the quantities discussed in 1 and 2 if you use the Erdős-Rényi G(N,p) model?
Based on the results found for 1-3, discuss the implications of using the G(N,L) model instead of the G(N,p) model for generating random networks with a small number of nodes.

Section 7.10
Advanced Topic 7.A
Degree Correlation Coefficient

In BOX 7.2 we defined the degree correlation coefficient r as an alternative measure of degree correlations [8, 9]. The use of a single number to characterize degree correlations is attractive, as it offers a way to compare the correlations observed in networks of different nature and size. Yet, to effectively use r we must be aware of its origin.

The hypothesis behind the correlation coefficient r implies that the k_nn(k) function can be approximated by the linear function

$k_{nn} (k) \sim rk \hspace{20 mm} (7 . 21)$

This is different from the scaling (7.10), which assumes a power law dependence on k. Equation (7.21) raises several issues:

The initial attractiveness model predicts a power law (7.18) or a logarithmic k-dependence (7.20) for the degree correlation function. A similar power law is derived in (7.16) for the hidden parameter model. Consequently, r forces a linear fit to an inherently nonlinear function. This linear dependence is not supported by numerical simulations or analytical calculations. Indeed, as we show in Image 7.19, (7.21) offers a poor fit to the data for both assortative and disassortative networks.
As we have seen in Image 7.10, the dependence of k_nn(k) on k is complex, often changing trends for large k thanks to the structural cutoff. A linear fit ignores this inherent complexity.
The maximally correlated model has a vanishing r for large N, despite the fact that the network maintains its degree correlations (BOX 7.7). This suggests that the degree correlation coefficient has difficulty detecting correlations characterizing large networks.
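To make the discussion of r concrete, the sketch below computes it from a link list, assuming the standard definition of r as the Pearson correlation between the degrees found at the two ends of a link, with each link counted in both directions. The function name is an illustrative choice.

```python
def degree_correlation_coefficient(edges):
    """Pearson correlation between the degrees at the two ends of a link."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Each undirected link contributes both (k_u, k_v) and (k_v, k_u),
    # so the two lists have the same mean and variance.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n
    covariance = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    variance = sum((x - mean) ** 2 for x in xs) / n
    if variance == 0:
        raise ValueError("r is undefined when all link ends have the same degree")
    return covariance / variance
```

A star network gives r = −1 and a four-node chain gives r = −0.5, both disassortative, as every link of the star and two of the three links of the chain join a higher-degree node to a degree-one node.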

Network	N	r	μ
Internet	192,244	0.02	0.56
WWW	325,729	-0.05	-1.11
Power Grid	4,941	0.003	0.0
Mobile Phone Calls	36,595	0.21	0.33
Email	57,194	-0.08	-0.74
Science Collaboration	23,133	0.13	0.16
Actor Network	702,388	0.31	0.34
Citation Network	449,673	-0.02	-0.18
E. Coli Metabolism	1,039	-0.25	-0.76
Protein Interactions	2,018	0.04	-0.1

Table 7.1
Degree Correlations in Reference Networks
The table shows the estimated r and μ for the ten reference networks. Directed networks were made undirected to measure r and μ. Alternatively, we can use the directed correlation coefficient to characterize such directed networks (BOX 7.8).

Relationship Between μ and r

On the positive side, r and μ are not independent of each other. To show this we calculated r and μ for the ten reference networks (TABLE 7.1). The results are plotted in Image 7.20, indicating that μ and r correlate for positive r. Note, however, that this correlation breaks down for negative r. To understand the origin of this behavior, next we derive a direct relationship between μ and r. To be specific we assume the validity of (7.10) and determine the value of r for a network with correlation exponent μ.

We start by determining a from (7.10). We can write the second moment of the degree distribution as

$\left\langle {k^2 } \right\rangle = \left\langle {k_{nn} (k)k} \right\rangle = \sum\limits_k {ak^{\mu + 1} p_k = a\left\langle {k^{\mu + 1} } \right\rangle }$

which leads to

$a = \frac{{\left\langle {k^2 } \right\rangle }}{{\left\langle {k^{\mu + 1} } \right\rangle }}$

We now calculate r for a network with a given μ:

$r = \frac{{\sum\limits_k {kak^\mu q_k } - \frac{{\left\langle {k^2 } \right\rangle ^2 }}{{\left\langle k \right\rangle ^2 }}}}{{\sigma _r^2 }} = \frac{{\sum\limits_k {ak^{\mu + 2} \frac{{p_k }}{{\left\langle k \right\rangle }}} - \frac{{\left\langle {k^2 } \right\rangle ^2 }}{{\left\langle k \right\rangle ^2 }}}}{{\sigma _r^2 }} = \frac{{\frac{{\left\langle {k^2 } \right\rangle }}{{\left\langle {k^{\mu + 1} } \right\rangle }}\frac{{\left\langle {k^{\mu + 2} } \right\rangle }}{{\left\langle k \right\rangle }} - \frac{{\left\langle {k^2 } \right\rangle ^2 }}{{\left\langle k \right\rangle ^2 }}}}{{\sigma _r^2 }} = \frac{1}{{\sigma _r^2 }}\frac{{\left\langle {k^2 } \right\rangle }}{{\left\langle k \right\rangle }}\left( {\frac{{\left\langle {k^{\mu + 2} } \right\rangle }}{{\left\langle {k^{\mu + 1} } \right\rangle }} - \frac{{\left\langle {k^2 } \right\rangle }}{{\left\langle k \right\rangle }}} \right) \hspace{20 mm} (7 . 22)$

For μ = 0 the term in the last parenthesis vanishes, obtaining r = 0. Hence if μ = 0 (neutral network), the network will be neutral based on r as well. For k > 1 (7.22) suggests that for μ > 0 the parenthesis is positive, hence r > 0, and for μ < 0 the parenthesis is negative, hence r < 0. Therefore r and μ predict degree correlations of similar kind.
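The sign argument can be checked numerically by evaluating the last form of (7.22) for a given degree distribution. The sketch below is an illustration under one stated assumption: σ_r² is taken to be the variance of the degree found at the end of a random link, σ_r² = 〈k³〉/〈k〉 − 〈k²〉²/〈k〉², which follows from q_k = k p_k/〈k〉. The function name and the dictionary representation of p_k are ours.

```python
def r_from_mu(pk, mu):
    """Evaluate (7.22) for a degree distribution pk = {k: p_k} (k >= 1) and exponent mu."""
    def moment(s):
        # <k^s> = sum_k k^s p_k
        return sum(k ** s * p for k, p in pk.items())

    k1, k2, k3 = moment(1), moment(2), moment(3)
    # Assumed form of sigma_r^2: variance of k under q_k = k p_k / <k>.
    sigma2 = k3 / k1 - (k2 / k1) ** 2
    bracket = moment(mu + 2) / moment(mu + 1) - k2 / k1
    return (k2 / k1) * bracket / sigma2
```

With this choice of σ_r², setting μ = 1 gives r = 1, the limit in which k_nn(k) is strictly linear in k, while μ = 0 gives r = 0.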

Correlation Between r and μ. — Image 7.20
Correlation Between r and μ
To illustrate the relationship between r and μ, we estimated μ by fitting the *k_nn(k)* function to (7.10), whether or not the power law scaling was statistically significant.

In summary, if the degree correlation function follows (7.10), then the sign of the degree correlation exponent μ will determine the sign of the coefficient r:

$\begin{array}{l} \mu < 0 \to r < 0 \\ \mu = 0 \to r = 0 \\ \mu > 0 \to r > 0 \\ \end{array}$

Directed Networks

To measure correlations in directed networks we must take into account that each node i is characterized by an incoming degree k_i^in and an outgoing degree k_i^out. We therefore define four degree correlation coefficients, r_in,in, r_in,out, r_out,in, r_out,out, capturing all possible combinations between the incoming and outgoing degrees of two connected nodes (Image 7.21a-d). Formally we have [14]

$r_{\alpha ,\beta } = \frac{{\sum\limits_{jk} {jk} \left( {e_{jk}^{\alpha ,\beta } - q_{ \leftarrow j}^\alpha q_{ \to k}^\beta } \right)}}{{\sigma _ \leftarrow ^\alpha \sigma _ \to ^\beta }} \hspace{20 mm} (7 . 23)$

where α and β refer to the in and out indices, q_←j^α is the probability of finding a node with α-degree j by following a random link backward, and q_→k^β is the probability of finding a node with β-degree k by following a random link forward. σ_←^α and σ_→^β are the corresponding standard deviations. To illustrate the use of (7.23), in Image 7.21e we show the four correlation coefficients for the five directed reference networks (TABLE 7.1). Note, however, that for a complete characterization of degree correlations it is desirable to measure the four k_nn(k) functions as well (BOX 7.3).
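Read as a statement about link ends, (7.23) is the Pearson correlation between the α-degree of the node a link leaves and the β-degree of the node it enters. The sketch below implements that reading for a list of directed links (source, target); the function name and argument conventions are assumptions made for illustration.

```python
from math import sqrt

def directed_degree_correlation(edges, alpha, beta):
    """Correlation between the alpha-degree of each link's source and the
    beta-degree of its target; alpha and beta are "in" or "out"."""
    deg = {"in": {}, "out": {}}
    for u, v in edges:
        deg["out"][u] = deg["out"].get(u, 0) + 1
        deg["in"][v] = deg["in"].get(v, 0) + 1

    # One (x, y) sample per directed link.
    xs = [deg[alpha].get(u, 0) for u, _ in edges]
    ys = [deg[beta].get(v, 0) for _, v in edges]

    n = len(edges)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("coefficient undefined: one of the degree samples is constant")
    return covariance / (sigma_x * sigma_y)
```

Calling the function with the four (α, β) combinations yields r_in,in, r_in,out, r_out,in and r_out,out for the same link list.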

Box 7.7
The Problem With Large Networks

The Xulvi-Brunet & Sokolov algorithm helps us calculate the maximal (r_max) and the minimal (r_min) correlation coefficient for a scale-free network, obtaining [21]

$r_{\min } \sim \left\{ \begin{array}{l} - c_1 (\gamma ,k_0 ) \hspace{10 mm} \ for \hspace{5 mm} \ \gamma < 2 \\ - N^{(2 - \gamma )/(\gamma - 1)} \hspace{5 mm} \ for \hspace{5 mm} \ 2 < \gamma < 3 \\ - N^{(\gamma - 4)/(\gamma - 1)} \hspace{5 mm} \ for \hspace{5 mm} \ 3 < \gamma < 4 \\ - c_2 (\gamma ,k_0 ) \hspace{10 mm} \ for \hspace{5 mm} \ 4 < \gamma \\ \end{array} \right.$ $r_{\max } \sim \left\{ \begin{array}{l} - N^{( - \gamma - 2)/(\gamma - 1)} \hspace{5 mm} \ for \hspace{5 mm} \ 2 < \gamma < \gamma _r \\ - N^{ - 1/(\gamma ^2 - 1)} \hspace{5 mm} \ for \hspace{5 mm} \ \gamma _r < \gamma < 3 \\ \end{array} \right.$

where

$\gamma _r \approx \frac{1}{2} + \sqrt {17/4} \approx 2.56$

These expressions indicate that:

For large N both r_min and r_max vanish, even though the corresponding networks were rewired to have maximal correlations. Consequently the correlation coefficient r is unable to capture the correlations present in large networks.
Scale-free networks with γ < 2.6 always have negative r. This is a consequence of structural correlations (SECTION 7.4).

Given r’s limitations, we must inspect k_nn(k) to best characterize a large network's degree correlations.

In summary, the degree correlation coefficient assumes that k_nn(k) scales linearly with k, a hypothesis that lacks numerical and analytical support. Analytical calculations predict the power-law form (7.10) or the weaker logarithmic dependence (7.20). Yet, in general the signs of r and μ do agree. Consequently, we can use r to get a quick sense of the nature of the potential correlations present in a network. Yet, the accurate characterization of the underlying degree correlations requires us to measure k_nn(k).

Directed Correlation. — Image 7.21
Directed Correlation

**(a)-(d)** The purple and green links indicate the α, β indices that define the appropriate correlation coefficient for a directed network.

**(e)** The correlation profile of the five directed networks. While citation networks have negligible correlations, all four correlation coefficients document strong assortative behavior for mobile phone calls and strong disassortative behavior for metabolic networks. The case of the WWW is interesting: while three of its correlation coefficients are close to zero, there is a strong assortative tendency for the *in-out* degree combination.

Section 7.11
Advanced Topic 7.B
Structural Cutoffs

As discussed in SECTION 7.4, the fundamental conflict between the scale-free property and degree correlations leads to a structural cutoff in simple networks. In this section we derive (7.15), calculating how the structural cutoff depends on the system size N [11].

We start by defining

$r_{kk'} = \frac{{E_{kk'} }}{{m_{kk'} }} \hspace{20 mm} (7 . 24)$

where E_kk′ is the number of links between nodes of degrees k and k’ for k≠k’ and twice the number of connecting links for k=k’, and

$m_{kk'} = \min \{ kN_k ,k'N_{k'} ,N_k N_{k'} \} \hspace{20 mm} (7 . 25)$

is the largest possible value of E_kk′. The origin of (7.25) is explained in Image 7.22. Consequently, we can write r_kk’ as

$r_{kk'} = \frac{{E_{kk'} }}{{m_{kk'} }} = \frac{{\left\langle k \right\rangle e_{kk'} }}{{\min \{ kP(k),k'P(k'),NP(k)P(k')\} }} \hspace{20 mm} (7 . 26)$

As m_kk’ is the maximum of E_kk’, we must have r_kk’ ≤ 1 for any k and k’. Strictly speaking, in simple networks degree pairs for which r_kk’ > 1 cannot exist. Yet, for some networks and for some k, k’ pairs r_kk’ is larger than one. This is clearly non-physical and signals some conflict in the network configuration. Hence, we define the structural cutoff k_s as the solution of the equation

$r_{k_s k_s } = 1 \hspace{20 mm} (7 . 27)$
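As an illustration, the ratio r_kk′ can be measured directly on a small graph. The Python sketch below is not taken from the text: the helper names `max_links` and `r_ratio` are hypothetical, and the input is assumed to be an undirected simple graph (no self-loops) given as a list of node pairs.

```python
from collections import Counter, defaultdict


def max_links(k, k2, n_k, n_k2):
    """m_kk' of (7.25): the largest possible E_kk' in a simple network."""
    return min(k * n_k, k2 * n_k2, n_k * n_k2)


def r_ratio(edges):
    """Return {(k, k'): E_kk' / m_kk'} for each degree pair joined by a link."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    n_of = Counter(degree.values())  # N_k: number of nodes with degree k

    # Every link is visited from both ends, so links joining two nodes of the
    # same degree are counted twice, as the definition of E_kk' requires.
    e = Counter()
    for u, nbrs in neighbors.items():
        for v in nbrs:
            e[(degree[u], degree[v])] += 1

    return {
        (k, k2): count / max_links(k, k2, n_of[k], n_of[k2])
        for (k, k2), count in e.items()
    }
```

On the four-node path `[(0, 1), (1, 2), (2, 3)]` the sketch returns a ratio of 1.0 for the degree pair (1, 2) and 0.5 for (2, 2). Since the input is a simple graph, no ratio can exceed one.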

Note that as soon as k > Np_k′ and k′ > Np_k, the last of the three terms in the denominator of (7.26) becomes the smallest. The restriction on multiple links then takes effect, turning the expression for r_kk′ into

$r_{kk'} = \frac{{\left\langle k \right\rangle e_{kk'} }}{{Np_k p_{k'} }} \hspace{20 mm} (7 . 28)$

For scale-free networks, where p_k ∼ k^(−γ), the condition k > Np_k amounts to k^(γ+1) > aN. Hence these conditions are fulfilled in the region k, k′ > (aN)^(1/(γ+1)), where a is a constant that depends on p_k. Note that this value is below the natural cutoff. Consequently this scaling provides a lower bound for the structural cutoff, in the sense that whenever the cutoff of the degree distribution falls below this limit, the condition r_kk′ < 1 is always satisfied.

For neutral networks the joint distribution factorizes as

$e_{kk'} = \frac{{kk'p_k p_{k'} }}{{\left\langle k \right\rangle ^2 }} \hspace{20 mm} (7 . 29)$

Hence, the ratio (7.28) becomes

$r_{kk'} = \frac{{kk'}}{{\left\langle k \right\rangle N}} \hspace{20 mm} (7 . 30)$
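Spelled out, the step from (7.28) to (7.30) is a direct substitution of (7.29). The factors p_k p_k′ cancel, which is why the result no longer carries any trace of the degree distribution:

```latex
r_{kk'}
  = \frac{\langle k \rangle \, e_{kk'}}{N p_k p_{k'}}
  = \frac{\langle k \rangle}{N p_k p_{k'}} \cdot
    \frac{k k' \, p_k p_{k'}}{\langle k \rangle^{2}}
  = \frac{k k'}{\langle k \rangle N}
```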

Therefore, the structural cutoff needed to preserve the condition r_kk’ ≤ 1 has the form [11, 34, 35, 36]

$k_s (N) \sim \left( {\left\langle k \right\rangle N} \right)^{1/2} \hspace{20 mm} (7 . 31)$

which is (7.15). Note that (7.31) is independent of the degree distribution of the underlying network. Consequently, for a scale-free network k_s(N) is independent of the degree exponent γ.
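A quick numerical check makes the scaling concrete. The Python sketch below uses hypothetical function names and assumes a proportionality constant of 1 in (7.31), which the text leaves unspecified. The values N = 10,000 and ⟨k⟩ = 4 are chosen only for illustration.

```python
import math


def neutral_ratio(k, k2, avg_k, n):
    """r_kk' of (7.30) for a neutral network."""
    return k * k2 / (avg_k * n)


def structural_cutoff(avg_k, n):
    """k_s of (7.31), with the proportionality constant assumed to be 1."""
    return math.sqrt(avg_k * n)


# Illustrative values only: N = 10,000 nodes with average degree 4.
k_s = structural_cutoff(4, 10_000)
print(k_s)                                 # 200.0
print(neutral_ratio(k_s, k_s, 4, 10_000))  # 1.0, the condition (7.27)
print(neutral_ratio(400, 400, 4, 10_000))  # 4.0, above the allowed maximum
```

Neither function takes γ as an argument, mirroring the observation that the structural cutoff of a neutral network does not depend on the degree exponent.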

Calculating mkk'. — Image 7.22
Calculating *m_kk'*

The maximum number of links one can have between two groups. The figure shows two groups of nodes, with degrees k=3 and k′=2. The total number of links between these two groups must not exceed:

**(a)** The total number of links available in the k=3 group, which is *kN_k*=9.

**(b)** The total number of links available in the k′=2 group, which is *k′N_k′*=8.

**(c)** The total number of links one can potentially place between the two groups, which is *N_kN_k′*=12 here, since the values above imply N_k=3 and N_k′=4.

In the example shown above the smallest of the three is *k′N_k′*=8 of (b).

Section 7.12
Bibliography

[1] P. Uetz, L. Giot, G. Cagney, T. A. Mansfield, R. S. Judson, J. R. Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart, A. Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch, G. Vijayadamodar, M. Yang, M. Johnston, S. Fields, J. M. Rothberg. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 403: 623–627, 2000.

[2] I. Xenarios, D. W. Rice, L. Salwinski, M. K. Baron, E. M. Marcotte, D. Eisenberg. DIP: the database of interacting proteins. Nucleic Acids Res., 28: 289–29, 2000.

[3] H. Jeong, S.P. Mason, A.-L. Barabási, and Z.N. Oltvai. Lethality and centrality in protein networks. Nature, 411: 41-42, 2001.

[4] R. Pastor-Satorras, A. Vázquez, and A. Vespignani. Dynamical and correlation properties of the Internet. Phys. Rev. Lett., 87: 258701, 2001.

[5] A. Vázquez, R. Pastor-Satorras, and A. Vespignani. Large-scale topological and dynamical properties of Internet. Phys. Rev. E, 65: 066130, 2002.

[6] S.L. Feld. Why your friends have more friends than you do. American Journal of Sociology, 96: 1464–1477, 1991.

[7] E.W. Zuckerman and J.T. Jost. What makes you think you’re so popular? Self evaluation maintenance and the subjective side of the “friendship paradox”. Social Psychology Quarterly, 64: 207–223, 2001.

[8] M. E. J. Newman. Assortative mixing in networks. Phys. Rev. Lett., 89: 208701, 2002.

[9] M. E. J. Newman. Mixing patterns in networks. Phys. Rev. E, 67: 026126, 2003.

[10] S. Maslov, K. Sneppen, and A. Zaliznyak. Detection of topological pattern in complex networks: Correlation profile of the Internet. Physica A, 333: 529-540, 2004.

[11] M. Boguñá, R. Pastor-Satorras, and A. Vespignani. Cut-offs and finite size effects in scale-free networks. Eur. Phys. J. B, 38: 205, 2004.

[12] M. E. J. Newman and Juyong Park. Why social networks are different from other types of networks. Phys. Rev. E, 68: 036122, 2003.

[13] M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: homophily in social networks. Annual Review of Sociology, 27: 415–444, 2001.

[14] J. G. Foster, D. V. Foster, P. Grassberger, and M. Paczuski. Edge direction and the structure of networks. PNAS, 107: 10815, 2010.

[15] A. Barrat and R. Pastor-Satorras. Rate equation approach for correlations in growing network models. Phys. Rev. E, 71: 036127, 2005.

[16] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks. Adv. Phys., 51: 1079, 2002.

[17] J. Berg and M. Lässig. Correlated random networks. Phys. Rev. Lett., 89: 228701, 2002.

[18] M. Boguñá and R. Pastor-Satorras. Class of correlated random networks with hidden variables. Phys. Rev. E, 68: 036112, 2003.

[19] R. Xulvi-Brunet and I. M. Sokolov. Reshuffling scale-free networks: From random to assortative. Phys. Rev. E, 70: 066102, 2004.

[20] R. Xulvi-Brunet and I. M. Sokolov. Changing correlations in networks: assortativity and dissortativity. Acta Phys. Pol. B, 36: 1431, 2005.

[21] J. Menche, A. Valleriani, and R. Lipowsky. Asymptotic properties of degree-correlated scale-free networks. Phys. Rev. E, 81: 046103, 2010.

[22] V. M. Eguíluz and K. Klemm. Epidemic threshold in structured scale-free networks. Phys. Rev. Lett., 89: 108701, 2002.

[23] M. Boguñá and R. Pastor-Satorras. Epidemic spreading in correlated complex networks. Phys. Rev. E, 66: 047104, 2002.

[24] M. Boguñá, R. Pastor-Satorras, and A. Vespignani. Absence of epidemic threshold in scale-free networks with degree correlations. Phys. Rev. Lett., 90: 028701, 2003.

[25] A. Vázquez and Y. Moreno. Resilience to damage of graphs with degree correlations. Phys. Rev. E, 67: 015101R, 2003.

[26] S.J. Wang, A.C. Wu, Z.X. Wu, X.J. Xu, and Y.H. Wang. Response of degree-correlated scale-free networks to stimuli. Phys. Rev. E, 75: 046113, 2007.

[27] F. Sorrentino, M. Di Bernardo, G. Cuellar, and S. Boccaletti. Synchronization in weighted scale-free networks with degree–degree correlation. Physica D, 224: 123, 2006.

[28] M. Di Bernardo, F. Garofalo, and F. Sorrentino. Effects of degree correlation on the synchronization of networks of oscillators. Int. J. Bifurcation Chaos Appl. Sci. Eng., 17: 3499, 2007.

[29] A. Vázquez and M. Weigt. Computational complexity arising from degree correlations in networks. Phys. Rev. E, 67: 027101, 2003.

[30] M. Pósfai, Y.-Y. Liu, J.-J. Slotine, and A.-L. Barabási. Effect of correlations on network controllability. Scientific Reports, 3: 1067, 2013.

[31] M. Weigt and A. K. Hartmann. The number of guards needed by a museum: A phase transition in vertex covering of random graphs. Phys. Rev. Lett., 84: 6118, 2000.

[32] S. Maslov and K. Sneppen. Specificity and stability in topology of protein networks. Science, 296: 910–913, 2002.

[33] L. Adamic and N. Glance. The political blogosphere and the 2004 U.S. election: Divided they blog (2005).

[34] J. Park and M. E. J. Newman. The origin of degree correlations in the Internet and other networks. Phys. Rev. E, 68: 026112, 2003.

[35] F. Chung and L. Lu. Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 6: 125, 2002.

[36] Z. Burda and A. Krzywicki. Uncorrelated random networks. Phys. Rev. E, 67: 046118, 2003.
