Founded six years after birth of the World Wide Web, Google was a latecomer to search. By the late 1990s Alta Vista and Inktomi, two search engines with an early start, have dominated the search market. Yet, the third mover Google soon not only became the leading search engine, but acquired links at such an incredible rate that by 2000 became the biggest hub of the Web as well. But it didn’t last: In 2011 Facebook, a youngster with Google’s standards, took over as the Web’s biggest node.
The Web’s competitive landscape highlights an important limitation of our modeling framework: None of the network models we encountered so far are able to account for it. Indeed, in the Erdős-Rényi model the identity of the biggest node is driven entirely by chance. The Barabási-Albert model offers a more realistic picture, predicting that each node increases its degree following k(t) ~ t1/2. This means that the oldest node always has the most links, a phenomena called the first mover’s advantage in the business literature. It also means that a late node can never become the largest hubs.
In reality the growth rate of a node does not depend on its age alone. Instead webpages, companies, or actors have intrinsic qualities that influence the rate at which they acquire links. Some show up late and nevertheless grab an extraordinary number of links within a short timeframe. Others rise early yet never quite make it. The goal of this chapter is to understand how the differences in the node’s ability to acquire links affect the network topology. Going beyond this competitive landscape, we also explore how other processes, like node and link deletion (Image 6.1) or the aging of nodes, phenomena frequently observed in real networks, change the way networks evolve and alter their topology. Our goal is to develop a self-consistent theory of evolving networks that can be adjusted at will to predict the dynamics and the topology of a wide range of real networks.
The Garment District is a Manhattan neighborhood located between Fifth and Ninth Avenue, from 34th to 42nd Street. Since the early 20th century it has been the center for fashion manufacturing and design in the United States. The Needle threading a button and the Jewish Tailor, two sculptures located in the heart of the district, pay tribute to the neighborhood’s past.
The garment industry of New York City offers a prominent example of a declining network, helping us understand how the loss of nodes shapes a network’s topology (BOX 6.5). Uncovering the impact of processes like node and link loss on the network topology is one of the goals of this chapter.
Some people have a knack for turning each random encounter into a lasting social link; some companies turn each consumer into a loyal partner; some webpages turn visitors into addicts. A common feature of these successful nodes is some intrinsic property that propels them ahead of the pack. We will call this property fitness.
Fitness is an individual’s gift to turn a random encounter into a lasting friendship; it is a company’s knack to acquire consumers relative to its competition; it is a webpage’s ability to bring us back on a daily basis despite the many other pages that compete for our attention. Fitness may have genetic roots in people, it may be related to innovativeness and management quality in companies and may depend on the content offered by a website.
In the Barabási-Albert model we assumed that a node’s growth rate is determined solely by its degree. To incorporate the role of fitness we assume that preferential attachment is driven by the product of a node’s fitness, η, and its degree k. The resulting model, called the Bianconi-Barabási or the fitness model, consists of the following two steps [2, 3]:
In (6.1) the dependence of Πi on ki captures the fact that higher-degree nodes have more visibility, hence we are more likely to link to them. The dependence of Πi on ηi implies that between two nodes with the same degree, the one with higher fitness is selected with a higher probability. Hence, (6.1) assures that even a relatively young node, with initially only a few links, can acquire links rapidly if it has larger fitness than the rest of the nodes.
We can use the continuum theory to predict each node’s temporal evolution. According to (6.1), the degree of node i changes at the rate
Let us assume that the time evolution of ki follows a power law with a fitness-dependent exponent β(ηi) (
Inserting (6.3) into (6.2) we find that the dynamic exponent satisfies (ADVANCED TOPICS 6.A)
In the Barabási-Albert model we have β = 1/2, hence the degree of each node increases as a square root of time. According to (6.4), in the Bianconi-Barabási model the dynamic exponent is proportional to the node’s fitness, η, hence each node has its own dynamic exponent. Consequently, a node with a higher fitness will increase its degree faster. Given sufficient time, the fitter node will leave behind nodes with a smaller fitness (
In (a)-(d) each curve corresponds to average over 100 independent runs using the same fitness sequence.
The degree distribution of the network generated by the Bianconi- Barabási model can be calculated using the continuum theory (ADVANCED TOPICS 6.A), obtaining
Equation (6.6) is a weighted sum of multiple power-laws, indicating that pk depends on the precise form of the fitness distribution, ρ(η). To illustrate the properties of the model we use (6.4) and (6.6) to calculate β(η) and pk for two fitness distributions
whose numerical solution is C* = 1.255. Consequently, (6.4) predicts that each node i has a different dynamic exponent, β(ηi) = ηi /C*.
Using (6.6) we obtain
predicting that the degree distribution follows a power law with degree exponent γ = 2.255. Yet, we do not expect a perfect power law, but the scaling is affected by an inverse logarithmic correction 1/lnk.
Numerical support for the above predictions is provided in Figures 6.2 and 6.3. The simulations confirm that ki(t) follows a power law for each η and that the dynamical exponent β(η) increases with the fitness η. As Image 6.3a indicates, the measured dynamical exponents are in excellent agreement with the prediction (6.4). Image 6.3b also documents an agreement between (6.8) and the numerically obtained degree distribution.
In (a)-(d) each curve corresponds to average over 100 independent runs using the same fitness sequence.
In summary, the Bianconi-Barabási model can account for the fact that nodes with different internal characteristics acquire links at different rates. It predicts that a node’s growth rate is determined by its fitness η and allows us to calculate the dependence of the degree distribution on the fitness distribution ρ(η).
Measuring a node’s fitness could help us identify web sites that are poised to grow in visibility, research papers that will become influential, or actors on their way to stardom (BOX 6.1). Yet, our ability to determine the fitness is prone to errors. Consider the challenge of assigning fitness to a webpage on sumo wrestling: While a small segment of the population might find sumo wrestling fascinating, most individuals are indifferent to it and some might even find it odd. Hence, different individuals will inevitably assign different fitnesses to the same node.
According to (6.1) fitness is not assigned by any individual, but reflects the network’s collective perception of a node’s importance relative to the other nodes. We can, therefore, determine a node’s fitness by comparing its time evolution to the time evolution of other nodes in the network. In this section we show that if we have dynamical information about the evolution of the individual nodes, the quantitative framework of the BianconiBarabási model allows us to determine the fitness of each node.
To relate a node’s growth rate to its fitness we take the logarithm of (6.3),
where Bi = ln (m/ti β(ηi)) is a time-independent parameter. Hence, the slope of ln k(t,ti,ηi) is a linear function of the dynamical exponent β(ηi). In turn β(ηi) depends linearly on ηi according to (6.4). Therefore, if we can track the time evolution of the degree for a large number of nodes, the distribution of the dynamical exponent β(ηi) will be identical with the fitness distribution ρ(η).
Node fitnesses were systematically measured in the context of the WWW, relying on a dataset that crawled monthly the links of about 22 million web documents for 13 months [9]. While most nodes (documents) did not change their degree during this time frame, 6.5% of nodes showed sufficient changes to determine their dynamical exponent via (6.9). The obtained fitness distribution ρ(η) has an exponential form (Image 6.4), indicating that high fitness nodes are rare.
The shape of the obtained fitness distribution is somewhat unexpected, as one would be tempted to assume that on the web fitness varies widely: For example Google is far more attrative to Web surfers than my personal webpage. Yet the exponential form of ρ(η)ρ indicates that the fitness of Web documents is bounded, varying in a relativelly narrow range. Consequently, the observed large differences in the degree of two web documents is generated by the system’s dynamics: Growth and preferential attachment amplifies the small fitness differences, turning nodes with slightly higher fitness into much bigger nodes.
To illustrate this amplification, consider two nodes that arrived at the same time, but have different fitnesses η2 > η1. According to (6.3) and (6.4), the relative difference between their degrees grows for large t as
While the difference between η2 and η1 may be small, far into the future (large t) the relative difference between their degrees can become quite significant.
In some networks the nodes follow a more complex dynamics than the one predicted by (6.3). To measure their fitness we must first account for their precise growth law. We illustrate this procedure by determining the fitness of a research publication, allowing us to predict its future impact.
While most research papers acquire only a few citations, a small number of publications collect thousands and even tens of thousands of citations [10]. These impact differences mirror differences in the novelty and the relevance of various publications. In general, the probability that a research paper i is cited at time t after publication is [11]
where the paper’s fitness ηi accounts for the perceived novelty and importance of the reported discovery; cit is the cumulative number of citations acquired by paper i at time t after publication, accounting for the fact that well-cited papers are more likely to be cited than less-cited contributions (preferential attachment). The last term in (6.11) captures the fact that new ideas are integrated into subsequent work, hence the novelty of each paper fades with time [11, 12]. Measurements indicate that this decay has the log-normal form
By solving the master equation behind (6.11) we obtain the time-dependent growth of a paper’s citations
where
is the cumulative normal distribution and m, β, and A are global parameters.
Equation (6.13) predicts that the citation history of paper i is characterized by three parameters: the immediacy μi, governing the time for a paper to reach its citation peak and the longevity σi, capturing the decay rate. The most important is the relative fitness ηi′ ≡ ηi β/A, which measures a paper’s importance relative to other papers and determines its ultimate impact (BOX 6.2).
We fit (6.13) to the citation history of individual papers published by a journal to obtain the journal’s fitness distribution (Image 6.5). The measurements indicate that the fitness distribution of the top cell biology journal, Cell, is shifted to the right, indicating that Cell papers tend to have high fitness. Not surprisingly, the journal has one of the highest impact factors of all journals. By comparison the fitness of papers published in Physical Review are shifted to the left, indicating that the journal publishes fewer high fitness papers.
In summary, the framework offered by the Bianconi-Barabási model allows us to determine the fitness of individual nodes and the shape of the fitness distribution ρ(η). The measurements show that the fitness distribution is typically exponentially bounded, meaning that fitness differences between different nodes are small. With time these differences are magnified, resulting in a power law degree distribution of incoming links in the case of the WWW or a broad citation distribution in citation networks.
The fitness distribution of papers published in six journals in 1990. Each paper’s fitness was obtained by fitting (6.13) to the paper’s citation history for a decade long time interval. Two journals are from physics (Physical Review B and Physical Review Letters), one from biology (Cell) and three are interdisciplinary (Nature, Science, and PNAS).
The obtained fitness distributions are shifted relative to each other, indicating that Cell publishes the highest fitness papers, followed by Nature, Science, PNAS, Physical Reviews Letters and Physical Review B. After [11].
In the previous section we found that the Web’s fitness distribution follows a simple exponential (Image 6.4), while the fitness of research papers follows a peaked distribution (Image 6.5). The diversity of the observed fitness distributions raises an important question: How does the network topology depend on the shape of ρ(η)?
Technically, the answer is provided by (6.6) that links pk to ρ(η). Yet, the true impact of the fitness distribution was realized only after the discovery that some networks can undergo Bose-Einstein condensation. In the section we discuss the mapping that lead to this discovery and its consequences for the network topology [15].
We start by establishing a formal link between the Bianconi-Barabási model and a Bose gas, whose properties have been extensively studied in physics (Image 6.7):
Network
A network of six nodes, where each node is characterized by a unique fitness ηi indicated by the node color. The fitnesses are chosen from the fitness distribution ρ(η).
Bose Gas
The mapping assigns an energy level ε to each fitness η, resulting in a Bose gas with random energy levels. A link going from a new node i to node j corresponds to one particle at level εj.
Growth
The network grows by adding a new node, like the orange node with fitness η6. For m=1 the new node connects to the grey node (dashed link), chosen following (6.1). In the Bose gas this corresponds to the addition of a new energy level ε6 (dashed line), and the deposition of a particle at ε1, the energy level of node 1 to which node η6 links to.
If we follow the mathematical consequences of this mapping, we find that in the resulting gas the number of particles on each energy level follows Bose statistics, a formula derived by Satyendra Nath Bose in 1924 (BOX 6.3). Consequently, the links of the fitness model behave like subatomic particles in a quantum gas.
The mapping to a Bose gas is exact and predicts the existence of two distinct phases [15, 16]:
For most fitness distributions the network displays a fit-gets-rich dynamics, meaning that the degree of each node is ultimately determined by its fitness. While the fittest node inevitably becomes the largest hub, in this phase at any moment the degree distribution follows a power law, indicating that the generated network has a scalefree topology. Consequently the largest hub follows (4.18), growing only sublinearly. This hub is closely trailed by a few slightly smaller hubs, with almost as many links as the fittest node (Image 6.9a). The model with uniform fitness distribution discussed in SECTION 6.2 is in this scale-free phase.
The unexpected outcome of the mapping to a Bose gas is the possibility of a Bose-Einstein condensation for some fitness distributions. In a Bose-Einstein condensate all particles crowd to the lowest energy level, leaving the rest of the energy levels unpopulated (BOX 6.4).
In a network Bose-Einstein condensation means that the fittest node grabs a finite fraction of the links, turning into a super-hub (Image 6.9b). The resulting network is not scale-free but has a huband-spoke topology. In this phase the rich-gets-richer process is so dominant that it becomes a qualitatively different winner takes-all phenomenon. Consequently, the network will loose its scale-free nature.
In physical systems Bose-Einstein condensation is induced by lowering the temperature of the Bose gas below some critical temperature (BOX 6.4). In networks, the temperature βT in (6.16) is a dummy variable, disappearing from all topologically relevant quantities, like the degree distribution pk. Hence, the presence or the absence of Bose-Einstein condensation depends only on the form of the fitness distribution ρ(η). For a network to undergo Bose-Einstein condensation, the fitness distribution needs to satisfy the condition
A fitness distribution that leads to a Bose-Einstein condensation is
whereby varying ζ we can induce a Bose-Einstein condensation (Image 6.9). Indeed, whether (6.20) has a solution depends on the functional form of the energy distribution, g(ε), which is determined by the shape of ρ(η). Specifically, if (6.22) has no non-negative solution for a given g(ε), a Bose-Einstein condensation emerges, and a finite fraction of the particles agglomerate at the lowest energy level.
In summary, the precise shape of the fitness distribution determines the topology of a growing network. While fitness distributions like the uniform distribution lead to a scale-free topology, some ρ(η) allow for Bose-Einstein condensation. If a network undergoes a Bose-Einstein condensation, then one or a few nodes grab most of the links. Hence, the rich-gets-richer process responsible for the scale-free state turns into a winner-takes-all phenomenon. The Bose-Einstein condensation has such an obvious impact on a network’s structure that, if present, it is hard to miss: it destroys the hierarchy of hubs characterizing a scale-free network, forcing the network into a star-like hub-and-spoke topology (Image 6.9).
(a,b) A scale-free network (a) and a network that has undergone a Bose-Einstein condensation (b). Both networks were generated by the Bianconi-Barabási model with ρ(η) following (6.22), but with different exponent ζ. Note that in the condensed phase (b) we have two large hubs with comparable size.
(c,d) The energy levels (green lines) and the deposited particles (purple dots) for a network with m=2 and N=1,000. Each energy level corresponds to the fitness of a node of the network shown in (a,b). Each link connected to a node is represented by a particle on the corresponding energy level. As we did not allow multi-links, two highly populated energy levels emerged in (d), indicating that we have two condensates, corresponding to the two hubs seen in (b).
(e,f) The fitness distribution ρ(η), given by (6.22), illustrates the difference in the hape of the two ρ(η) functions. The difference is determined by the parameter ζ, which is (e) ζ = 0.1 and (f) ζ = 10.
The Barabási-Albert model is a minimal model, designed to capture the mechanisms responsible for the emergence of the scale-free property. Consequently, it has several well-known limitations (see also SECTION 5.10):
These limitations have inspired considerable research, clarifying the role of the numerous elementary processes that influence the network topology. In this section we systematically extend the Barabási-Albert model, arriving to a family of evolving network models that can capture the wide range of phenomena known to shape the topology of real networks.
In the Barabási-Albert model an isolated node cannot acquire links, as according to preferential attachment (4.1) the likelihood that a new node attaches to a k=0 node is strictly zero. In real networks even isolated nodes acquire links. Indeed, each new research paper has a finite probability of being cited for the first time; a person that moves to a new city quickly acquires acquaintances. To allow unconnected nodes to acquire links we add a constant to the preferential attachment function (4.1),
Here the constant A is called initial attractiveness. As Π(0) ~ A, initialattractiveness is proportional to the probability that a node acquires its first link in the next time step.
Direct measurement of Π(k) shows that initial attractiveness is present in real networks (Image 6.10). Once present, it has two consequences:
In many networks new links do not only arrive with new nodes but are added between pre-existing nodes. For example, the vast majority of new links on the WWW are internal links, corresponding to newly added URLs between pre-existing web documents. Similarly, virtually all new social/friendship links form between individuals that already have other friends and acquaintances.
Measurements show that in collaboration networks the internal links follow double preferential attachment, i.e. the probability for a new internal link to connect nodes with degrees k and k’ is [20]
To understand the impact of internal links we explore the limiting cases of (6.26):
In many real systems nodes and links can disappear. For example, nodes are deleted from an organizational network when employees leave the company or from the WWW when web documents are removed. At the same time in some networks node removal is virtually impossible (Image 6.11).
The citation history of a research paper by Jan Hendrik Schön published in Science [23] illustrates how difficult it is to remove a node from the citation network. Schön rose to prominence after a series of breakthroughs in the area of semiconductors. His productivity was phenomenal: In 2001 he has coauthored one research paper every eight days, published by the most prominent scientific journals, like Science and Nature.
Soon after Schön published a paper reporting a groundbreaking discovery on single-molecule semiconductors, researchers noticed that he reported for two experiments, carried out at different temperatures, identical noise [24]. The ensuing questions prompted Lucent Technologies, which ran Bell Labs where Schön worked, to start a formal investigation. Eventually Schön admitted falsifying data. Several dozens of his papers, like the one whose citation pattern is shown in this figure, were retracted.
While the papers’ formal retraction lead to a dramatic drop in citations, the papers continue to be cited after their official “deletion” from the literature, as seen in the figure above. This indicates that it is virtually impossible to remove a node from the citation network.
To explore the impact of node removal, we start from the Barabási-Albert model. In each time step we add a new node with m links and with rate r we remove a node. Depending on r, we observe three distinct scaling regimes [25-30]:
The coexistence of node removal with other elementary processes can lead to interesting topological phase transitions. This is illustrated by a simple model in which the network’s growth is governed by (6.23), and we also remove nodes with rate r [30]. The network displays three distinct phases, captured by the phase diagram shown above, whose axes are the node removal rate r and initial attractiveness A:
Subcritical Node Removal: r < r*(A)
If the rate of node removal is under a critical value r*(A), shown as the white line on the figure, the network will be scale-free.
Critical Node Removal: r=r*(A)
Once r reaches a critical value r*(A), the degree distribution turns into a stretched exponential (SECTION 4.A).
Exponential Networks: r> r*(A)
The network looses its scale-free nature, developing an exponential degree distribution.
Therefore, the coexistence of multiple elementary processes in a network can lead to sudden changes in the network topology. To be specific, a continuous increase in the node removal rate leads to a phase transition from a scale-free to an exponential network.
The behavior of a network can be rather complex if node removal coexists with other elementary processes. This is illustrated in Image 6.12, indicating that the joint presence of initial attractiveness and node deletion induces phase transitions between scale-free and exponential networks. Finally, note that node removal is not always random, but can depend on the removed node’s degree (BOX 6.5).
In summary, in most networks nodes can disappear. Yet as long as the network continues to grow, its scale-free nature can persist. The degree exponent depends, however, on the details governing the node removal process.
In the models discussed so far the number of links increased linearly with the number of nodes. In other words, we assumed that L=〈k〉N/2, where 〈k〉 is independent of time. This is a reasonable assumption for many real networks. Yet, for some real networks the number of links grows faster than N, a phenomena called accelerated growth. For example the average degree of the Internet increased from 〈k〉=3.42 in November 1997 to 3.96 by December 1998 [34]; the WWW increased its average degree from 7.22 to 7.86 during a five month interval [35, 36]; in metabolic networks the average degree of the metabolites grows approximately linearly with the number of metabolites [37]. To explore the consequences of accelerated growth let us assume that in a growing network the number of links arriving with each new node follows [38-41]
For θ=0 each new node has the same number of links; for θ>0, however, the network follows accelerated growth.
The degree exponent of the Barabási-Albert model with accelerated growth (6.30) is
Hence, accelerated growth pushes the degree exponent beyond γ=3, making the network more homogenous. For θ=1 the degree exponent diverges, leading to hyper-accelerating growth [39]. In this case 〈k〉 grows linearly with time and the network looses its scale-free nature.
In many real systems nodes have a limited lifetime. For example, actors have a finite professional life span, defined as the period when they act in movies. So do scientists, whose professional lifespan typically corresponds to the time frame during which they continue to publish scientific papers. In these networks nodes do not disappear abruptly, but fade away through a slow aging process, gradually reducing the rate at which they acquire new links [42-45]. Capacity limitations can induce a similar phenomena: If nodes have finite resources to handle links, once they approach their limit, they will stop accepting new links [43].
To understand the impact of aging we assume that the probability that a new node connects to node i is Π(ki,t−ti), where ti is the time node i was added to the network. Hence, t−ti is the node’s age. Aging is often modeled by choosing [42]
where ν is a tunable parameter governing the dependence of the attachment probability on the node’s age. Depending on the value of ν we can distinguish three scaling regimes:
(a-d) A schematic illustration of the expected network topologies for various aging exponents ν in (6.32). In the context of a growing network we assume that the probability to attach to a node is proportional to kτ−ν, where τ is the age of the node. For negative ν nodes prefer to link to the oldest nodes, turning the network into a hub-and-spoke topology. For positive ν the most recent nodes are the most attractive. For large ν the network turns into a chain, as the last (i.e. the youngest) node is always the most attractive for the new node. The network is shown for m=1 for clarity but the degree exponent is independent of m.
(e) The degree exponent γ vs. the aging exponent ν predicted by the analytical solution of the aging model. The purple symbols are the result of simulations, each representing a single network with N=10,000 and m=1. Redrawn after Ref. [42].
In summary, the results discussed in this section indicate that a wide range of elementary processes can affect the structure and the dynamics of a growing network (Image 6.15). These results highlight the true power of the evolving network paradigm: It allows us to address, using a mathematically self-consistent and predictive framework, the impact of various processes on the network topology and evolution.
A summary of the elementary processes discussed in this section and their impact on the degree distribution. Each model is defined as extensions of the Barabási-Albert model.
As we showed in this chapter, rather diverse processes, from fitness to internal links and aging, can influence the structure of real networks. Through them we learned how to use the theory of evolving networks to predict the impact of various elementary events on a network’s topology and evolution. The discussed examples allow us to draw a key conclusion: if we want to understand the structure of a network we must first get its dynamics right. The topology is the bonus of this approach.
The developed tools allow us to reflect on a number of issues that we encountered in the past chapters, from the correct fit to the degree distribution to the role of the different modeling frameworks. Next we briefly discuss some of these issues.
In CHAPTER 4 we discussed the difficulties we encounter when we attempt to fit a pure power law to the degree distribution of a real network. The roots of this problem became obvious in this chapter: If we account for the real dynamical processes that contribute to the evolution of a network, we expect systematic deviations from a pure power law. Indeed, we predicted several analytical forms for the degree distribution:
In most real networks several of the elementary processes discussed in this chapter appear together. For example, in the scientific collaboration network we have sublinear preferential attachment with initial attractiveness and the links can be both external and internal. As researchers have different creativity, fitness also plays a role, hence an accurate model requires us to know the appropriate fitness distribution. Therefore, the degree distribution is expected to display small degree saturation (thanks to initial attractiveness), stretched exponential cutoff at high degrees (thanks to sublinear preferential attachment), and some unknown corrections due to the particular form of the fitness distribution ρ(η).
In general if wish to obtain an accurate fit to the degree distribution, we first need to build a generative model that analytically predicts the functional form of pk. Yet, in many systems developing an accurate theory for pk may be an overkill. It is often sufficient, instead, to establish if we are dealing with an exponentially bounded or a heavy tailed degree distribution (SECTION 4.9), as the system’s properties will be primarily driven by this distinction.
The results of this chapter also allow us to reflect on the role of the network models encountered so far. We can categorize these models into three main classes (Table 6.1):
Static Models
The random network model of Erdős and Rényi (CHAPTER 3) and the small-world network model of Watts and Strogatz (BOX 3.8) have a fixed number of nodes, prompting us to call them static models. They both assume that the role of the network modeler is to place the links between the nodes using some random algorithm. To explore their properties we need to rely on combinatorial graph theory, developed by Erdős and Rényi. Both models predict a bounded degree distribution.
Generative Models
The configuration and the hidden parameter models discussed in SECTION 4.8 generate networks with a predefined degree distribution. Hence, these models are not mechanistic, in the sense that they do not tell us why a network develops a particular degree distribution. Rather, they help us understand how various network properties, from clustering to path lengths, depend on the degree distribution.
Evolving Network Models
These models capture the mechanisms that govern the time evolution of a network. The most studied example is the Barabási-Albert model, but equally insightful are the extensions discussed in this chapter, from the Bianconi-Barabási model to models involving internal links, aging, node and link deletion, or accelerated growth. These models are motivated by the observation that if we correctly capture all microscopic processes that contribute to a network’s evolution, then the network’s topological characteristics follow from that. To explore the properties of the networks generated by them, we need to use dynamical methods like the continuum theory and the rate equation approach.
Each of these modeling frameworks have their important role in network theory. The Erdős-Rényi model allows us to check if a certain network property could be explained by a pure random connectivity pattern. If our interest is limited to the role of the network environment on some phenomena, like spreading processes or network robustness, the generative models offer an excellent starting point. If, however, we want to understand the origin of a network property, we must resort to evolving network models, that capture the processes that built the network in the first place.
| Model Class | Examples | Characteristics |
|---|---|---|
| Static Models | Erdos–Rényi Watts-Strogatz |
|
| Generative Models | Configuration Model Hidden Parameter Model |
|
| Evolving Network Models | Barabási–Albert Model Bianconi-Barabási Model Initial Attractiveness Model Internal Links Model Node Deletion Model Accelerated Growth Model Aging Model |
|
Calculate the degree exponent of the directed Barabási-Albert model with accelerated growth, assuming that the degree of the newly arriving nodes increases in time as m(t) = tΘ.
In the t-party gender play no role, hence each newcomer is allowed to invite only one other participants to a dance. However, attractiveness plays a role: More attractive participants are more likely to be invited to a dance by a new participant. The party evolves following these rules:
where 〈η〉 is the average attractiveness.
Consider the Bianconi-Barabási model with two distinct fitnesses, η = a and η = 1. To be specific, let us assume that the fitness follows the double delta distribution
Assume that the growth of a network is governed by preferential attachment with additive fitness where a different ηi is assigned to each node, chosen from a ρ(η) fitness distribution. Calculate and discuss the degree distribution of the resulting network.
The purpose of this section is to derive the degree distribution of the Bianconi-Barabási model [2, 15, 16,17]. We start by calculating
over all possible realizations of the quenched fitnesses η. Since each node is born at a different time t0, we can write the sum over j as an integral over t0
By replacing kη(t, t0) with (6.3) and performing the integral over t0, we obtain
The dynamic exponent β(η) is bounded, i.e. 0 < β(η) < 1, because a node can only increase its degree with time (β(η)>0) and ki(t) cannot increase faster than t (β(η) < 1). Therefore in the limit t→∞ in (6.35) the term tβ(n) can be neglected compared to t, obtaining
where ε = (1 − maxηβ(η)) > 0 and
Using (6.36) and the notation kη=kη(t, t0, η) we write the dynamic equation (6.2) as
which has a solution of the form (6.3), given that
confirming the self-consistent nature of the assumption (6.3).
To complete the calculation we need to determine C from (6.37). After substituting β(n) with η/C , we obtain
where ηmax is the maximum possible fitness in the system. The integral (6.40) is singular. However, since β(η)=η/C < 1 for any η, we have C > ηmax, thus the integration limit never reaches the singularity. Note also that since
we have C ≤ 2ηmax.
If there is a single dynamic exponent β, the degree distribution follows the power law pk ~ k−γ with degree exponent γ=1/β+1. In the Bianconi-Barabási model we have a spectrum of dynamic exponents β(η), thus pk is a weighted sum over different power-laws.
To determine the degree distribution in the large N limit, we first calculate the number of nodes with fitness η and with degree greater than k, i.e. those that satisfy kη(t) > k. Using (6.3) we find that this condition implies
Exactly one node is added at each time step and each node has probability ρ(η)dη to have fitness η. Therefore t(m/k)C/ηρ(η)dη nodes satisfy condition (6.42). To obtain the cumulative distribution function (the probability that a random node i has degree smaller or equal to k), we write
where the last equation is valid asymptotically, for large t. The probability density function for the degree distribution is
recovering (6.6).
[1] A.L. Barabási. Linked: The New Science of Networks. Perseus, Boston, 2001.
[2] G. Bianconi and A.-L. Barabási. Competition and multiscaling in evolving networks. Europhysics Letters, 54: 436-442, 2001.
[3] A.-L. Barabási, R. Albert, H. Jeong, and G. Bianconi. Power-law distribution of the world wide web. Science, 287: 2115, 2000.
[4] P.L. Krapivsky and S. Redner. Statistics of changes in lead node in connectivity-driven networks. Phys. Rev. Lett., 89:258703, 2002.
[5] C. Godreche and J. M. Luck. On leaders and condensates in a growing network. J. Stat. Mech., P07031, 2010.
[6] J. H. Fowler, C. T. Dawes, and N. A. Christakis. Model of Genetic Variation in Human Social Networks. PNAS, 106: 1720-1724, 2009.
[7] M. O. Jackson. Genetic influences on social network characteristics. PNAS, 106:1687–1688, 2009.
[8] S.A. Burt. Genes and popularity: Evidence of an evocative gene environment correlation. Psychol. Sci., 19:112–113, 2008.
[9] J. S. Kong, N. Sarshar, and V. P. Roychowdhury. Experience versus talent shapes the structure of the Web. PNAS, 105:13724-9, 2008.
[10] A.-L. Barabási, C. Song, and D. Wang. Handful of papers dominates citation. Nature, 491:40, 2012.
[11] D. Wang, C. Song, and A.-L. Barabási. Quantifying Long term scientific impact. Science, 342:127-131, 2013.
[12] M. Medo, G. Cimini, and S. Gualdi. Temporal effects in the growth of networks. Phys. Rev. Lett., 107:238701, 2011.
[13] C. Venter et al. The sequence of the human genome. Science, 291:1304-1351, 2001.
[14] A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509-512, 1999.
[15] G. Bianconi and A.-L. Barabási. Bose-Einstein condensation in complex networks. Phys. Rev. Lett., 86: 5632–5635, 2001.
[16] C. Borgs, J. Chayes, C. Daskalakis, and S. Roch. First to market is not everything: analysis of preferential attachment with fitness. STOC’07, San Diego, California, 2007.
[17] S. N. Dorogovtsev, J.F.F. Mendes, and A.N. Samukhin. Structure of growing networks with preferential linking. Phys. Rev. Lett., 85: 4633, 2000.
[18] C. Godreche, H. Grandclaude, and J.M. Luck. Finite-time fluctuations in the degree statistics of growing networks. J. of Stat. Phys., 137:1117- 1146, 2009.
[19] Y.-H. Eom and S. Fortunato. Characterizing and Modeling Citation Dynamics. PLoS ONE, 6: e24926, 2011.
[20] A.-L. Barabási, H. Jeong, Z. Néda, E. Ravasz, A. Schubert, and T. Vicsek. Evolution of the social network of scientific collaborations. Physica A, 311: 590-614, 2002.
[21] R. Albert, and A.-L. Barabási. Topology of evolving networks: local events and universality. Phys. Rev. Lett., 85:5234-5237, 2000.
[22] G. Goshal, L. Chi, and A.-L Barabási. Uncovering the role of elementary processes in network evolution. Scientific Reports, 3:1-8, 2013.
[23] J.H. Schön, Ch. Kloc, R.C. Haddon, and B. Batlogg. A superconducting field-effect switch. Science, 288: 656–8. 2000.
[24] D. Agin. Junk Science: An Overdue Indictment of Government, Industry, and Faith Groups That Twist Science for Their Own Gain. Macmillan, New York, 2007.
[25] S. Saavedra, F. Reed-Tsochas, and B. Uzzi. Asymmetric disassembly and robustness in declining networks. PNAS, 105:16466–16471, 2008.
[26] F. Chung and L. Lu. Coupling on-line and off-line analyses for random power-law graphs. Int. Math., 1: 409-461, 2004.
[27] C. Cooper, A. Frieze, and J. Vera. Random deletion in a scalefree random graph process. Int. Math. 1, 463-483, 2004.
[28] S. N. Dorogovtsev and J. Mendes. Scaling behavior of developing and decaying networks. Europhys. Lett., 52: 33-39, 2000.
[29] C. Moore, G. Ghoshal, and M. E. J. Newman. Exact solutions for models of evolving networks with addition and deletion of nodes. Phys. Rev. E, 74: 036121, 2006.
[30] H. Bauke, C. Moore, J. Rouquier, and D. Sherrington. Topological phase transition in a network model with preferential attachment and node removal. The European Physical Journal B, 83: 519-524, 2011.
[31] M. Pascual and J. Dunne, (eds). Ecological Networks: Linking Structure to Dynamics in Food Webs. Oxford Univ Press, Oxford, 2005.
[32] R. Sole and J. Bascompte. Self-Organization in Complex Ecosystems. Princeton University Press, Princeton, 2006.
[33] U. T. Srinivasan, J. A. Dunne, J. Harte, and N. D. Martinez. Response of complex food webs to realistic extinction sequencesm. Ecology, 88:671– 682, 2007.
[34] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. ACM SIGCOMM Computer Communication Review, 29: 251-262, 1999.
[35] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, and A. Tomkins. Graph structure in the web. Computer Networks, 33: 309-320, 2000.
[36] J. Leskovec, J. Kleinberg, and C. Faloutsos, Graph evolution: Densification and shrinking diameters. ACM TKDD07, ACM Transactions on Knowledge Discovery from Data, 1:1, 2007.
[37] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A.-L. Barabási. The large-scale organization of metabolic networks. Nature, 407: 651–655, 2000.
[38] S. Dorogovtsev and J. Mendes. Effect of the accelerating growth of communications networks on their structure. Phys. Rev. E, 63: 025101(R), 2001.
[39] M. J. Gagen and J. S. Mattick. Accelerating, hyperaccelerating, and decelerating networks. Phys. Rev. E, 72: 016123, 2005.
[40] C. Cooper and P. Prałat. Scale-free graphs of increasing degree. Random Structures & Algorithms, 38: 396–421, 2011.
[41] N. Deo and A. Cami. Preferential deletion in dynamic models of web-like networks. Inf. Proc. Lett., 102: 156-162, 2007.
[42] S.N. Dorogovtsev and J.F.F. Mendes. Evolution of networks with aging of sites. Phys. Rev. E, 62:1842, 2000.
[43] A.N. Amaral, A. Scala, M. Barthélémy, and H.E. Stanley. Classes of small-world networks. Proc. National Academy of Sciences USA, 97: 11149, 2000.
[44] K. Klemm and V. M. Eguiluz. Highly clustered scale free networks. Phys. Rev. E, 65: 036123, 2002.
[45] X. Zhu, R. Wang, and J.-Y. Zhu. The effect of aging on network structure. Phys. Rev. E, 68: 056121, 2003.