Nowadays network comparison is becoming increasingly relevant. Why is this? Mainly because it is a desirable way to compare complex systems that can often be represented as networks.
Network comparison aims to account for properties that are generated by the simultaneous interaction of all units rather than the properties of each single individual. Here are some cases where network comparison could be helpful:
– Showing and highlighting “significant” changes on network evolution. A particular characteristic of interest could be the speed with which information flows.
– Quantifying how “close” two networks are. Even when the networks have a different different number of nodes and edges, or in the case of spatially embedded networks, different scale.
As an example, look at the following two networks. Are the structures of these two road networks similar?
Or what about the structure of these two other networks?
One of the difficulties in comparing networks is that there is no clear way to compare networks as complete whole entities. Network comparison methods only compare certain attributes of the network, among these: density of edges, global clustering coefficient, degree distribution, counts of smaller graphs embedded in the network and others. Here are some ideas and methods that have been used as ways to compare networks.
- Networks can be compared by their global properties and summary statistics, like network density, degree distribution, transitivity, average shortest path length and others. Usually a statistic combining the differences of several global properties was used.
- Another way to compare networks is based on the fit of a statistical network model (eg. ERGM) to each network. Then, the coefficients of the fitted models could be compared. Note that in this case, a good fit of the models would be required.
- Statistics directly built for network comparison via subgraph counts. These statistics do not make any assumptions of the network generation process. For example, Netdis and GCD are two network comparison statistics that try to measure the structural difference between networks. These network comparison measures are based on counts of small subgraphs (3-5 nodes), like triangles, 2-stars, 3-stars, squares, cliques and others (see Figure below). These network comparison statistics create frequency vectors of subgraphs and then compare these frequencies between the networks to obtain an idea of the similarity of the networks relative to their subgraph counts.
- Lastly, another way to indirectly compare networks is via network alignment methods. The objective of these methods is to create a “mapping”, f, from the node set of one network to the node set of another. The following Figure shows two networks, light and dark blue. An alignment of the two networks is shown in red.One of the objectives of an alignment is to maximise the number of conserved interactions, that is, if (u,v) is an edge in the first network and f is an alignment, then an edge (u,v) is conserved if (f(u),f(v)) is an edge in the second network. It can be noted that the edge demarcated by the green oval is a non-conserved interaction.
NETAL is a commonly used network alignment method, although there are several, more recent, network alignment methodologies.
In the end, despite the variety of ways to compare networks, saying that two networks are similar, or different, is not that easy, as all methods face their own particular challenges, like networks that come from the same model but have different node size.