The origins of exponential random graph models

The article "An Exponential Family of Probability Distributions for Directed Graphs", published by Holland and Leinhardt (1981), laid the foundation for what are now known as exponential random graph models (ERGMs), or p* models, which jointly model the whole adjacency matrix (or graph) X. In this article the authors proposed an exponential family of probability distributions to model P(X=x), where x is a possible realisation of the random matrix X.

The article focuses mainly on directed graphs (although the theory can be extended to undirected graphs). It considers two main effects or patterns. The first is reciprocity, which relates to the appearance of symmetric interactions (X_{ij}=1 \iff X_{ji}=1) (see nodes 3-5 of the Figure below).

[Figure: Stochastic_block_model_directed]

The second is the differential attractiveness of each node in the graph, which relates to the number of interactions each node “receives” (in-degree) and the number of interactions each node “produces” (out-degree) (the Figure below illustrates the differential attractiveness of two groups of nodes).

[Figure: Stochastic_block_model_directed2]

The model of Holland and Leinhardt (1981), called the p1 model, jointly considers the reciprocity of the graph and the differential attractiveness of each node:

p_1(x)=P(X=x) \propto e^{\rho m + \theta x_{**} + \sum_i \alpha_i x_{i*} + \sum_j \beta_j x_{*j}},

where \rho, \theta, \alpha_i, \beta_j are parameters and \alpha_*=\beta_*=0 (identifying constraints: the \alpha_i sum to zero, as do the \beta_j). \rho can be interpreted as the mean tendency towards reciprocation, \theta as the density (size) of the network, \alpha_i as the productivity (out-degree) of node i, and \beta_j as the attractiveness (in-degree) of node j.

The values m, x_{**}, x_{i*} and x_{*j} are, respectively, the number of reciprocated pairs of edges in the observed graph (m=\sum_{i<j}x_{ij}x_{ji}), the number of edges, the out-degree of node i and the in-degree of node j.
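As a minimal sketch of how these four statistics could be computed, the snippet below uses NumPy on a small, made-up adjacency matrix. The function name p1_statistics and the example graph are illustrative assumptions, not part of the original article, and m is taken to count each mutual pair once.

```python
import numpy as np

def p1_statistics(x):
    """Return (m, x_total, out_degree, in_degree) for a 0/1 adjacency matrix x."""
    x = np.asarray(x, dtype=int)
    # m: number of mutual pairs, counting each pair i<j once
    m = int(np.sum(np.triu(x * x.T, k=1)))
    x_total = int(x.sum())       # x_{**}: total number of directed edges
    out_degree = x.sum(axis=1)   # x_{i*}: row sums
    in_degree = x.sum(axis=0)    # x_{*j}: column sums
    return m, x_total, out_degree, in_degree

# Hypothetical 4-node directed graph without self-loops
x = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])
m, x_total, out_degree, in_degree = p1_statistics(x)
print(m, x_total, out_degree, in_degree)
```

Here the pairs (0,1) and (2,3) are mutual, so m=2, while x_{**}=5 counts every directed edge.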

Taking D_{ij}=(X_{ij},X_{ji}), the model assumes that all D_{ij} with i<j are independent.


 

To better understand the model, let’s review its derivation:

Consider the pairs D_{ij}=(X_{ij},X_{ji}),\, i<j, and describe the joint distribution of \{D_{ij}\}_{i<j}, assuming all D_{ij} are statistically independent. This can be done by parameterizing the probabilities

P(D_{ij}=(1,1))=m_{ij} \text{ if } i<j,

P(D_{ij}=(1,0))=a_{ij} \text{ if } i\neq j,

P(D_{ij}=(0,0))=n_{ij} \text{ if } i<j,

where m_{ij}+a_{ij}+a_{ji}+n_{ij}=1,\, \forall \, i<j .

This leads to

P(X=x)=\prod_{i<j} m_{ij}^{x_{ij}x_{ji}} \prod_{i\neq j}a_{ij}^{x_{ij}(1-x_{ji})} \prod_{i<j}n_{ij}^{(1-x_{ij})(1-x_{ji})}    =e^{\sum_{i<j} {x_{ij}x_{ji}} \rho_{ij} + \sum_{i\neq j}{x_{ij}} \theta_{ij}} \prod_{i<j}n_{ij},

where \rho_{ij}=\log\left(\frac{m_{ij}n_{ij}}{a_{ij}a_{ji}}\right) for i<j, and \theta_{ij}=\log\left(\frac{a_{ij}}{n_{ij}}\right) for i\neq j (taking n_{ji}=n_{ij}).
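The equality between the product form and the exponential form can be checked numerically. The sketch below draws arbitrary dyad probabilities and an arbitrary graph (both made up for illustration, with a fixed seed) and evaluates the two expressions. The arrays m, a, n hold m_{ij}, a_{ij}, n_{ij}, with m and n stored symmetrically.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 4

# Hypothetical dyad probabilities: for each pair i<j, draw
# (m_ij, a_ij, a_ji, n_ij) from a Dirichlet so that they sum to one.
m = np.zeros((n_nodes, n_nodes))
a = np.zeros((n_nodes, n_nodes))
n = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    for j in range(i + 1, n_nodes):
        m_ij, a_ij, a_ji, n_ij = rng.dirichlet(np.ones(4))
        m[i, j] = m[j, i] = m_ij
        a[i, j], a[j, i] = a_ij, a_ji
        n[i, j] = n[j, i] = n_ij

# Arbitrary 0/1 adjacency matrix without self-loops
x = rng.integers(0, 2, size=(n_nodes, n_nodes))
np.fill_diagonal(x, 0)

upper = np.triu_indices(n_nodes, k=1)   # index pairs with i<j
off = ~np.eye(n_nodes, dtype=bool)      # index pairs with i != j

mutual = x * x.T               # x_ij * x_ji
asym = x * (1 - x.T)           # x_ij * (1 - x_ji)
null = (1 - x) * (1 - x.T)     # (1 - x_ij) * (1 - x_ji)

# Left-hand side: product over dyad states
product_form = (
    np.prod(m[upper] ** mutual[upper])
    * np.prod(a[off] ** asym[off])
    * np.prod(n[upper] ** null[upper])
)

# Right-hand side: exponential form with rho_ij and theta_ij
rho = np.zeros((n_nodes, n_nodes))
rho[upper] = np.log(m[upper] * n[upper] / (a[upper] * a.T[upper]))
theta = np.zeros((n_nodes, n_nodes))
theta[off] = np.log(a[off] / n[off])
exponent = np.sum(rho[upper] * mutual[upper]) + np.sum(theta[off] * x[off])
exponential_form = np.exp(exponent) * np.prod(n[upper])

print(np.isclose(product_form, exponential_form))
```

The two values agree up to floating-point error because the exponents of n_{ij} in the product form expand to 1 - x_{ij} - x_{ji} + x_{ij}x_{ji}, and these terms are regrouped into \rho_{ij} and \theta_{ij}.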

It can be seen that the parameters \rho_{ij} and \theta_{ij} can be interpreted as the reciprocity and differential attractiveness, respectively. With a bit of algebra we get:

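\exp(\rho_{ij})=[ P(X_{ij}=1|X_{ji}=1)/P(X_{ij}=0|X_{ji}=1) ]/[ P(X_{ij}=1|X_{ji}=0) / P(X_{ij}=0|X_{ji}=0) ],
and
\exp(\theta_{ij})=P(X_{ij}=1|X_{ji}=0)/P(X_{ij}=0|X_{ji}=0).

That is, \exp(\theta_{ij}) is the odds of an edge from i to j when the reverse edge is absent, and \exp(\rho_{ij}) is the factor by which these odds change when the reverse edge is present.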
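The odds interpretation can be verified on a single dyad. The sketch below uses made-up probabilities for one pair (i,j), computes the conditional probabilities of X_{ij} given X_{ji}, and compares the resulting odds with \exp(\theta_{ij}) and \exp(\rho_{ij}).

```python
import math

# Hypothetical dyad probabilities for one pair (i, j); they sum to one
m_ij, a_ij, a_ji, n_ij = 0.3, 0.2, 0.1, 0.4

# Conditional probabilities of X_ij given X_ji
p1_given1 = m_ij / (m_ij + a_ji)   # P(X_ij=1 | X_ji=1)
p0_given1 = a_ji / (m_ij + a_ji)   # P(X_ij=0 | X_ji=1)
p1_given0 = a_ij / (a_ij + n_ij)   # P(X_ij=1 | X_ji=0)
p0_given0 = n_ij / (a_ij + n_ij)   # P(X_ij=0 | X_ji=0)

odds_given1 = p1_given1 / p0_given1   # odds of i->j when j->i is present
odds_given0 = p1_given0 / p0_given0   # odds of i->j when j->i is absent

rho_ij = math.log(m_ij * n_ij / (a_ij * a_ji))
theta_ij = math.log(a_ij / n_ij)

print(math.isclose(math.exp(theta_ij), odds_given0))
print(math.isclose(math.exp(rho_ij), odds_given1 / odds_given0))
```

With these numbers the odds are 0.5 without the reverse edge and 3 with it, so \exp(\rho_{ij}) is their ratio, 6.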

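Now consider the following restrictions:

\rho_{ij}=\rho,\, \forall i<j, and \theta_{ij}=\theta+\alpha_i + \beta_j,\, \forall i\neq j, where \alpha_*=\beta_*=0 .

Substituting these into the exponent gives \sum_{i<j}\rho x_{ij}x_{ji}=\rho m and \sum_{i\neq j}(\theta+\alpha_i+\beta_j)x_{ij}=\theta x_{**}+\sum_i \alpha_i x_{i*}+\sum_j \beta_j x_{*j}, while the factor \prod_{i<j}n_{ij} does not depend on x and is absorbed into the normalising constant. This yields the proposed form of the model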

p_1(x)=P(X=x) \propto e^{\rho m + \theta x_{**} + \sum_i \alpha_i x_{i*} + \sum_j \beta_j x_{*j}}.
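Because the dyads are independent, a graph can be drawn from the p1 model one dyad at a time. The sketch below is one possible way to do this. It assumes the dyad-state weights implied by the definitions of \rho_{ij} and \theta_{ij} above, namely 1 for (0,0), e^{\theta_{ij}} for (1,0), e^{\theta_{ji}} for (0,1) and e^{\rho+\theta_{ij}+\theta_{ji}} for (1,1). The function name sample_p1 and the parameter values are illustrative only.

```python
import numpy as np

def sample_p1(rho, theta, alpha, beta, rng):
    """Draw one directed graph from the p1 model, dyad by dyad."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    g = len(alpha)
    x = np.zeros((g, g), dtype=int)
    for i in range(g):
        for j in range(i + 1, g):
            t_ij = theta + alpha[i] + beta[j]   # theta_ij
            t_ji = theta + alpha[j] + beta[i]   # theta_ji
            # Unnormalised weights of the four dyad states
            weights = np.array([
                1.0,                        # (x_ij, x_ji) = (0, 0)
                np.exp(t_ij),               # (1, 0)
                np.exp(t_ji),               # (0, 1)
                np.exp(rho + t_ij + t_ji),  # (1, 1)
            ])
            state = rng.choice(4, p=weights / weights.sum())
            x[i, j] = int(state in (1, 3))
            x[j, i] = int(state in (2, 3))
    return x

# Hypothetical parameters; alpha and beta each sum to zero
rng = np.random.default_rng(1)
alpha = [0.5, -0.5, 0.0, 0.0]
beta = [0.0, 0.0, 1.0, -1.0]
x = sample_p1(rho=1.0, theta=-1.0, alpha=alpha, beta=beta, rng=rng)
print(x)
```

Larger values of \rho make mutual pairs more likely, while \alpha_i and \beta_j shift the out-degree and in-degree of individual nodes.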

 

 
