Kernel Methods are a Hot Topic in Network Feature Analysis

The kernel trick is a well-known machine learning method for producing a real-valued measure of similarity between data points in any number of settings. Kernel methods for network analysis provide a way of assigning real values to vertices of a graph. These values may correspond to similarity across any number of graphical properties, such as the neighbours two vertices share, or, in a more dynamic context, the influence that a change in the state of one vertex might have on another.

By using the kernel trick it is possible to approximate the distribution of features on the vertices of a graph in a way that respects the graphical relationships between vertices. Kernel-based methods have long been used, for instance, to infer protein function from other proteins within Protein Interaction Networks (PINs).

The general form of such methods is

V \overset{K}{\rightarrow}\mathbb{R}^+ \overset{f}{\rightarrow}F,

where V is the set of vertices of the graph, K is the graph kernel, \mathbb{R}^+ is the non-negative reals, F is the vertex feature of interest (e.g. protein annotations), and f is the function that approximates the relationship between the vertices and their features.

The Graph Laplacian

The best-known basis for the graph kernel K, and the one that has received considerable development in biological network analysis, is the so-called graph Laplacian matrix L. The Laplacian characterises each vertex i of the graph in terms of the adjacency matrix A and the diagonal degree matrix D=\{d_{ii}\}_{i\in V}, via L = D - A. Informally, L characterises the flow into and out of each vertex i: the i^{th} row contains the total flow into i on the diagonal, and the (negated) flow to each other vertex j on the off-diagonal entries L_{ij}.
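As a concrete sketch, the Laplacian can be built directly from this definition with NumPy (the graph here is a small hypothetical example, chosen only for illustration):

```python
import numpy as np

# A small hypothetical undirected graph on 4 vertices:
# edges 0-1, 1-2, 1-3, 2-3.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

D = np.diag(A.sum(axis=1))  # diagonal degree matrix
L = D - A                   # the graph Laplacian

# Each row of L sums to zero: the degree on the diagonal exactly
# balances the -1 entries for that vertex's neighbours.
print(L.sum(axis=1))
```

The row-sum-zero property is what makes L behave like a flow operator: whatever "leaves" a vertex along its edges is accounted for on the diagonal.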

The Diffusion Kernel

Because of its relationship to the Laplacian, a Laplacian-based graph kernel characterises the diffusion or flow of information between vertices, and is related to other measures of vertex influence such as random-walk measures (a property inherited from the Laplacian matrix). Diffusion or heat kernels, as these matrices are called, have been used in a variety of contexts, from analysing protein function to comparing fMRI results. The diffusion kernel K can be written as

K= Ue^{-t\Lambda}U^T,

where U is the eigenvector matrix of L, \Lambda is the diagonal matrix of its eigenvalues, and t is a scaling parameter (increasing t increases the scale or extent of the diffusion process). Approximate feature maps f that use diffusion kernels optimise both for good point approximations at each vertex i and for local agreement of f(i) with its most influential (read: nearby and strongly connected) neighbours.
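A minimal sketch of this computation in NumPy, on a hypothetical path graph (`eigh` applies because L is symmetric for undirected graphs):

```python
import numpy as np

def diffusion_kernel(L, t):
    """Heat/diffusion kernel K = U exp(-t Lambda) U^T of Laplacian L."""
    eigvals, U = np.linalg.eigh(L)  # symmetric eigendecomposition
    return U @ np.diag(np.exp(-t * eigvals)) @ U.T

# Path graph 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
K = diffusion_kernel(L, t=1.0)

# Kernel similarity decays with graph distance from vertex 0:
# K[0,1] > K[0,2] > K[0,3]
```

For large graphs one would not form the full eigendecomposition; this dense version is only to make the formula concrete.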

Relationship with the Heat Equation and Information Diffusion

The diffusion kernel on graphs is in fact part of a broader class of kernels that all relate to the heat equation, a PDE describing the physical propagation of heat through a medium such as a wire. This diffusion occurs over time as well as space (hence the t in the equation above).

This provides an appealing analogy in which the graphical diffusion kernel encodes the propensity for propagation of information (heat) between vertices in the graph. This is a desirable property, for instance, in protein annotation inference, where proteins between which information might propagate most easily are also thought to influence each other biologically. The implication is that increased kernel similarity may indicate an increased probability of shared protein function and thus, for instance, shared functional annotations and biochemical properties.
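To make the analogy concrete, here is a toy, hypothetical sketch of annotation inference: each unlabelled vertex receives a score that is a diffusion-kernel-weighted vote of the labelled vertices. This is a deliberate simplification of the kernel classifiers used in practice, not a published method:

```python
import numpy as np

def predict_annotation(K, labels):
    """labels: +1/-1 for labelled vertices, 0 for unknown.
    Returns a score for every vertex; sign(score) is the prediction."""
    return K @ labels  # kernel-weighted vote of the labelled vertices

# Hypothetical path graph 0-1-2-3-4:
# vertex 0 is annotated +1, vertex 4 is annotated -1.
A = np.diag(np.ones(4), 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
eigvals, U = np.linalg.eigh(L)
K = U @ np.diag(np.exp(-0.5 * eigvals)) @ U.T

labels = np.array([1.0, 0.0, 0.0, 0.0, -1.0])
scores = predict_annotation(K, labels)
# vertex 1 inherits a positive score from its near neighbour 0;
# vertex 3 inherits a negative score from vertex 4;
# vertex 2 sits exactly between the two labels and scores ~0.
```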

Deeper Analogies with the Heat Equation: Heat Spectral Wavelets

The solution of the physical heat equation can be decomposed into a number of frequency components or spectra in the Fourier domain. Each of these frequencies corresponds to a spatial resolution of heat propagation. These too have a graphical analogue, in the form of the heat spectral wavelet. These wavelets function much like the Fourier series they are based on, providing a decomposition of the diffusion kernel K in terms of wavelets \Psi_i, the projection of K onto vertex i. Thus

\Psi_i = K\delta_i,

where \delta_i is the indicator vector that selects out the i^{th} column of K. By varying the scale t it is then possible to explore the influence of every other vertex on vertex i, across multiple scales.
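A sketch of the wavelet extraction and the scale sweep (again on a hypothetical path graph; the function names are illustrative):

```python
import numpy as np

def heat_wavelet(L, i, t):
    """Psi_i = K delta_i: column i of the diffusion kernel at scale t."""
    eigvals, U = np.linalg.eigh(L)
    K = U @ np.diag(np.exp(-t * eigvals)) @ U.T
    delta = np.zeros(L.shape[0])
    delta[i] = 1.0       # indicator vector for vertex i
    return K @ delta     # equivalently K[:, i]

# Path graph 0-1-2-3; sweep scales for vertex 0.
A = np.diag(np.ones(3), 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
for t in (0.1, 1.0, 10.0):
    psi = heat_wavelet(L, 0, t)
    # Small t: mass concentrated at vertex 0.
    # Large t: mass spread across the whole graph.
```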

Applications: Topological and Time Series Analysis

In the same way that K maps V into \mathbb{R}^+, heat spectral wavelets are vectors that map each vertex into a |V|-dimensional space providing multiscale information on the position of vertex i in the graph. This has been used to construct higher-order spatial embeddings of graphs, with interesting applications, e.g. in biological network comparison using Topological Data Analysis (TDA) and other geometric analysis methods. More recently, diffusion kernels have been generalised to the time domain, where each vertex may have a time series associated with it (e.g. a series of expression-level measurements for each protein in the PIN).
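One simple way to realise such a multiscale embedding (a hypothetical sketch, not a specific published construction) is to stack the wavelets at several scales into one feature vector per vertex:

```python
import numpy as np

def multiscale_embedding(L, scales):
    """Concatenate diffusion kernels at several scales t, giving each
    vertex a feature vector of length |V| * len(scales)."""
    eigvals, U = np.linalg.eigh(L)
    kernels = [U @ np.diag(np.exp(-t * eigvals)) @ U.T for t in scales]
    return np.concatenate(kernels, axis=1)  # row i = embedding of vertex i

# Path graph 0-1-2-3 embedded at three scales.
A = np.diag(np.ones(3), 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
X = multiscale_embedding(L, scales=(0.5, 1.0, 2.0))
print(X.shape)  # (4, 12): 4 vertices, 3 scales x 4 entries each
```

Because the embedding is built from the Laplacian alone, structurally equivalent vertices (here, the two endpoints of the path) receive equivalent feature vectors, which is exactly the property exploited in structural-role and TDA-style comparisons.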

Further Reading

Kernel Methods and ‘The Kernel Trick’
Kernel methods in machine learning – kernel machines (2008)
Diffusion kernels on graphs and other discrete structures (2002)

Kernels for Inference of Protein Function & fMRI Analysis
Classifying HCP task-fMRI networks using heat kernels (2016)
Gene function prediction from functional association networks using kernel partial least squares regression (2015)
Diffusion kernel-based logistic regression models for protein function prediction (2006)

Heat Spectral Wavelets and Time Series Analysis
Spectral graph wavelets for structural role similarity in networks (2018)
Tracking network dynamics: A survey using graph distances (2018)
Kernel based reconstruction of space-time functions on dynamic graphs (2017)
