{"id":4515,"date":"2019-02-18T16:34:58","date_gmt":"2019-02-18T16:34:58","guid":{"rendered":"https:\/\/www.blopig.com\/blog\/?p=4515"},"modified":"2019-02-19T13:19:36","modified_gmt":"2019-02-19T13:19:36","slug":"kernel-methods-are-a-hot-topic-in-network-feature-analysis","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2019\/02\/kernel-methods-are-a-hot-topic-in-network-feature-analysis\/","title":{"rendered":"Kernel Methods are a Hot Topic in Network Feature Analysis"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">The kernel trick is a well known method in machine learning for producing a real-valued measure of similarity between data points in any number of settings.  Kernel methods for network analysis provide a way of assigning real values to vertices of the graph. These values may correspond to similarity across any number of graphical properties such as the neighbours they share, or more dynamic context, the influence that change in the state of one vertex might have on another. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">By using the kernel trick it is possible to approximate the distribution of features on the vertices of a graph in a way that respects the graphical relationships between vertices. Kernel based methods have long been used, for instance in inferring protein function from other proteins within Protein Interaction Networks (PINs).<\/p>\n\n\n\n<!--more Continue Reading-->\n\n\n\n<p class=\"wp-block-paragraph\">The general form of such methods is<\/p>\n\n\n\n<p class=\"wp-block-paragraph\" style=\"text-align:center\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=V+%5Coverset%7BK%7D%7B%5Crightarrow%7D%5Cmathbb%7BR%7D%5E%2B+%5Coverset%7Bf%7D%7B%5Crightarrow%7DF%2C&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"V &#92;overset{K}{&#92;rightarrow}&#92;mathbb{R}^+ &#92;overset{f}{&#92;rightarrow}F,\" class=\"latex\" \/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">where <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=V&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"V\" class=\"latex\" \/> are the vertices of the graph, <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"K\" class=\"latex\" \/> is the graph kernel, <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cmathbb%7BR%7D%5E%2B&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;mathbb{R}^+\" class=\"latex\" \/> are the non-negative reals, <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=F&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"F\" class=\"latex\" \/> is the vertex feature of interest (e.g. protein annotations) and <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=f&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"f\" class=\"latex\" \/> is the function that approximates the relationship between the vertices and their features.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The&nbsp;Graph&nbsp;Laplacian<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The most well known basis for the graph kernel <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"K\" class=\"latex\" \/> and the one that has received considerable development in biological networks analysis is the so-called graph Laplacian matrix <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=L&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"L\" class=\"latex\" \/>. The Laplacian characterises each vertex <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i\" class=\"latex\" \/> of the graph in terms of its adjacency matrix <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=A&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"A\" class=\"latex\" \/>, and degree diagonal matrix <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=D%3D%5C%7Bd_%7Bii%7D%5C%7D_%7Bi%5Cin+V%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"D=&#92;{d_{ii}&#92;}_{i&#92;in V}\" class=\"latex\" \/>, <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=L+%3DD-A&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"L =D-A\" class=\"latex\" \/>.  Informally <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=L&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"L\" class=\"latex\" \/> characterises the flow into and out of each vertex <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i\" class=\"latex\" \/> where the <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i%5E%7Bth%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i^{th}\" class=\"latex\" \/> row contains firstly the flow into <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i\" class=\"latex\" \/> on the diagonal and secondly the flow to other vertices on the off diagonals <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=L_%7Bij%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"L_{ij}\" class=\"latex\" \/>.  <\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The&nbsp;Diffusion&nbsp;Kernel<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Because of its relationship the Laplacian, a Laplacian-based graph kernel characterises the diffusion or flow of information between vertices and is related to other measures of vertex influence such as random walk measures (a property inherited from the Laplacian matrix).  Diffusion or heat kernels as these matrices are called, have been used in a variety of contexts, from analysing protein function to comparing fMRI results.  The diffusion kernel K can be written as<\/p>\n\n\n\n<p class=\"wp-block-paragraph\" style=\"text-align:center\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K%3D+Ue%5E%7B-t%5CLambda%7DU%5ET%2C&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"K= Ue^{-t&#92;Lambda}U^T,\" class=\"latex\" \/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">where <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=U&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"U\" class=\"latex\" \/> is the eigenvector matrix of <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=L&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"L\" class=\"latex\" \/> and <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5CLambda&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;Lambda\" class=\"latex\" \/> is the diagonal matrix of eigenvalues, with scaling parameter <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"t\" class=\"latex\" \/> (where increasing <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"t\" class=\"latex\" \/> means increasing the scale or extent of the diffusion process). Approximate feature maps <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=f&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"f\" class=\"latex\" \/> that use diffusion kernels optimise both for good point approximations at a vertex <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i\" class=\"latex\" \/> and also for local agreement of <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=f%28i%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"f(i)\" class=\"latex\" \/> with its most influential (read nearby and strongly connected) neighbours. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Relationship with the Heat Equation and Information Diffusion<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The diffusion kernel on graphs is actually part of a broader class of kernels which all relate to the heat equation, a PDE describing the physical propagation of heat along a wire. This diffusion is usually thought to occur over a period of time as well as space (hence the <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"t\" class=\"latex\" \/> in the above equation). <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This provides an appealing analogy in which the graphical diffusion kernel really encodes the propensity for propagation of information (heat) between vertices in the graph.  This is a desirable property for instance in protein annotation inference where proteins between which information might propagate most easily are also thought to influence each-other biologically. The implication being that increased kernel similarity may indicate increased probability of shared protein function and thus, for instance, shared functional annotations and biochemical properties.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Deeper Analogies with the Heat Equation:&nbsp;Heat&nbsp;Spectral&nbsp;Wavelets<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The solution of the physical heat equation can be decomposed into a number of frequency components or spectra in the Fourier domain. Each of these frequencies correspond to a spatial resolution of heat propagation.  These too have a graphical analogue in the form of the heat spectral wavelet.  These wavelets function just like the Fourier series they are based on, providing a decomposition of the diffusion kernel <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"K\" class=\"latex\" \/> in terms of wavelets <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5CPsi_i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;Psi_i\" class=\"latex\" \/>. These wavelets are the projection of <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"K\" class=\"latex\" \/> onto the vertex <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i\" class=\"latex\" \/>. Thus <\/p>\n\n\n\n<p class=\"wp-block-paragraph\" style=\"text-align:center\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5CPsi_i+%3D+K%5Cdelta_i%2C&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;Psi_i = K&#92;delta_i,\" class=\"latex\" \/><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">where <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cdelta_i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;delta_i\" class=\"latex\" \/> selects out the <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i%5E%7Bth%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i^{th}\" class=\"latex\" \/> column of <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"K\" class=\"latex\" \/>. By scaling <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"t\" class=\"latex\" \/> it is then possible to explore the influence on a vertex from each other vertex and across multiple scales.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Applications: Topological and Time Series Analysis<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the same way that <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=K&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"K\" class=\"latex\" \/> maps <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=V&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"V\" class=\"latex\" \/> into <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Cmathbb%7BR%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"&#92;mathbb{R}\" class=\"latex\" \/>, heat spectral wavelets are vectors that map each vertex into a <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=%7CV%7C&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"|V|\" class=\"latex\" \/> dimensional space that provides multiscale information on the position of vertex <img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/s0.wp.com\/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002\" alt=\"i\" class=\"latex\" \/> in the graph. This has been used to construct higher order spatial embeddings of graphs, which has interesting applications e.g. in biological network comparison using Topological Data Analysis (TDA) and other geometric analysis methods.   More recently, diffusion kernels have been generalised to the time domain where each vertex may have a time series associated with it (e.g. a series of expression level measurements for each protein in the PIN). <\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Further Reading<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Kernel Methods and &#8216;The Kernel Trick&#8217;<br><a href=\"http:\/\/www.kernel-machines.org\/publications\/pdfs\/0701907.pdf\">Kernel methods in machine learning &#8211; kernel machines (2008)<\/a><br><a href=\"http:\/\/people.cs.uchicago.edu\/~risi\/papers\/diffusion-kernels.pdf\">Diffusion kernels on graphs and other discrete Structures<\/a><br><br>Kernels for Inference of Protein Function &amp; fMRI Analysis<br><a href=\"https:\/\/ieeexplore.ieee.org\/abstract\/document\/7552339\/\">Classifying HCP task-fMRI networks&nbsp;using&nbsp;heat kernels&nbsp;(2016)<\/a><br><a href=\"https:\/\/journals.plos.org\/plosone\/article?id=10.1371\/journal.pone.0134668\">Gene function prediction from functional association networks using kernel partial least squares regression&nbsp;(2015)<\/a><br><a href=\"https:\/\/www.liebertpub.com\/doi\/abs\/10.1089\/omi.2006.10.40\">Diffusion kernel-based logistic regression models for protein function prediction&nbsp;(2006)<\/a><br><br>Heat Spectral Wavelets and Time Series Analysis<br><a href=\"https:\/\/openreview.net\/forum?id=rJR2ylbRb\">Spectral graph wavelets for structural role similarity in networks (2018)<\/a><br><a href=\"https:\/\/projecteuclid.org\/euclid.aoas\/1532743483\">Tracking network dynamics: A survey using graph distances (2018)<\/a><br><a href=\"https:\/\/ieeexplore.ieee.org\/abstract\/document\/7979500\">Kernel based reconstruction of space-time functions on dynamic graphs&nbsp;(2017)<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The kernel trick is a well known method in machine learning for producing a real-valued measure of similarity between data points in any number of settings. Kernel methods for network analysis provide a way of assigning real values to vertices of the graph. These values may correspond to similarity across any number of graphical properties [&hellip;]<\/p>\n","protected":false},"author":51,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[138,189,188],"tags":[214,172,164],"ppma_author":[480],"class_list":["post-4515","post","type-post","status-publish","format-standard","hentry","category-journal-club","category-machine-learning","category-networks","tag-function-prediction","tag-machine-learning","tag-networks"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":480,"user_id":51,"is_guest":0,"slug":"jamesw","display_name":"James Wilsenach","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/2328e847cb0e853c3ffb6e135d15dc323b967a2e47d6882eadb744af62eaeb7e?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/4515","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/51"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=4515"}],"version-history":[{"count":4,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/4515\/revisions"}],"predecessor-version":[{"id":4531,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/4515\/revisions\/4531"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=4515"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=4515"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=4515"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=4515"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}