computing and ranking top-k most central nodes in complex networks (top-k centralities)

posted Jun 17, 2014, 12:43 AM by erwan.lemerrer@technicolor.com   [ updated Feb 24, 2015, 2:44 AM ]
Centrality metrics have proven to be of a major interest when analyzing the structure of networks. The size of input networks keeps on increasing, reaching billions of nodes and edges; this calls for fast techniques.
For many applications, such analyses do not require the exact score of each node but aim at identifying the top nodes.

Here is a summarized state of art for top-k nodes computation or ranking:

State of Art (early 2015)
Paper Applicability Idea Guarantees
Fast and exact top-k algorithm for Pagerank
AAAI'13
 PagerankEstimates the lower and upper bounds of PageRank scores to reduce the computational cost. Instead of using the whole graph to compute PageRank scores, obtains subgraphs to compute the estimations by pruning unnecessary nodes and edges in the iterations.Exact computation.
O((n+m+log c log k) t) running time.
Efficient top-k closeness centrality search
ICDE'14
closeness
reuse a previous vertex computation while computing subsequent vertices
Empirical,
running time
Heuristical top-k: fast estimation of centralities in complex networks
IPL'14
expensive centralities
filter the graph through a cheap centrality (e.g. degree), then apply the costly target centrality over the remaining core
empirical, sorts top-k nodes, sqrt(m) complexity reduction of the target centrality
Fast approximation of betweenness centrality through sampling
WSDM'14
betweenness
random sampling of shortest paths
estimated betweenness for top-k nodes within  a multiplicative factor from real values, with controllable probability
Ranking of Closeness Centrality for Large-Scale Social Networks
FAW'08
closeness
compute SSSP for sample nodes, then exact average shortest-path distances for most central nodes
running time, under certain conditions

Online estimating the k central nodes of a network
NSW'11
degree/betweenness/closeness computation on sampled subgraphs, or degree as an alias to betweenness and closeness empirical

Comments/additions are welcome.
Comments