Abstract:
Betweenness centrality measures, once popular mainly among sociologists and psychologists,
have lately seen wide and increasing application across several disciplines. Betweenness
centrality quantifies the importance of a vertex by how frequently it occurs
on shortest paths in a graph. With the rise of big data problems came the need
to analyze large complex networks. There are two reasons why the current state-of-the-art
algorithms for exact computation of a node’s betweenness are not time efficient. The first
is the large size and dynamic nature of the networks: in a large dynamic network,
the centrality scores must be recomputed each time the network changes, which is
expensive. The second is the global character of betweenness
centrality: unlike degree and closeness centrality, computing the betweenness of
a single node is conjectured to be as expensive as computing it for all the nodes in the network.
Most algorithms for computing betweenness centrality assume that the graph is static
and hence are not efficient for dynamic networks. It is sometimes necessary
to calculate betweenness centrality at every stage of a network's transition. With a
large network and the algorithms currently in use, such recalculation becomes impractical. We propose
a technique to update the betweenness centrality of a graph when nodes are added or
deleted. A trivial approach is to recompute the betweenness scores of all the nodes using
a state-of-the-art algorithm [18], but this is highly inefficient in terms of time.
Other known algorithms update the betweenness centrality after the alteration of a link and treat
a node alteration as a series of link alterations. The proposed algorithm does not
iterate over link alterations. In experiments on real graphs, our algorithm computes
betweenness centrality 7 to 412 times faster than the best
known techniques.
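
To make the definition and the trivial baseline concrete, the following minimal Python sketch computes exact betweenness on a small unweighted, undirected graph with a Brandes-style breadth-first accumulation, and then recomputes everything from scratch after a node is added. It illustrates the recompute-from-scratch baseline only, not the update technique proposed here. The function name `betweenness` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from collections import deque


def betweenness(adj):
    """Exact betweenness for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours (illustrative format).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


# Path a - b - c, then node d is attached to c and all scores are recomputed.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
before = betweenness(graph)
graph["d"] = {"c"}
graph["c"].add("d")
after = betweenness(graph)
```

Every node addition here triggers one breadth-first search per node, which is the cost that an incremental update scheme aims to avoid.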
Next, we motivate the problem of efficiently estimating a node’s betweenness without
computing the betweenness of all nodes. We propose a method based on non-uniform node
sampling to estimate the betweenness score of a node. In uniform node sampling, all nodes
are equally likely to be sampled. Here, we instead assign larger probabilities to the
vertices that contribute more to the betweenness of a given node v and smaller probabilities to those that
contribute less. An analysis of the random Erdős-Rényi graphs is performed to establish
the relation between probabilities and distance. We apply our model to estimate a node’s
betweenness in several synthetic and real-world graphs. We compare our method with the
techniques available in the literature and show that it performs several times better
than the currently known techniques.
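
As a rough illustration of the sampling idea, the sketch below estimates the betweenness of one node v by drawing source nodes with non-uniform probabilities and reweighting each sampled dependency by the inverse of its sampling probability, so the estimate stays unbiased. The weighting used here (inversely proportional to the distance from v) is only a hypothetical placeholder and is not the probability model derived in this work. All function names are likewise assumptions, and the graph is taken to be unweighted and undirected.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def dependency_on(adj, s, v):
    """Dependency of source s on node v (Brandes-style accumulation)."""
    if s == v:
        return 0.0
    dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
    queue = deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                preds[w] = []
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
                preds[w].append(u)
    delta = {u: 0.0 for u in order}
    for w in reversed(order):
        for u in preds[w]:
            delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
    return delta.get(v, 0.0)


def estimate_betweenness(adj, v, num_samples, seed=0):
    """Importance-sampling estimate of the betweenness of v."""
    dist = bfs_distances(adj, v)
    candidates = [s for s in adj if s != v and s in dist]
    if not candidates:
        return 0.0
    # Placeholder weighting (assumption): closer sources are sampled more often.
    weights = [1.0 / dist[s] for s in candidates]
    total = sum(weights)
    probs = {s: w / total for s, w in zip(candidates, weights)}
    rng = random.Random(seed)
    samples = rng.choices(candidates, weights=weights, k=num_samples)
    est = sum(dependency_on(adj, s, v) / probs[s] for s in samples) / num_samples
    return est / 2.0  # undirected graph: each pair is seen from both endpoints


# Star graph: the centre lies on the shortest path between every pair of leaves.
star = {"c": {"l1", "l2", "l3", "l4"},
        "l1": {"c"}, "l2": {"c"}, "l3": {"c"}, "l4": {"c"}}
estimate = estimate_betweenness(star, "c", num_samples=10)
```

Each sample costs one breadth-first search, so the running time is governed by the number of samples rather than the number of nodes. On the star graph every leaf has the same dependency and the same probability, so the estimate coincides with the exact value of 6.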
Further, we define a new problem called centrality-ordering. The centrality-ordering
problem is to rank a given subset of k vertices in a graph based on their centrality scores.
We explain the problem using eccentricity-ordering and then focus on betweenness-ordering.
We present a real-world example that motivates the betweenness-ordering
problem. Since computing the betweenness centrality of one node is equivalent in cost to computing
the betweenness centrality of all nodes with the currently known deterministic
algorithms, we are motivated to address the problem of betweenness-ordering of k nodes,
where k is much smaller than the total number of nodes (nearly constant). We use the estimation
algorithm proposed in former work to find the ordering. Experimental results show that
the ordering efficiency is very high even when the algorithm runs in time linear in the number
of edges. Our algorithm outperforms all the sampling-based algorithms considered for
ordering by a wide margin.
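
The centrality-ordering problem can be illustrated with its simplest instance, eccentricity-ordering. The minimal Python sketch below ranks a given subset of vertices by eccentricity, using one breadth-first search per vertex in the subset. It assumes a connected, unweighted, undirected graph and treats a smaller eccentricity as more central; the function names are hypothetical and chosen only for this example.

```python
from collections import deque


def eccentricity(adj, source):
    """Largest hop distance from source to any node (graph assumed connected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def eccentricity_ordering(adj, subset):
    """Rank the vertices of subset from most to least central."""
    scores = {v: eccentricity(adj, v) for v in subset}
    # Assumption: lower eccentricity means higher centrality.
    return sorted(subset, key=lambda v: scores[v])


# Path a - b - c - d - e; rank only three of its five vertices.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
        "d": {"c", "e"}, "e": {"d"}}
ranking = eccentricity_ordering(path, ["a", "c", "b"])
```

For eccentricity, ranking k vertices needs only k searches. Betweenness-ordering is harder because no comparably cheap exact score for a single vertex is known, which is why an estimation algorithm is used there.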