Publication details
Publication type:
Article in a journal/collection
Analysis of Clusters in Network Graphs for Personalized Web Search
Source title:
IFAC-PapersOnLine
Volume and issue:
Volume 50, Issue 1
Year of publication:
2017
PageRank is one of the most popular measures of the importance of Web pages, yet its dual nature is not completely understood. By definition, the PageRank of a Web page (node) is the probability of finding a random walk at that node once the process has reached its steady state. On the other hand, the PageRank of a randomly selected page is a random hidden Markov process, because the page's in- and out-degrees are random. Given this stochastic nature, we study the extremal properties of PageRank. To this end we estimate the extremal index of PageRank, which measures the dependence of extremes and plays a fundamental role in extreme value theory. Using the representation of PageRank as a weighted branching process, first introduced in Jelenkovic and Olvera-Cravioto (2010), we propose a nonparametric estimator of the extremal index of PageRank from samples of moderate size. It is based on the representation of the reciprocal of the extremal index as a mean cluster size, where a cluster is a block of data with at least one exceedance over a sufficiently high threshold. As data blocks we propose to use the generations of successors of a root node of the PageRank branching process. A practical advantage is that the extremal index determines the mean first hitting time to reach an influential node with a large PageRank. As an alternative to PageRank, we consider a max-linear model and compare its extremal index and distribution with those of PageRank.
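The cluster-based idea in the abstract can be illustrated with a minimal sketch of a generic blocks-type estimator: the extremal index is taken as the reciprocal of the mean number of exceedances per cluster. This is not necessarily the exact estimator of the paper. The function name `blocks_extremal_index`, the toy PageRank values, and the threshold are all hypothetical, and each inner list is assumed to hold the PageRank values of one generation of the branching process.

```python
from typing import Sequence


def blocks_extremal_index(blocks: Sequence[Sequence[float]], threshold: float) -> float:
    """Generic blocks estimator (illustrative, not the paper's exact method).

    A cluster is a block containing at least one value above `threshold`.
    The estimate is 1 / (mean number of exceedances per cluster).
    """
    # Count exceedances over the threshold inside every block.
    exceedances = [sum(1 for x in block if x > threshold) for block in blocks]
    # Keep only blocks that qualify as clusters.
    cluster_sizes = [n for n in exceedances if n > 0]
    if not cluster_sizes:
        raise ValueError("no block exceeds the threshold; lower the threshold")
    mean_cluster_size = sum(cluster_sizes) / len(cluster_sizes)
    return 1.0 / mean_cluster_size


# Hypothetical toy data: one inner list per generation of successors.
generations = [
    [0.1, 0.9, 0.8],        # two exceedances -> cluster of size 2
    [0.2, 0.1],             # no exceedance   -> not a cluster
    [0.95, 0.3, 0.2, 0.1],  # one exceedance  -> cluster of size 1
]
theta_hat = blocks_extremal_index(generations, threshold=0.7)
```

With these toy values the mean cluster size is 1.5, so the estimate is 1/1.5. In practice the result depends strongly on the choice of threshold, which this sketch leaves to the caller.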
Bibliographic reference:
Markovich N.M. Analysis of Clusters in Network Graphs for Personalized Web Search // IFAC-PapersOnLine. 2017. Volume 50, Issue 1. P. 5345-5350.