[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Scale-free networks and small world phenomena (and ZIff's law)



Now that I wound up spending a significant time at the beginning of the class talking about the connectivity of the webgraph, I should
mention to you two important things:


Scale-free networks and Small-world phenomena

Small-world phenomena are essentially the phenomena underlying kevin bacon game and six-degrees of separation (the latter says that given any two random people po and pn in a human population, we can find a sequence of six people p1...pk for small number k (often around 6) such that the p0 knows p1, who know p2, who knows p3 ..etc. and pk knows pn).

Small-world phenomena happen to be an innate property of random networks (i.e., networks where the connectivity of nodes is random) that should be no-more surprising than the fact that given a room with about 20 people, there is more than 50% chance that at least two of them will have the common-birthday (and the probability becomes more than .99 if there are more than 60 people in the room ---see http://en.wikipedia.org/wiki/Birthday_paradox )


The fact that the probability of two randomly selected web pages are connected is around .35 is _related to_ the small-world phenomenon.



----------


At first, when people started thinking of web-graphs, they assumed that these will be random networks. However, they quickly found that in reality, the web-graphs are not randomly connected at all--and that they tend to have relatively few nodes of very high connectivity (e.g. your yahoos and
cnns). Such networks are termed "Scale-free networks" (see http://en.wikipedia.org/wiki/Scale-free_network ; also http://www.computerworld.com/networkingtopics/networking/story/0,10801,75539,00.html for a short description ).


Turns out that small-world phenomena can occur in both scale-free and purely random networks, but scale-free networks tend to be more brittle
than random ones (take out a couple of highly connected nodes and you can drastically reduce the connectivity of the network!)


Now here is the interesting part--for a long time sociologists, epedemiologists etc would consider human connectivity networks to be essentially modelable as random networks. Now, more and more they realize that scale-free networks are much better models for these (as an example, the best way of stopping the spread of a communicable epidemic may be to "neutralize" some of the highly connected individuals in the group (for a more concrete image, replace "communicable epidemic" with a sexually transmitted disease, and "highly connected with" "highly sexually connected"))

--------------

The phenomena of a few nodes dominating connectivity in scale-free networks is related to the phenomena of the popularity of the keyword searches. If you were to rank the specific keyword searches submitted to Google or any other search engine, you will find that the cardinal popularity falls pretty drastically (e.g. number 1 most popular keyword may be getting a billion hits, while the number 10 most popular may be getting less than a 100 hits). This type of "popular boys/girls get all the dates" phenomenon are modeled in terms of Zipfian distributions
(where the nth most popular keyword gets hits that are inversely proportional to n).
http://en.wikipedia.org/wiki/Zipf's_law


All of these have important ramifications to how web and web search work...


nifty, eh?


Rao