re: linux clusters at google etc talk
Here are some random notes on the talk
--He tries a sort of funny "flow"-based explanation of PageRank
(which doesn't quite work, and he has to revert to the random surfer
model anyway).
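For reference, the random surfer model boils down to a simple power iteration over the link graph. A minimal sketch (the tiny link graph and the 0.85 damping factor here are my illustrative assumptions, not from the talk):

```python
# Minimal PageRank power iteration (random surfer model).
# The three-page link graph below is hypothetical, just for illustration.
damping = 0.85  # probability the surfer follows a link vs. jumping to a random page
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
pages = list(links)
n = len(pages)
rank = {p: 1.0 / n for p in pages}  # start uniform

for _ in range(50):  # iterate until ranks converge
    new = {p: (1 - damping) / n for p in pages}
    for src, outs in links.items():
        for dst in outs:
            new[dst] += damping * rank[src] / len(outs)
    rank = new

# Ranks always sum to 1; C ends up highest since both A and B link to it.
```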
--The main talk is about how they use cheap PCs in clusters for hardware
support. The technical takeaway is that, because a web search engine's
index is read-only and queries are independent of one another, it is
quite easy to parallelize the heck out of the problem. An incoming
query is routed to one of N machine clusters by a fast
switcher/load balancer, and is then handled entirely by that cluster. There are
nice photos of these machine clusters.
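Since every cluster holds a full replica, any query can go to any cluster; the balancer just spreads load. A toy sketch (cluster names and the round-robin policy are my assumptions; the talk does not specify the balancing algorithm):

```python
# Toy front-end load balancer: route each incoming query to one of N clusters.
from itertools import cycle

clusters = [f"cluster-{i}" for i in range(4)]
rr = cycle(clusters)  # simple round-robin over the replicated clusters

def route(query: str) -> str:
    """Any cluster can serve any query, since each holds a full replica."""
    return next(rr)

assignments = [route(q) for q in ["linux", "pagerank", "mtbf", "shard"]]
# Queries land on cluster-0, cluster-1, cluster-2, cluster-3 in turn.
```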
--They go with cheap PCs as their workhorses, and deal with machine
failure with lots and lots of replication (of both the index and document
servers). Much of the talk is a justification of why this works
out well for Google dollar-wise.
--A couple of cute jargon words: sharding--spanning a large file over multiple
systems.
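The usual way to span a file or index over machines is to hash each key to a shard; the same key then always lands on the same machine. A minimal sketch (the shard count and key names are my illustrative assumptions):

```python
# Toy sharding: span a large index over multiple machines by hashing keys.
import hashlib

NUM_SHARDS = 8  # hypothetical number of machines holding one piece each

def shard_for(key: str) -> int:
    """Stable hash, so the same key always maps to the same shard."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS

# Every document id deterministically lands on exactly one of the shards.
assert all(0 <= shard_for(f"doc-{i}") < NUM_SHARDS for i in range(100))
```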
--Some interesting observations on scale: if you have disks rated
for a 250,000-hour mean time between failures, and you have 50,000
of them, you expect a disk failure every 5 hours (this sort of scale
argument also comes up in the Haveliwala global clustering
paper).
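The arithmetic behind that observation, assuming independent disk failures at the rated MTBF:

```python
# Back-of-the-envelope from the talk: expected time between disk failures
# across a fleet, assuming each disk fails independently at its rated MTBF.
mtbf_hours = 250_000   # rated mean time between failures for one disk
num_disks = 50_000     # fleet size

hours_per_failure = mtbf_hours / num_disks
# -> 5.0: with 50,000 disks you expect roughly one failure every 5 hours
```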
--He gives some "Google" perspectives on what is good research
and what is bad research on search engines. User modeling and adaptive
search engines are considered good, while the semantic web, deep web, and P2P
are considered bad (the arguments are so-so).
Rao