When Facebook unveiled its Graph Search feature earlier this year, it lacked the bite to be the powerful tool it was projected to be. Earlier this month, Facebook packed a punch with an update to Graph Search that lets it search through posts, comments, check-ins and status updates. Why did this feature not ship alongside Graph Search in January? Because Facebook needed to build a system robust enough to withstand the hundreds of terabytes of data generated by the one billion posts pumped into the website daily. Ashoat Tevosyan, an engineer on the Search Quality and Ranking team, wrote a lengthy, in-depth blog post on Facebook detailing how the team went about building the posts search feature. The feature was two years in the making, and Tevosyan describes how the team collected the data, built the index, and then updated and served it. Ranking the results was the final step.
So much data!
Posts, status updates, check-ins, tags on photos: Facebook collected them all and indexed them. The data is updated whenever a post is edited or deleted. About 70 types of data are sorted and indexed by Facebook. The index was built using an HBase cluster, Hadoop jobs and Unicorn, Facebook's search infrastructure. The index had been taking up 700TB of RAM, so Facebook looked for a more efficient way to serve it to users. The solution was to store the majority of the index on solid-state flash memory. If you're a fan of Graph Search and interested in the back-end of things over at Facebook, Tevosyan's post is quite an interesting read. After all, Graph Search is one of Facebook's more successful products in recent times.
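For readers curious about what "indexing posts and keeping the index updated" means in practice, here is a minimal sketch of the general idea, an inverted index that maps each word to the posts containing it. This is a toy illustration only, not Facebook's implementation: Unicorn, HBase and Hadoop operate at a vastly larger scale, and the `PostIndex` class and its method names are hypothetical, made up for this example.

```python
from collections import defaultdict


class PostIndex:
    """Toy in-memory inverted index (hypothetical): term -> set of post ids."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of posts containing it
        self.posts = {}                   # post id -> current text

    def _terms(self, text):
        # Naive tokenizer: lowercase and split on whitespace.
        return set(text.lower().split())

    def upsert(self, post_id, text):
        """Index a new post, or re-index an edited one."""
        self.delete(post_id)  # drop stale postings if the post already exists
        self.posts[post_id] = text
        for term in self._terms(text):
            self.postings[term].add(post_id)

    def delete(self, post_id):
        """Remove a post and all of its postings."""
        old = self.posts.pop(post_id, None)
        if old is None:
            return
        for term in self._terms(old):
            self.postings[term].discard(post_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        """Return ids of posts that contain every term in the query."""
        terms = self._terms(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return result


index = PostIndex()
index.upsert(1, "Dinner at the new ramen place")
index.upsert(2, "Ramen night with friends")
index.upsert(1, "Dinner at the new sushi place")  # post 1 is edited
index.delete(2)                                    # post 2 is deleted
print(index.search("sushi place"))
print(index.search("ramen"))
```

The point of the sketch is that an edit or a deletion has to remove the old postings as well as add new ones, which is why keeping the index current is a problem separate from building it in the first place.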