Facebook engineering revealed the design behind its humoungous social graph that handles ” thousands of data types and handles over a billion read requests and millions of write requests every second.” Its called TAO (The Associated Objects), a homegrown project that Facebook had initiated in 2009 and is in production for a while. This project has done a massive lifting of load behind Facebook’s key value of connecting the objects and putting them on social graph. All the likes, comments and connections are captured via this TAO. Here is a paper explaining the more details about TAO
TAO graph captures users actions and relations. Relationships as “Liked By”, “Commeted by”, “Friends of” are typed edges (associations). The typed edges are grouped in association lists by their origin. For a reverse lookup, there is also inverse type association captured . Thus, there are two edges created for every association.
Here is an example f such relations (courtsey Facebook engineering post)
The TAO service runs across a collection of server clusters geographically distributed and organized logically as a tree. Separate clusters are used for storing objects and associations persistently, and for caching them in RAM and Flash memory.
Client requests are always sent to caching clusters running TAO servers. In addition to satisfying most read requests from a write-through cache, TAO servers orchestrate the execution of writes and maintain cache consistency among all TAO clusters. We continue to use MySQL to manage persistent storage for TAO objects and associations.
The data set managed by TAO is partitioned into hundreds of thousands of shards. All objects and associations in the same shard are stored persistently in the same MySQL database, and are cached on the same set of servers in each caching cluster
As per Facebook software engineer Mark Marchukov who posted the blog:”There are two tiers of caching clusters in each geographical region. Clients talk to the first tier, called followers. If a cache miss occurs on the follower, the follower attempts to fill its cache from a second tier, called a leader. Leaders talk directly to a MySQL cluster in that region. All TAO writes go through followers to leaders. Caches are updated as the reply to a successful write propagates back down the chain of clusters. Leaders are responsible for maintaining cache consistency within a region. They also act as secondary caches, with an option to cache objects and associations in Flash. Last but not least, they provide an additional safety net to protect the persistent store during planned or unplanned outages.”