Why not LDAP?

March 13, 2009 · Posted in General, LDAP 

I just read Evan Weaver’s blog post about Improving Running Components at Twitter and I just have to ask: “Why not LDAP?”

Let me start by saying that I do not work at Twitter, so I really have no idea how they put things together there. Reading a bunch of articles on the web is not going to give me the full picture, either, so I might just be completely off-base with this post. However, I do know LDAP (very well, actually), and it seems to me that a lot of the performance problems Twitter is constantly trying to solve could be handled well with LDAP.

LDAP is not just a fancy white pages system. If it were, I don’t think Verizon would stake the business of its entire 75 million wireless subscribers on it. LDAP is designed to scale, and it seems perfect for what Twitter is doing (at least for half of it, as messaging is probably appropriate for the other half).

So, from a “backseat driver” perspective, I would suggest that Twitter consider doing the following:

  1. Put all users in LDAP. This means that I have one LDAP record like uid=lucasrockwell,ou=tweeters,dc=twitter,dc=com
  2. Put the people each user is following into groups — one group per user. So, now I have a group: uid=lucasrockwell,ou=groups,dc=twitter,dc=com, which is made up of hundreds to thousands of “uniqueMember” attributes, each pointing to the DN of another user.
  3. Do queries against the groups to figure out who I am following, and who is following me.
  4. Break out the ou=groups and ou=tweeters containers into smaller containers based on username if you have too many to fit on one server. (LDAP was designed for this.)
  5. Use LDAP proxy servers to figure out where the user is located, and send the traffic directly to that server (or cluster of servers).
  6. Of course, run it all in memory. (Again, LDAP was designed to do this. Or, should I say, modern LDAP servers were designed to do this.)
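
To make steps 1 through 4 a little more concrete, here is a rough sketch in Python. It is not a real LDAP client: ToyDirectory is a made-up in-memory stand-in for the ou=groups container, and the per-letter split in partition_container is just one hypothetical way to break up the containers. The DNs follow the examples above, and I am assuming usernames contain only letters, digits, and underscores, so nothing needs escaping in the search filter.

```python
BASE_DN = "dc=twitter,dc=com"  # suffix taken from the example DNs above


def user_dn(username):
    """Step 1: one entry per user under ou=tweeters."""
    return f"uid={username},ou=tweeters,{BASE_DN}"


def group_dn(username):
    """Step 2: one group per user under ou=groups."""
    return f"uid={username},ou=groups,{BASE_DN}"


def followers_filter(username):
    """Step 3: search filter matching every group that lists this user."""
    return f"(uniqueMember={user_dn(username)})"


def partition_container(username, container="tweeters"):
    """Step 4: hypothetical split into one sub-container per first letter."""
    return f"ou={username[0].lower()},ou={container},{BASE_DN}"


class ToyDirectory:
    """In-memory stand-in for the ou=groups container (an assumption, not LDAP)."""

    def __init__(self):
        self.groups = {}  # group DN -> set of uniqueMember DNs

    def follow(self, follower, followee):
        # Adds a uniqueMember value to the follower's own group.
        self.groups.setdefault(group_dn(follower), set()).add(user_dn(followee))

    def following(self, username):
        # Like reading the uniqueMember values of the user's own group entry.
        return sorted(self.groups.get(group_dn(username), set()))

    def followers(self, username):
        # Like searching ou=groups with followers_filter(username);
        # the uid in each returned group DN names a follower.
        target = user_dn(username)
        return sorted(dn for dn, members in self.groups.items() if target in members)
```

The point of the sketch is that "who am I following" is a single read of my own group entry, while "who is following me" is one search across ou=groups on the uniqueMember attribute, which a real server would presumably answer from an index.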

Of course, this does not address the actual tweeting, but for that, I would think they would want to use a messaging system. Tweets do not need to go in LDAP. They could, but that is a lot of writing to a system that is designed for fast reads. But, given the flexibility of LDAP, its ability to do fractional replication, multi-master replication, and logical separation of data into different physical servers, all while making it look like one monolithic system from the outside, I think they could probably solve most of their design needs with LDAP.

If any Twitter people read this: if you have looked at LDAP and deemed it not appropriate, I would love to know why.
