Posts Tagged ‘Twitter’

Twitter Engineering: Introducing FlockDB

October 29, 2011

LESSONS LEARNED

Some helpful patterns fell out of our experience, even though they weren’t goals originally:

USE AGGRESSIVE TIMEOUTS TO CUT OFF THE LONG TAIL.

You can’t ever shake out all the unfairness in the system, so some requests will take an unreasonably long time to finish — way over the 99.9th percentile. If there are multiple stateless app servers, you can just cut a client loose when it has passed a “reasonable” amount of time, and let it try its luck with a different app server.
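As a rough illustration of that idea, the sketch below abandons any attempt that runs past a deadline and moves on to the next server. It is not FlockDB's code: `call_with_failover`, `slow_server`, `fast_server`, and the 50 ms timeout are all hypothetical names and values, and the "servers" are plain Python functions standing in for stateless app servers.

```python
import concurrent.futures
import time


def call_with_failover(servers, request, timeout_s):
    """Try each server in turn, abandoning any attempt that exceeds timeout_s."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(servers)))
    try:
        for server in servers:
            future = pool.submit(server, request)
            try:
                return future.result(timeout=timeout_s)
            except concurrent.futures.TimeoutError:
                # Cut the slow attempt loose and try its luck elsewhere.
                continue
        raise TimeoutError("every server exceeded the timeout")
    finally:
        # Do not wait for abandoned attempts to finish.
        pool.shutdown(wait=False)


def slow_server(request):
    # Hypothetical server stuck in the long tail.
    time.sleep(0.3)
    return "slow:" + request


def fast_server(request):
    # Hypothetical healthy server.
    return "fast:" + request


result = call_with_failover([slow_server, fast_server], "get_followers", timeout_s=0.05)
print(result)
```

The abandoned call keeps running in its worker thread here; in a real service the timeout would more likely be enforced at the socket or RPC layer so the work is actually cancelled.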

MAKE EVERY CASE AN ERROR CASE.

Or, to put it another way, use the same code path for errors as you use in normal operation. Don’t create rarely-tested modules that only kick in during emergencies, when you’re least likely to feel like trying new things. We queue all write operations locally (using Kestrel as a library), and any that fail are thrown into a separate error queue. This error queue is periodically flushed back into the write queue, so that retries use the same code path as the initial attempt.
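The queueing scheme described above might look something like the following minimal sketch. It uses in-memory `collections.deque` objects as stand-ins for Kestrel queues, and `WritePipeline`, `drain`, `flush_errors`, and `flaky_apply` are hypothetical names, not FlockDB's API. The point it tries to show is that a retry is just another pass through the ordinary write path.

```python
from collections import deque


class WritePipeline:
    """Write queue plus error queue; retries reuse the normal write path."""

    def __init__(self, apply_write):
        self.apply_write = apply_write  # function that performs one write
        self.write_queue = deque()
        self.error_queue = deque()

    def enqueue(self, job):
        self.write_queue.append(job)

    def drain(self):
        """Process every queued write; failures land in the error queue."""
        while self.write_queue:
            job = self.write_queue.popleft()
            try:
                self.apply_write(job)
            except Exception:
                self.error_queue.append(job)

    def flush_errors(self):
        """Move failed jobs back into the write queue (run periodically)."""
        while self.error_queue:
            self.write_queue.append(self.error_queue.popleft())


applied = []
failed_once = set()


def flaky_apply(job):
    # Hypothetical backend that fails the first attempt at one job.
    if job == "follow:2" and job not in failed_once:
        failed_once.add(job)
        raise IOError("simulated database hiccup")
    applied.append(job)


pipeline = WritePipeline(flaky_apply)
pipeline.enqueue("follow:1")
pipeline.enqueue("follow:2")
pipeline.drain()         # follow:2 fails and moves to the error queue
pipeline.flush_errors()  # the retry re-enters the normal write queue
pipeline.drain()         # same code path as the first attempt
print(applied)
```

Because `drain` is the only code that ever applies a write, the retry path gets exercised on every normal write as well.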

DO NOTHING AUTOMATICALLY AT FIRST.

Provide lots of gauges and levers, and automate with scripts once patterns emerge. FlockDB measures the latency distribution of each query type across each service (MySQL, Kestrel, Thrift) so we can tune timeouts, and reports counts of each operation so we can see when a client library suddenly doubles its query load (or we need to add more hardware). Write operations that cycle through the error queue too many times are dumped into a log for manual inspection. If it turns out to be a bug, we can fix it, and re-inject the job. If it’s a client error, we have a good bug report.
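A minimal sketch of such gauges, assuming latencies are recorded in milliseconds and percentiles are computed with the simple nearest-rank method. The `Gauges` class and its methods are hypothetical and only illustrate the kind of numbers one might look at when tuning a timeout by hand.

```python
from collections import Counter, defaultdict


class Gauges:
    """Per-(service, query type) latency samples and per-operation counts."""

    def __init__(self):
        self.latencies_ms = defaultdict(list)
        self.op_counts = Counter()

    def record(self, service, query_type, latency_ms):
        self.latencies_ms[(service, query_type)].append(latency_ms)
        self.op_counts[query_type] += 1

    def percentile(self, service, query_type, pct):
        """Nearest-rank percentile; pct is an integer from 1 to 100."""
        samples = sorted(self.latencies_ms.get((service, query_type), []))
        if not samples:
            raise ValueError("no samples recorded")
        # Integer ceiling of pct * n / 100, avoiding float rounding.
        rank = max(1, -(-pct * len(samples) // 100))
        return samples[rank - 1]


gauges = Gauges()
for latency in range(1, 101):
    # Hypothetical samples: 1 ms through 100 ms for one query type.
    gauges.record("mysql", "select", latency)

p99 = gauges.percentile("mysql", "select", 99)
print(p99, gauges.op_counts["select"])
```

A person reads these numbers and picks the timeout; nothing here adjusts a timeout on its own, which matches the advice to hold off on automation until patterns emerge.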

via Twitter Engineering: Introducing FlockDB.

Why the #AskObama Tweet was Garbled on Screen

October 9, 2011

the garbled tweet:

the explanation:

FWIW, this was an intense project to pull off. 1000’s of tweets per minute from Twitter, 8000 requests per second on http://askobama.twitter.com (where the same tweet was also delivered by us and rendered correctly).

We’re not lazy or sloppy… It basically boiled down to this: one server sent down the right header, and the production one didn’t.

Unicode issues are sorta in the class of “gotcha” issue. They happen, you go “oh shit” and fix them right away. Our “oh shit” moment just happened to come at the most intense possible moment….in front of the president, with so many watching.
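The explanation does not say which header or which encodings were involved. As a sketch of how a missing header can garble text, assume the header was `Content-Type` with a `charset` parameter, the tweet bytes were UTF-8, and the client fell back to ISO-8859-1 when no charset was given. `decode_body` and the sample text are hypothetical.

```python
def decode_body(body, content_type, fallback="iso-8859-1"):
    """Decode bytes using the charset from a Content-Type header value.

    The ISO-8859-1 fallback is an assumption made for illustration.
    """
    charset = fallback
    for param in content_type.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "charset" and value:
            charset = value.strip('"')
    return body.decode(charset)


tweet = "économie"
body = tweet.encode("utf-8")  # the bytes are identical in both cases

good = decode_body(body, "text/html; charset=utf-8")  # header names the charset
bad = decode_body(body, "text/html")                   # header omits it

print(good)
print(bad)
```

The bytes on the wire are the same in both calls; only the declared charset differs, which is why the same tweet can render correctly in one place and garbled in another.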

Sources: