Tricks for Distributed System Debugging and Diagnosis

Philip Zeyliger (Cloudera)
Hadoop in Practice, Great America Ballroom K

Distributed systems make for tricky diagnosis problems. Which component is at fault? Is it the network, the machine, the process, or, even worse, some emergent complex behavior? We’ll discuss in depth the tools most commonly used to figure out these thorny issues:

Logs. The perennial go-to. We’ll point out unexpected ways to increase logging verbosity at runtime and how to use time-based search across many machines to spot inconsistencies.
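As a rough illustration of time-based search across machines, the sketch below interleaves per-host log lines by timestamp and keeps only a chosen window. It assumes every line starts with a log4j-style timestamp (`YYYY-MM-DD HH:MM:SS,mmm`) and that each host's log is already in time order; the function names and the host names in the usage note are hypothetical, and continuation lines such as stack traces are not handled.

```python
import heapq
from datetime import datetime

# Assumption: each line starts with "YYYY-MM-DD HH:MM:SS,mmm" (log4j-style).
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # number of characters in the timestamp prefix


def parse_line(host, line):
    """Return (timestamp, host, line) for one log line."""
    ts = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    return ts, host, line.rstrip("\n")


def merged_window(logs_by_host, start, end):
    """Interleave per-host logs by time, keeping entries in [start, end]."""
    streams = [
        [parse_line(host, line) for line in lines]
        for host, lines in logs_by_host.items()
    ]
    # heapq.merge relies on each host's log already being in time order.
    return [entry for entry in heapq.merge(*streams) if start <= entry[0] <= end]
```

Calling `merged_window({"node1": lines1, "node2": lines2}, start, end)` returns one time-ordered list, which makes it easier to see what each machine logged around the same moment.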

Web-based consoles. Must-haves for the modern distributed system, debug servlets can provide surprising insights. We’ll point out what information Hadoop exposes and present easily embeddable open source consoles.
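One example of what a debug servlet can offer: Hadoop daemons serve their JMX beans as JSON under the `/jmx` path of the daemon's web UI. The sketch below pulls one bean out of such a document. The base URL is whatever host and port the daemon's web UI listens on (an assumption left to the reader), and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def find_bean(jmx_json, bean_name):
    """Return the bean whose "name" matches, or None if it is absent."""
    doc = json.loads(jmx_json)
    for bean in doc.get("beans", []):
        if bean.get("name") == bean_name:
            return bean
    return None


def fetch_jmx(base_url, timeout=5):
    """Fetch the raw JSON text from <base_url>/jmx (base_url is an assumption)."""
    with urlopen(base_url.rstrip("/") + "/jmx", timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For instance, `find_bean(text, "java.lang:type=Memory")` returns the JVM memory bean, whose `HeapMemoryUsage` entry reports the used and maximum heap.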

Linux has many goodies in /proc, as well as several tools to look at performance (perf, top). We’ll cover old standbys like lsof and tcpdump, too.
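To give a flavor of the goodies in /proc, here is a minimal sketch that reads `/proc/<pid>/status`, a file of `Key:<tab>value` lines, and extracts the resident set size that the kernel reports as `VmRSS` in kB. The helper names are hypothetical, and `read_status` only works on Linux.

```python
def parse_proc_status(text):
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def resident_kb(fields):
    """Resident set size in kB, or None if VmRSS was not reported."""
    value = fields.get("VmRSS")
    return int(value.split()[0]) if value else None


def read_status(pid="self"):
    """Read and parse the status file of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_proc_status(f.read())
```

Running `resident_kb(read_status(pid))` against a daemon's process ID gives a quick memory reading without attaching any tool to the process.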

Tracing, when it’s available, can be an incredible tool. HBase recently introduced tracing support, and several other tracing libraries (e.g., Zipkin) have emerged.

Since Hadoop runs on the JVM, we’ll discuss unusual things you can do there, like enabling verbose GC after the fact, poor man’s profiling, and extracting JMX metrics from even the most recalcitrant of processes.
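Poor man's profiling usually means taking several thread dumps a short time apart and counting which frames keep showing up. The sketch below is one way to do that counting, not necessarily the approach shown in the talk. It assumes jstack-style text, where each thread starts with a quoted name and each frame line starts with `at`; the function names are hypothetical.

```python
from collections import Counter


def top_frames(dump):
    """Yield the innermost stack frame of each thread in one thread dump."""
    in_thread = False
    for raw in dump.splitlines():
        line = raw.strip()
        if line.startswith('"'):  # thread header, e.g. "main" #1 prio=5 ...
            in_thread = True
        elif in_thread and line.startswith("at "):
            yield line[3:]
            in_thread = False  # skip the deeper frames of this thread


def hot_spots(dumps, n=5):
    """Count innermost frames across several dumps; most frequent first."""
    counts = Counter()
    for dump in dumps:
        counts.update(top_frames(dump))
    return counts.most_common(n)
```

Frames that top the list across many samples are where threads spend most of their time, whether running or blocked, so they are a reasonable place to start looking.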

Finally, we’ll bring it all together by doing some simple stats to find and visualize outliers.
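The abstract does not say which statistics the talk uses. As one plausible sketch, the code below flags hosts whose metric sits far from the median, using the median absolute deviation (MAD) and the conventional modified z-score cutoff of 3.5. The function name and the sample metric in the usage note are hypothetical.

```python
from statistics import median


def find_outliers(values_by_host, threshold=3.5):
    """Return the hosts whose modified z-score exceeds the threshold."""
    values = list(values_by_host.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the hosts report the same value; the score is undefined.
        return {}
    return {
        host: v
        for host, v in values_by_host.items()
        if abs(0.6745 * (v - med) / mad) > threshold
    }
```

With per-host latencies of 10, 11, 9, 10 and 50 ms, only the 50 ms host is flagged. The median and MAD are used here instead of the mean and standard deviation because a single extreme host barely moves them.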

The talk will be illustrated by examples from open source systems (especially Hadoop).

Philip Zeyliger

Cloudera

At Cloudera, Philip Zeyliger started and continues to work on the Cloudera Manager product, and is a committer on the Apache Avro project. Philip came to Cloudera from Google, where he worked on Megastore, and, before that, he worked in finance, at D.E. Shaw. Philip holds a bachelor’s degree in mathematics from Harvard University.

Comments

Philip Zeyliger
02/28/2013 7:30am PST

I’ve posted the slides at http://omel.ette.org/blog/2013/02/27/strata-slides/

02/27/2013 10:24pm PST

I really enjoyed the talk. I would love to snag the slides if they’re available.
