Tricks for Distributed System Debugging and Diagnosis

Philip Zeyliger (Cloudera)
Hadoop in Practice Great America Ballroom K

Distributed systems make for tricky diagnosis problems. Which component is at fault? Is it the network, the machine, the process, or, even worse, some emergent complex behavior? We’ll discuss in depth the tools most commonly used to figure out these thorny issues:

Logs. The perennial go-to. We’ll point out unexpected ways to increase logging verbosity at runtime and how to use time-based search across many machines to look for inconsistencies.
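As a rough illustration of time-based search across machines, the sketch below interleaves log lines from several hosts by timestamp and keeps only a chosen window. It assumes each line starts with a log4j-style timestamp such as 2013-02-27 22:24:01,123; the host names and the helper names (parse_line, merged_window) are hypothetical, and lines without a timestamp prefix (stack trace continuations, for example) are simply dropped.

```python
import heapq
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # assumed log4j-style timestamp prefix
TS_LENGTH = 23  # len("2013-02-27 22:24:01,123")


def parse_line(host, line):
    """Return (timestamp, host, line), or None if there is no timestamp prefix."""
    try:
        ts = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None
    return ts, host, line.rstrip("\n")


def merged_window(logs_by_host, start, end):
    """Interleave lines from all hosts by timestamp, keeping only [start, end]."""
    streams = []
    for host, lines in logs_by_host.items():
        parsed = (parse_line(host, line) for line in lines)
        in_window = [p for p in parsed if p is not None and start <= p[0] <= end]
        # heapq.merge requires each input to be sorted already.
        streams.append(sorted(in_window))
    return list(heapq.merge(*streams))
```

Reading the merged output top to bottom shows what each machine logged around the same moment, which is where cross-machine inconsistencies tend to show up. Clock skew between hosts is ignored here.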

Web-based consoles. Must-haves for the modern distributed system, debug servlets can provide surprising insights. We’ll point out what information Hadoop exposes and present easily embeddable open source consoles.
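To make the idea of mining a debug servlet concrete, here is a minimal sketch that pulls one attribute out of a JSON document shaped like the output of Hadoop's /jmx servlet, assumed here to be an object with a "beans" list whose entries carry a "name" field. The function name bean_attribute is hypothetical, and fetching the document over HTTP is left out so the example stays self-contained.

```python
import json


def bean_attribute(jmx_json_text, bean_name, attribute):
    """Return one attribute of one named bean, or None if either is missing."""
    payload = json.loads(jmx_json_text)
    for bean in payload.get("beans", []):
        if bean.get("name") == bean_name:
            return bean.get(attribute)
    return None
```

For instance, calling bean_attribute(text, "java.lang:type=Memory", "HeapMemoryUsage") on such a document would return the heap usage structure if the bean is present.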

Linux has many goodies in /proc, as well as several tools to look at performance (perf, top). We’ll cover old standbys like lsof and tcpdump, too.
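As one small example of what /proc offers, the sketch below parses the "Key: value" text found in /proc/<pid>/status into a dictionary. The helper names (parse_proc_status, kb) are hypothetical, and the parser takes the file contents as a string so it can be tried on any machine.

```python
def parse_proc_status(text):
    """Turn 'Key:<tab>value' lines from /proc/<pid>/status into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def kb(fields, key):
    """Return a field such as VmRSS as an integer number of kB, or None."""
    value = fields.get(key)
    if value is None or not value.endswith("kB"):
        return None
    return int(value[:-2].strip())
```

On a Linux host, passing open("/proc/self/status").read() to parse_proc_status and then asking for kb(fields, "VmRSS") gives the resident set size of the current process.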

Tracing, when it’s available, can be an incredible tool. HBase recently introduced tracing support, and several other libraries (e.g., Zipkin) have emerged.

Since Hadoop runs on the JVM, we’ll discuss unusual things you can do there, like enabling verbose GC after the fact, poor man’s profiling, and extracting JMX metrics from even the most recalcitrant of processes.
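One common reading of poor man’s profiling is to take several thread dumps a short time apart and count which frames keep appearing on top of the stacks. The sketch below does that counting, assuming dump text that resembles jstack output: a quoted thread name opens each thread section, and frames appear on lines beginning with "at". The function names are hypothetical, and this is not necessarily the approach shown in the talk.

```python
from collections import Counter


def top_frames(dump_text):
    """Yield the innermost 'at ...' frame of each thread in one jstack-style dump."""
    expecting_frame = False
    for raw in dump_text.splitlines():
        line = raw.strip()
        if line.startswith('"'):
            # A quoted thread name starts a new thread section.
            expecting_frame = True
        elif expecting_frame and line.startswith("at "):
            yield line[3:]
            expecting_frame = False


def hot_spots(dumps, limit=5):
    """Count innermost frames across several dumps, most frequent first."""
    counts = Counter()
    for dump in dumps:
        counts.update(top_frames(dump))
    return counts.most_common(limit)
```

Frames that dominate the counts across many samples are likely where the process spends its time, whether running or blocked.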

Finally, we’ll bring it all together by doing some simple stats to find and visualize outliers.
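A minimal sketch of the kind of simple stats that can surface outliers: score each host’s metric by its distance from the median, scaled by the median absolute deviation (MAD). This particular method, the 3.5 threshold, and the host names in the usage note are assumptions chosen for illustration, not something the abstract specifies.

```python
import statistics


def outliers(values_by_host, threshold=3.5):
    """Return {host: score} for hosts whose robust z-score exceeds the threshold."""
    values = list(values_by_host.values())
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        # At least half the hosts report the same value; no meaningful scale.
        return {}
    flagged = {}
    for host, value in values_by_host.items():
        # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
        score = 0.6745 * (value - median) / mad
        if abs(score) > threshold:
            flagged[host] = score
    return flagged
```

Given per-host latencies such as {"dn1": 10, "dn2": 11, "dn3": 9, "dn4": 10, "dn5": 12, "dn6": 48}, only dn6 is flagged. The median and MAD are used instead of the mean and standard deviation because a single extreme host distorts them far less.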

The talk will be illustrated by examples from open source systems (especially Hadoop).

Philip Zeyliger

Cloudera

At Cloudera, Philip Zeyliger started and continues to work on the Cloudera Manager product, and is a committer on the Apache Avro project. Philip came to Cloudera from Google, where he worked on Megastore, and, before that, he worked in finance, at D.E. Shaw. Philip holds a bachelor’s degree in mathematics from Harvard University.

Comments

Philip Zeyliger
02/28/2013 7:30am PST

I’ve posted the slides at http://omel.ette.org/blog/2013/02/27/strata-slides/

Shantanu Bhattacharyya
02/27/2013 10:24pm PST

I really enjoyed the talk. I would love to snag the slides if they’re available.
