ZendCon Session Notes - Scalability and Performance Best Practices

Oct 31, 2006 19:49

Presented by George Schlossnagle, OmniTI Computer Consulting
Longer version of talk: http://omniti.com/~george/talks/

George started out by reminding everyone that scalability and performance are different, if linked, concepts. Scalability is the ability to gracefully handle additional traffic or load while maintaining service quality. Performance, on the other hand, is the ability to execute a single task quickly. One can be performant without being scalable (e.g., an application that performs a single task extremely quickly but falls over under load). The two are symbiotic parts of the software development process. Another way of looking at it: performance is the ability to meet the service commitments of today, while scalability is the ability to meet the performance commitments of tomorrow.

He also pointed out that with well-designed code, almost any optimization you make will make the code more brittle, complicated, and (possibly) less flexible. So when you're making an optimization, it's important to be aware of what you're doing and why you're doing it. (Which is sound advice for any sort of development, not just optimization-related.) One should design code to be refactored later, so that when changes are necessary, it's possible to make them.

George then described best practices for software development in general, and then as related to scalability and performance. These overlapped considerably with the earlier session on High Volume PHP & MySQL Scaling Techniques, and yesterday's tutorial on improving the performance of PHP applications.

His general key points were to:
* Profile code early and often: effective debugging profiling is about spotting deviations from the norm, and effective habitual profiling is about improving the norm.
* Keep a good working relationship between Developers and Operations, so that when there is a problem, things go smoothly.
* Always test against production data, so that there are no surprises when dev code goes live.
* Remember that assumptions will burn you: in the long run, any assumption you make will probably be wrong.
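On the profiling point: the talk was about PHP (where tools like Xdebug fill this role), but the habit itself is language-agnostic. Here is a minimal sketch of habitual profiling using Python's standard-library cProfile; the `slow_sum` function is my own illustrative example, not from the talk.

```python
import cProfile
import pstats

def slow_sum(n):
    """Deliberately naive loop, just to have something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Profile one run of the function under inspection.
profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Show the five most expensive calls by cumulative time; running this
# regularly is what makes deviations from the norm easy to spot.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```

The output is a table of call counts and timings; the value comes from comparing it against previous runs rather than reading any single report in isolation.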

His points on scalability (which he tried to condense into one-word bullet points) were:
* "Decouple": isolate performance faults; refactor only where and when necessary; split hardware only when needed, as splitting impairs your ability to efficiently join decoupled application sets.
* "Cache": the fundamental question here is, how dynamic does the data really have to be?
* "Federate" (aka "shards", as described in an earlier talk).
* "Replicate": but be aware that high write rates can cause replication lag.
* "Avoid straining hard-to-scale resources", such as third-party data sources or black-box data.
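To make the "federate" point concrete, here is a hypothetical sketch of sharding user data across several database hosts by hashing the user id. The host names and modulo scheme are my own illustration, not something prescribed in the talk.

```python
# Hypothetical shard list; in practice these would be real database hosts.
SHARDS = ["db0.example.com", "db1.example.com", "db2.example.com"]

def shard_for(user_id: int) -> str:
    """Pick a shard deterministically from the user id, so every part of
    the application routes a given user to the same database."""
    return SHARDS[user_id % len(SHARDS)]
```

The catch George's "decouple" caveat hints at: once data lives on separate shards, you can no longer join across users in a single SQL query, so the split should only happen when the load actually demands it.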

An important point is that cache longevity can be a key issue: sometimes, caching data for only a few seconds can yield a performance win; it all depends on the frequency of requests.
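A minimal sketch of that idea, assuming an in-process dictionary as the store (production systems of the era would use something like memcached instead): even a very short time-to-live absorbs bursts of identical requests.

```python
import time

class TtlCache:
    """Minimal time-to-live cache: entries expire after ttl_seconds,
    so even a few seconds of caching can collapse a burst of identical
    requests into one expensive computation."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]
        return None  # missing or expired

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)
```

With a 5-second TTL and 100 requests per second for the same page, roughly 499 of every 500 requests are served from the cache; with one request per minute, the same cache never hits and is pure overhead.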

The key points for performance were to:
* Use a compiler cache (which just about everyone is recommending now).
* Be mindful of external data sources.
* Avoid recursive or heavy looping code, which, apparently, is expensive in PHP and probably indicates you're doing something wrong.
* Don't try to outsmart PHP: working around perceived inefficiencies often produces code that turns out to be less optimal, for example, writing a parser in PHP when a simple regex could do.
* Build software with caching in mind, but watch cache hit rates: if a cache is never actually used, building it is a net loss.
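The "don't outsmart the language" point is easiest to see with the parser example. The input format below (`key=value` pairs) is my own illustration, shown in Python rather than PHP, but the same one-line regex approach applies with PHP's `preg_match_all`.

```python
import re

def parse_pairs(line: str) -> dict:
    """Extract key=value pairs from a line like "user=alice role=admin".
    One regex, run by the engine's optimized C code, replaces a
    hand-written character-by-character parser loop."""
    return dict(re.findall(r"(\w+)=(\w+)", line))
```

The hand-rolled alternative would walk the string character by character in the interpreted language itself, which is exactly the kind of heavy looping the previous bullet warns against.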

Overall, this was a good session, but (as seems to be a trend so far for the conference), it was cut short, and there was a fair bit of overlap with the other similar sessions.

zendconference2006
