Anyone with children will know all about regression: behaviour goes backwards and you wonder what on earth just happened. Adults are prone to it too. At work and under pressure, many of us revert to behaviours we probably wouldn’t choose in less pressured moments.
The statistical term regression is attributed to Francis Galton, a prolific 19th-century statistician (amongst many other things) and a half-cousin of Charles Darwin. In a statistical context, it refers to the relationship between the selected values of x and the observed values of y. Used correctly, that relationship can be incredibly powerful and makes it possible to predict future outcomes.
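To make that a little more concrete, here is a minimal sketch of the simplest case: fitting a straight line through some x and y values by ordinary least squares, then using the line to predict y for a new x. The numbers are invented purely for illustration (they are not real search or enrolment figures), and the names `fit_line` and `predict` are my own, not taken from any book or library.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a given x using the fitted line."""
    return intercept + slope * x


# Made-up illustrative data: x is the selected value, y the observed one.
xs = [100, 200, 300, 400, 500]
ys = [12, 19, 33, 41, 50]

slope, intercept = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.3f} * x")
print(f"predicted y at x = 600: {predict(600, slope, intercept):.1f}")
```

A fitted line like this is only as trustworthy as the data behind it, and predicting well outside the range of x values you actually observed is where it tends to go wrong.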
I’m particularly interested in regression at the moment because I want to further refine my thinking on scatterplots that appear to show a reliable correlation between course searches and enrolments. I’ve had very little formal training in statistics so when I spotted Stephen Few’s positive review of an O’Reilly publication called Head First Data Analysis, I thought I’d give it a go.
What a great book. OK, some of it is a little basic, but the 80-odd pages on statistical regression are just fantastic: they walked me effortlessly through the basic concepts and then on to extensions of them. The Head First series isn’t something I’d seen before, but I really like the approach. It keeps your brain busy and focussed on the content while you learn some fairly detailed concepts, which is something I always struggle with when reading drier, more academic publications. I managed to consume all 400-odd pages in the course of a return flight to Sydney, so I’d consider it light reading and highly effective.
If, like me, you now have a hunger for more detail, there is a reference to our old friend Edward Tufte right at the end of the book. In the 1970s Tufte published a book on regression called Data Analysis for Politics and Policy, which is jammed full of theory and relevant examples. Best of all, it can be downloaded for free here.


Also well worth a look:
Howard Wainer (2009): Picturing the Uncertain World
Wainer relates a lifetime’s experience of spotting when statistics and their graphical representations don’t say what they appear to. He also acknowledges the influence of Edward Tufte.
It’s especially valuable for those who feel that some charts they’ve seen don’t compare apples with apples, and who want to avoid putting oranges (or even the occasional lemon!) in their own charts. Which is another way of saying: understand your variables, or regression won’t work.
http://www.amazon.com/Picturing-Uncertain-World-Communicate-Uncertainty/dp/0691137595/ref=sr_1_1?ie=UTF8&s=books&qid=1253966731&sr=8-1