Programming and Attention to Detail: 6_bleen

6_bleen_7

Feb 12, 2011 14:48

We are reminded all too often that computers are much, much faster than the human brain at certain tasks. One thing that learning a computer language quickly teaches you, however, is that what the brain lacks in speed, it more than makes up for in flexibility. A simple pattern-recognition task you could do instantly in your head may require a bazillion lines of code to implement in an algorithm a computer can actually process.

Experience in programming isn’t required to realize this, but it sure helps. Case in point: I have a tyro’s grasp of several programming languages. I’m not a true hacker (my code is simplistic and not very efficient), but I can write a script to do any task I might need in my capacity as a statistician. I can also predict very roughly how complex a particular hacking job is likely to be.

The problem is that many of my colleagues, who have no programming experience whatsoever, love to propose grand schemes of analysis that sound conceptually easy but would take untold hours of skull-splitting concentration to automate. Invariably, these nebulous proposals begin with, “It would be really nice if we could….” And guess who gets to implement them (hint: not the person who voices the idea).

Just last week, in the project I spend the most time on these days, we discussed how to choose an optimal subset of genetic markers from a much larger set. We wanted to “cover” the human genome as well as possible by choosing markers that were as uncorrelated with each other as possible. (That way, we wouldn’t be wasting our effort collecting redundant information.) A number of programs exist to do that, but we had additional criteria in mind as well, so whatever we chose to do, I would have to throw together a Perl script to implement it.
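To make "as uncorrelated as possible" concrete, here is a minimal Python sketch of one common greedy approach: walk through the markers and keep each one only if its squared correlation with every marker already kept stays below a cutoff. It is an illustration, not the script from this project. The input format, the function names, and the 0.8 cutoff are all assumptions, and it ignores the additional criteria mentioned above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a marker with no variation is treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def prune_correlated(genotypes, max_r2=0.8):
    """Greedily keep markers whose r^2 with every kept marker is <= max_r2.

    genotypes: dict mapping marker name -> list of 0/1/2 allele counts,
    one entry per individual (a hypothetical input format).
    The result depends on the order in which markers are visited.
    """
    kept = []
    for name, values in genotypes.items():
        if all(pearson_r(values, genotypes[k]) ** 2 <= max_r2 for k in kept):
            kept.append(name)
    return kept


# Made-up toy data: m2 duplicates m1, m3 is only weakly related to m1.
toy = {
    "m1": [0, 1, 2, 1, 0],
    "m2": [0, 1, 2, 1, 0],
    "m3": [2, 0, 1, 0, 2],
}
print(prune_correlated(toy))
```

Even this toy version compares every candidate against every kept marker, which hints at why "just look at the correlations" is not a half-hour job on a genome-sized marker set.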

I’ve never seen so much literal hand-waving. My papers were rustling in the wind struck up by all the flapping limbs. Everyone was thinking about exploring the correlation structure among the markers, because that’s a pretty standard thing to do, but what everyone had in mind was a cute little graph in which sets of correlated markers show up as red triangles, with the intensity of the red corresponding to the strength of association. Such an analysis provides a general overview of the correlations, but it is useless for actually choosing a well-defined set of “tag” markers.

And it goes without saying that all the suggestions began with, “Wouldn’t it be nice if…[wave, wave, wave].”* Yeah, it would be nice, but since we need the results by tomorrow morning, and not next August, we’ll have to settle for something a little less ambitious.

The afternoon of this particular meeting, I was testier than usual, having had five hours’ sleep because I’d stayed up half the previous night coding the last urgent analysis project, and so I quickly ran out of patience for this sort of thing. I finally interrupted the game of people-airily-proposing-grand-schemes-that-they-don’t-have-to-implement-by-tomorrow-morning by reminding all and sundry that I was running on five hours’ sleep, and consequently that my productivity that evening would be limited; and since it was already four-thirty in the afternoon, if the boss (who was there) actually wanted the results on time, we’d better find a solution I could code in, say, half an hour.

As expected, the hand-waving resumed exactly as before, but I was ready. Each time someone made what they thought was a “straightforward” suggestion, I asked in my sweetest voice, “Could you please sketch me a quick algorithm to accomplish that?” Worked like a charm: in two minutes I’d silenced the whole crowd. The clinical boss, who doesn’t know as much about molecular genetics as the lab folks, made the only sensible suggestion of the afternoon: that I abandon the correlation structure and go solely by the physical distance between markers. (The physical distance is a decent proxy for correlation if the markers aren’t too close together.)

I finished off that project in fifteen minutes. Using Excel, of all things.
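For comparison, the distance-only idea really does fit in a dozen lines. The Python sketch below is a rough illustration rather than a record of what was done in Excel: it assumes markers on a single chromosome with known base-pair positions, and the function name, the marker names, and the 50,000 bp minimum gap are invented for the example.

```python
def select_by_spacing(positions, min_gap):
    """Greedily pick markers that are at least min_gap base pairs apart.

    positions: dict mapping marker name -> base-pair position on ONE
    chromosome (hypothetical input format).
    """
    chosen = []
    last_pos = None
    # Walk along the chromosome from the lowest position to the highest.
    for name, pos in sorted(positions.items(), key=lambda item: item[1]):
        if last_pos is None or pos - last_pos >= min_gap:
            chosen.append(name)
            last_pos = pos
    return chosen


# Made-up positions, purely for illustration.
markers = {"rsA": 1000, "rsB": 1500, "rsC": 60000, "rsD": 61000, "rsE": 130000}
print(select_by_spacing(markers, min_gap=50000))
```

Sort once, make one pass, and no genotype data is needed at all, which is the whole appeal.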

Rant over. This whole episode got me thinking about a connection between programming and teaching. Both require breaking down a problem or idea and thoroughly understanding each little piece. If you don’t have an intimate and detailed grasp of your subject, you’ll never be able to explain it to anyone else so they can understand it, just like you can’t bullshit a computer language into giving you what you want. Of course, some teachers at times don’t fully comprehend what they’re teaching, and everybody can tell. Easily.

The implication is that good programmers ought to make good teachers, but I have not observed that such is necessarily the case. I think I know why: there is a big difference between fully understanding a subject and being able to explain it to a neophyte such that comprehension is successfully conferred. The latter requires an additional, uncommon skill: the capacity to lay aside one’s expertise and to put oneself in the student’s place. When I teach I try to remind myself as often as possible: “What do I need to tell someone who has never heard of X to make them comfortable with learning X?” Naturally, every student’s background is different, so there is no single best answer.

Having too much knowledge can also be a handicap in teaching. (Undergraduates at large universities often complain about professors who are teaching at far too high a level.) There are things to be said in favor of the teacher being just barely ahead of the students: the conceptual gap is much more easily bridged that way. My experience in the lab meeting showed in a somewhat different way how expertise could be a liability: the person who knew the least about what was involved in the task gave the best solution, and one that had not occurred to me.

_______________________

*You could think of this as a variation on the WIBATT, an acronym for "Would it be any trouble to...." If it has to be asked, the answer is invariably yes.

genetics, rant