Do Programmers Practice Computer Science?
I’ve gotten some questions about the idea of what we do being scientific or not, and that raises some interesting discussion points. This is a “You Asked for It” post, and in this one, I’m just going to dive right into the reader question. Don’t bury the lead, as it were.
Having attended many workshops on Agile from prominent players in the field, as well as having worked on teams attempting to use Agile, I can’t help but think that there is nothing scientific about it at all. Most lectures and books pander to pop psychology, and the tenets of Agile themselves are not backed up by any serious studies, as far as I’m aware.
In my opinion, we have a Software Engineering Methodology Racket. It’s all anecdotes and common sense, with almost no attention paid to studying outcomes and collecting evidence.
So my question is: do you think there is a lack of evidence-based software engineering and academic rigor in the industry? Do too many of us simply jump on the latest fad without really questioning the claims made by its creators?
I love this. It’s refreshingly skeptical, and it captures a sentiment that I share and understand. Also notice that this isn’t a person saying, “I’ve heard about this stuff from a distance, and it’s wrong,” but rather, “I’ve lived this stuff, and I’m wondering if the emperor lacks clothes.” I try to avoid criticizing things unless I’ve lived them myself, and if my impulse is to criticize something, I tend to try it first.
I’ll offer two short answers to the questions first. Yes and yes. There is certainly a lack of evidence-based methodology around what we do, and I attribute this largely to the fact that it’s really, really hard to gather and interpret the evidence. And, in the absence of science, yes, we collectively tend to turn to tribal ritual. It’s a God of the Gaps scenario; we haven’t yet figured out why it rains, so we dance and assume that it pleases a rain God.
Evidence is Hard
When I say “evidence is hard,” please don’t hear my voice sounding satirically whiny in your mind. I’m not being snarky; I’m being dead serious. It’s really hard to gather and synthesize evidence of efficacy in the field of software engineering. I understand this, perhaps more keenly than some, because I took graduate-level CS and software engineering courses where I read and wrote white papers on relevant industry topics.
I can’t recall the exact details, but I recall reading a paper in a course called, “Advanced Topics in Software Engineering” that made a study of the relationship between the length of methods in a code base and the incidence of bugs. Doubtless, this was a well-conceived attempt to answer the question “how long should your methods be?” once and for all. I’d say this is deceptively simple, but I doubt anyone is deceived. You can, no doubt, imagine problems right off the bat, particularly in attempting this a decade ago.
We probably have to pick a tech stack immediately and hope it’s representative. After all, not all lines of code are created equal across languages and paradigms. We also have to go out and find a code base that’s in something resembling production because, absent that, “bug” is sort of meaningless. But, even assuming we find that, how do we know the rate of bug reporting is representative? I mean, what if the code is in production and it has no users? What if it has users who just don’t care enough to report bugs? And what exactly is a bug, anyway? You can see where this is going.
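To make this concrete, here’s a minimal sketch of just the “easy” half of such a study: measuring the method lengths themselves. To be clear, this is my own hypothetical illustration in Python, not anything from the paper, whose methodology I can’t recall. Even this trivial step forces judgment calls about what “length” means.

```python
# Hypothetical sketch (mine, not the paper's): measuring method lengths
# with Python's standard ast module. (end_lineno requires Python 3.8+)
import ast

def method_lengths(source: str) -> dict:
    """Map each function name to its length in physical source lines."""
    lengths = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Judgment call: count every line from "def" through the last
            # statement, docstrings and blank lines included.
            lengths[node.name] = node.end_lineno - node.lineno + 1
    return lengths

sample = '''
def short():
    return 1

def longer(x):
    """Docstrings count toward length here, another judgment call."""
    y = x + 1
    return y * 2
'''
print(method_lengths(sample))  # {'short': 2, 'longer': 4}
```

And that’s before we even touch the genuinely hard half, which is correlating those numbers with defects.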
But before I finish asking study-confounding questions, let’s get to perhaps the most fundamentally bothersome ones. What production code can we find that isn’t jealously guarded as intellectual property, and what organizations will let us measure their outcomes without minding that their dirty laundry gets aired? Do you think, especially 10 years ago, that Microsoft and Apple were in a race to let the world measure their OS method lengths against their defect counts?
And so when we do these types of studies, we conduct them using open source projects to represent the average software project, and we use people like grad students and undergrads to represent the average developer. Even if the methodology were flawless and the results compelling for any given study, we’ve successfully studied something that doesn’t begin to represent the industry.
I’m not saying that there haven’t been studies that peeked inside corporate walls — I can’t make that claim. I’m just looking to point out an inherent challenge that software construction has always faced and that never existed for, say, biology. Biology would be a lot harder to get right if nearly every organism on Earth were owned by some company or another.
Non-Parallels
I don’t want to belabor the point unnecessarily, so if you accept my premise that it’s really hard to run experiments correlating software construction techniques with qualitative outcomes, you’ll agree that we’re faced with a daunting, but theoretically possible task. We could, theoretically, pry people’s jealous fingers off of their source code (and one might argue that we’re moving slowly in this direction as an industry) and start to inject more transparency into approaches and outcomes.
But even with that, it’s important to note that there are fundamental differences between what experimental scientists do and what we, as commercial software developers, do. A physicist, for instance, observes phenomena in the world at large, hypothesizes as to cause, runs experiments to confirm or falsify, and molds those into theories. The scientist studies nature and makes predictions.
Let’s consider three actors in the realm of physics, as a science.
- A physicist, who runs electricity through things to see if they explode.
- An electrical engineer, who takes the knowledge of what explodes from the physicist and designs circuitry for houses.
- An electrician, who wires houses using the circuits designed by the electrical engineer.
I list these out to illustrate that there are layers of abstraction on top of actual science. Is an electrician a scientist, and does the electrician use science? Well, no, not really. His work isn’t advancing the cause of physics, even if he is indirectly using its principles.
Let’s do a quick exercise that might be a bit sobering when we think of “computer science.” We’ll consider another three actors.
- A discrete mathematician, looking to win herself a Fields Medal for a polynomial-time factoring algorithm.
- An R&D programmer, taking the best factoring algorithms and turning them into RSA libraries.
- A line-of-business programmer, securing the company’s SharePoint against script kiddies uploading porn.
Programming is knowledge work and non-repetitive, so the comparison is unfair in some ways. But, nevertheless, what we do is a lot more like what an electrician does than what a scientist does. We’re not getting paid to run experiments — we’re getting paid to build things.
Which means that, if we’re going to be applying science to the field of programming, we’re not the scientists. We’re the subjects.
Here’s a third set of actors.
- A neuroscientist studies cognition and sees how humans most effectively process sequential instructions.
- A language designer uses that knowledge to build a programming language that is maximally grokkable by programmers.
- Savvy programming teams use that programming language.
As I said, we, as programmers, build things. If science is going to be part of the discussion, it’s almost certainly going to be drawn from disciplines around cognition, psychology, and systems theory, and it’s going to be applied to things that we can use to be more effective.
Methodology Rackets
To return to the original thrust of the question, then, it’s apparent that there’s some sort of impedance mismatch. “How do we make humans most effective at automating and building things” is not a question that we solve by writing and tweaking how we write software, but rather by studying humans and systems. Writing software and then looking back at our experience writing that software doesn’t produce science — it produces anecdotes. And, to quote something I heard from a man named Mark Crislip, whose podcast I love (I don’t think it’s originally his), “the plural of anecdote is ‘anecdotes’ — not ‘data’.”
There are a lot of people out there who make a lot of money describing the best ways to write software. Is software best written in offices, in cubicles, or at flat tables in warehouse-like spaces? Should you write automated tests before or after you write production code? Heck, if you want to see some strangely angry people come out of the woodwork, go on Twitter and talk about whether you should estimate software projects. And, do you know what? It’s all largely based on personal experience, shared collectively and writ large.
Certainly, there are practices that have higher success rates than others, and even measurably so. Electricians can certainly tell you which wire caps have lasted the longest and which have resulted in repeat visits. But nevertheless, any process approach should come with the heavy, ever-present caveat of, “this is something I’ve tried and found success with, so it’s possible that you might also.” I’d be intensely leery of any stronger suggestions or promises of success when it comes to software process. One of the best things to take away from the good lessons I’ve seen offered by the agile movement is “inspect and adapt.” It’s not prescriptive, and no one’s getting certified, but it’s certainly honest and well intentioned. All you can really do for now is try stuff, see if it works, and adjust accordingly as you go.
Thanks for your thoughts, Erik. I agree. But I haven’t seen that many people asking if John Q. Software Developer is himself a computer scientist. Usually the question I encounter is “do you need to know science to be a developer?” You don’t need to know any science to write a program. That’s one of the awesome things about our high-level languages like Python or JavaScript; I can show a child how to mess with one, and they might go on to be a great developer. Or maybe they’ll just mess around and have a good time learning to…
I think you’re right that I broadened the scope of the discussion beyond what was asked (or at least implied as the focus). The sense I got from the question was that the person asking it has had experiences being told something like, “we’re going to do an open office seating plan and have daily standups because that’s agile and agile is right.” I broadened the topic because I don’t think that this is unique to flavors of agile, but rather a result of the nature of the software game in general. When it comes to thinking of software developers…
For more on the subject of evidence being used to drive software engineering methodologies, check out the work of Greg Wilson and Andreas Stefik. They were interviewed on the Ruby Rogues podcast on the topic of “What We Actually Know About Software Development and Why We Believe It’s True” (http://devchat.tv/ruby-rogues/184-rr-what-we-actually-know-about-software-development-and-why-we-believe-it-s-true-with-greg-wilson-and-andreas-stefik/) For example, on Erik’s point about programming language design, Stefik has been working on the Quorum language, a language specifically designed to be “used by humans” as it were. One case study I came across would fit under what Erik described as an anecdote, but still, it is nice to…
Oops, messed up the links.
http://devchat.tv/ruby-rogues/184-rr-what-we-actually-know-about-software-development-and-why-we-believe-it-s-true-with-greg-wilson-and-andreas-stefik/
http://www.upedu.org/papers/ICSE2015_OrganizationalFactors/LavalleeRobillard_ICSE2015_WhyGoodDevelopersWriteBadCode.pdf
Thanks for the links. I’ve got the Ruby Rogues podcast queued up. From a consumption perspective, studies, anecdotes, etc. — all interesting. My post here wasn’t to say it’s bad to share experiences and try to improve that way, but rather to draw a distinction between scientific inquiry and applied programming.
So Mark Crislip is an infectious disease doctor. How about this:
How much does it cost to cure Strep Throat? Well, take this penicillin for 10 days which costs $15 at the pharmacy, and you should start feeling better in the next 2-3 days.
How much does it cost to cure Ebola? Well… uhh, we’re going to try this, and this, and if that doesn’t work maybe this, and there’s a good chance nothing works, in which case please don’t blame me.
Yes, I am comparing many software projects to Ebola. 🙂
You had me at “comparing many software projects to Ebola.” 🙂
I certainly think there’s a lot of trial and error in software development, which makes sense, since it’s knowledge work. I suppose, after a fashion, that’s experimentation, but the goal isn’t repeatable results as much as it is getting a product to market. At least for most of us, anyway. For me, usually, when I’m building software.
Great topic, Erik. Well, I think that computer science is more about how to build a computer than about how to program one, or, in the latter case, limited to making the computer work, i.e., the OS. The problem arises when we try to write software for other people’s needs; then we move into the art of programming, or software craftsmanship, and of course we add all the ambiguity, complexity, and irrationality that come with human topics. And yes, I would answer yes and yes to the two questions you raised.
I’ve always thought of the hardware/chip design element as Computer Engineering versus discrete math-oriented Computer Science. But yeah, out in the field, it’s probably neither 🙂
If you have a Bachelor’s degree, it means you understand the basics of the Science and can apply that knowledge. Very few individuals at the Bachelor’s level in any Science can claim they’re practicing their Science. Hence this article is misleading.
I’m not sure I can agree that the degree obtained by a programmer has any bearing on whether that programmer is a practicing scientist. You could take MIT PhDs in Machine Learning and, if you made them write CRUD apps for a bank website, they still wouldn’t be practicing science. They’d be writing CRUD apps for a bank.
Great article with some keen observations! However, I suggest that the non-parallel of mathematician -> R&D programmer -> LOB SharePoint developer overreaches a tad. The point is well taken, in that what we’re doing is fundamentally different from wiring a house, yet still not “computer science.” Still, the work I do as a business programmer informs, and is interacted with by, a complex production environment (both from an infrastructure and a business-marketplace POV), and producing programs that will have the outcome (influence business profits/user behavior?) that management, analysts, and I intend is, I feel, its own branch of…
Here’s a take you might find interesting. Gene is a blogger I read, and he wrote the following as a take on this post: https://genehughson.wordpress.com/2015/10/26/first-do-no-harm-the-practice-of-software-development/
He draws interesting parallels between what we as programmers do and what doctors do.
I agree, most development isn’t computer science, and methodologies aren’t scientifically designed. One of the reasons why is completely unreliable input. I’m not using a pH meter or radio telescope to gather my data; I’m asking people what they need in a system. The chance that they’re lying, in hopes that it will increase their department’s status or make them indispensable, is a real possibility. Even more likely is that they don’t think of the task analytically, so “X never happens” really means “X happens fewer than 20 times a day”. We’re working with a Ouija board, not a multimeter. And…
I completely agree, and those last 2 sentences are awesome 🙂
The point of my post wasn’t to detract at all from what programmers do, but rather to express the opinion that any science around what we do wouldn’t be ‘clean,’ so to speak.
I guess sociology and anthropology are not sciences then. They deal with the softness of human response. There is some pretty good science involved in getting requirements right. Perhaps “only some software developers are scientists” is a better claim.
Programmers use science (or more properly the Scientific Method) every day. They make hypotheses (This program will work and produce the expected result), then perform experiments to support or refute the hypothesis. When the program fails, they make hypotheses about the cause of the error and go looking for it. Do programmers also work within established models? Sure, so do physicists. Newtonian motion is an established model. So is relativity. At different energy levels, physicists try to fit their thinking into one of these established models. That doesn’t make them tradesmen. Do programmers, and their managers, succumb to magical thinking…
I think that the framework in your first paragraph casts a pretty wide net. For instance, if my sink won’t drain, I form a hypothesis (“I think the drain might be clogged just below the sink.”) I then run an experiment (“I open the drain and look”) to support or refute my hypothesis. Depending on whether I’m right or not, I form another hypothesis (“maybe something is stuck on the other side of the P trap.”) I think the key differentiator, with both my sink and any program that I write, lies in the fact that I’m not analyzing results,…
Of course it is a matter of degree. A program is not like a sink. There are hundreds of free variables in the experiment. The programmer must analyze the evidence, which is often very indirect, and make a very detailed hypothesis about where to look for the fault. If da sink don’t drain, it doesn’t take a physicist to guess there’s a clog. I might continue the metaphor to say that sale is publication and customers are reviewers. There is more obvious publication and peer review in the open source world, though it doesn’t *look* like traditional academic science. Instead…
I’m inclined to agree with you that publication and review don’t make one a scientist, in the same way that I don’t think forming hypotheses and running experiments makes one a scientist. Someone who just publishes is an author, as you point out. Someone who hypothesizes and runs experiments is just a problem solver, even if he’s solving complex problems. My contention isn’t that programming isn’t hard or that it doesn’t require an analytical, intelligent mind. A post I linked to in another comment drew an interesting parallel to primary care from a doctor — it’s rigorous intellectual work, and…
Earlier you said that publication and peer review marked a scientist. You’ve walked that one back. So, what makes a scientist that isn’t using the Scientific Method? I agree that programming has important parallels with medicine. But I think doctors do science. I think medicine is most ordinary people’s point of contact with science. I also think that people who want to make categorical statements that field-of-study-X is or is not a science may have an axe to grind. What is gained by sorting people who do the same kind of mental activity into “is” and “not” categories, if it’s…
My position on publication and peer review wasn’t that those “mark a scientist.” You said that programming was science (or the Scientific Method) because programmers form hypotheses and run experiments. I said that this is also true of someone unclogging a sink, so it seems to me that “hypotheses and experiments” was too loose a set of criteria. So I proposed adding “publish results and peer review” to “hypotheses and experiments” to eliminate drain unclogging from the realm of science (but also most day-to-day programming). It was “and also” — not “instead of.” I’m not sure if you’re…
No, not meaning you personally when I said “an axe to grind.” I don’t know you. How could I form such an opinion? ACM Communications, a top journal of academic computer science, devotes endless pages to tiresome, hand-wringing discussions of whether or not computer science is a science, which I expect is related to the perceived prestige of academic CS folks. It isn’t clear why one would hold the debate if there were not some value hanging upon the outcome. It is my humble and very individual opinion that you are a scientist if you do science, that is, if…
Ah, okay. That’s definitely a different definition than my operating one (which admittedly I haven’t specified in detail). I think of science as the following of the scientific method for the purpose of advancing knowledge and generating falsifiable explanations of observable phenomena. So, for me, the activity of programming, even with published results and peer review (call this code commits and reviews, I suppose) wouldn’t fit the bill because it’s not done with the purpose of advancing knowledge or explaining anything. I suppose one could make some interesting arguments that I’m generating explanations, but that’s not the goal of the…
Ah, okay. Wish I had been better able to elicit your thinking on what a scientist is earlier. That’s quite interesting, and I don’t disagree with this definition. Let me just say that the ultimate purpose of programming is the discovery of a procedure to obtain some result, and the distillation or crystallization (pick your science metaphor) of that knowledge into tangible form as a program to perform the procedure. When a programming project begins, the knowledge doesn’t exist, or monkeys could write the program. Programmers discover or create it. You may say that the procedure for a login page…
In retrospect, it probably would have been helpful to state my definition somewhere in the post 🙂
And I get what you’re saying here too. I tend to think of programming as a highly creative pursuit, so when I deal with enterprise clients, their tendency to want to commoditize it and turn it into predictable, fungible chunks of work depresses me.
I would like to observe that there’s a huge difference between herding cats (programmers) and herding electrons (physicists). If you look at the CMM, its five stages break down as:
1. Control the inputs and outputs from your ‘black box’ (the software team).
2. Allow each developer to define their contract with the rest of the organization.
3. Stimulate developers to regularize their contracts.
4. Develop systems for monitoring the behavior of the regularized ‘white box’ (which pragmatically is impossible prior to level 3).
5. Develop methods for intervention (propagation of success and mitigation of disasters).
It is a management…
I’ve never seen CMM articulated that way. That’s interesting.
I’ve also seen a lot of different arguments about the difference between a programmer and a “proper software engineer,” so to speak. I’m not sure I’ve ever formed much of an opinion on that subject, though.
As you practice, you build your skills. Sometimes it’s enough to go to the mountains for a hike, and sometimes a game of basketball will do it.
Great post, Erik. I just came across it after writing on my own blog on basically the same topic: http://blog.rossjohnson.org/skepticism-meets-software-development/ but of course not as eloquently or thoughtfully as you. Your post does a great job of framing in my mind the difference between what we do as developers actually *being* science and what we do being *informed by* science. I definitely think computer science is not science and we generally don’t do it. But that doesn’t mean it’s not valuable or that we shouldn’t look to science to help us understand how to do our…
Thanks for the kind words, and for what it’s worth, I like your post 🙂 I just recently watched that same talk, actually. Here’s another thing you may find interesting: https://leanpub.com/leprechauns
I like your distinction between doing science and being informed by it. I think that’s a nice way to put it. I believe that the scientific method informs a lot of what I do in life, even though I’m rarely practicing science, per se.
A very wise but slightly naive possum named Pogo once observed that, “Man does not see the handwriting on the wall until his back is up against it.” I think that this is apropos in the current context. Companies that would never hire a physicist to design a car or an elevator or an airliner are quite happy to hire untrained software developers and graduates of Computer SCIENCE to produce potentially company-killing applications. The people in the Science faculty are not tasked with producing Engineers. And the older the person is, the more likely he or she is to…
The software industry, in particular, certainly does get attached to its favored processes, data notwithstanding. I’d be interested if you had any old whitepaper/writeup on what you mentioned about the cost of inspection versus unit testing, though I have to imagine the odds that you’d be allowed to share such a thing would be remote.
Love the quote, by the way. Was never much of a comic reader, myself, but there’s certainly a Yogi-Berra-like wisdom in that statement.