May 30, 2007

Pimp My Code, Part 14: Be Inflexible!

I ask you, grasshopper, which is better: flexible code or tiny code?

"Ah," you exclaim, "Learned master, it is a trick question: code which is tiny yet flexible is best!"

WRONG! Tiny code is always best. Now you must carry water up the hill for the rest of the day.

--

What can we learn from this simple tale? Well, one thing is, I'm not very good at writing stories. But is there something deeper?

When I first started coding, way, long ago (on a PDP-11, which was essentially when I'd get eleven Pterodactyl Dinosaurs to sit down and do some Processing for me), I thought code should be flexible at all costs. If I were creating, say, a program to write a one-line message to the screens of other people logged in to the same machine, I'd write it, say, so you could plug in other kinds of screen-manipulation packages besides curses(3). Even though, you know, none would be invented until "time sharing" was something you did with condos, not processors.

The fundamental nature of coding is that our task, as programmers, is to recognize that every decision we make is a trade-off. To be a master programmer is to understand the nature of these trade-offs, and be conscious of them in everything we write.

You've probably seen some variant of this, but I'll show you my version. In coding, you have many dimensions in which you can rate code:

- Brevity of code
- Featurefulness
- Speed of execution
- Time spent coding
- Robustness
- Flexibility

Now, remember, these dimensions are all in opposition to one another. You can spend three days writing a routine which is really beautiful AND fast, so you've gotten two of your dimensions up, but you've spent THREE DAYS, so the "time spent coding" dimension is WAY down.

So, when is this worth it? How do we make these decisions?

The answer turns out to be very sane, very simple, and also the one nobody, ever, listens to:

"START WITH BREVITY. Increase the other dimensions AS REQUIRED BY TESTING."

--

In Delicious Library 2 we have a feature where we will automatically find the libraries of your friends if they have published them. Of course, with any matching system, the trick is, how do you know this is really *my* friend named "Mike Lee", and not one of the other 17 million "Mike Lee"s around the world (that whore).

So, I came up with a basic algorithm for pulling all Delicious Libraries by the owner's name first, then disambiguating within those afterwards. One of my programmers said, "Well, what happens if we get 1,000 hits for John Smith?"

And my reply, not at all tongue-in-cheek, was, "Well, then we would be very, very rich." Seriously, if 1,000 John Smiths had registered our program, think of how many customers we'd have total: millions. Multiply that by $40, and one possible response to the problem of "too many John Smiths" would be: "Who cares, let's all move to Tahiti and spend the rest of our lives on the beach sipping rum."

I kid! Mostly. My point is, we'll have PLENTY of warning and PLENTY of resources when 1,000 John Smiths start to plague us. Note that both of these are necessary -- if I were, say, deploying just some website, and I didn't make money based on the number of people using the site, I'd be a lot more worried about it blowing up before I was ready -- there would be no guarantee that I'd have the time or resources needed to handle the problem.

In this particular case, there is a slightly slower (execution time) way to do the search that would eliminate the 1,000 John Smith problem, which I will do the day it starts to become a problem. Then I'll push a free update, and my customers will never know that it could have been an issue.

But note that, even though I *know* how to solve this problem (and increase the flexibility dimension), I'm not going to solve it now. Why? Because (a) this would kill my code brevity, (b) it would make the program run slower for everyone in the meantime, (c) it would introduce more instability, and (d) it would take a bunch of time to program and debug, so I couldn't do other, cool features.

This is really key: there's a solution out there that I know is more flexible -- many people would instantly consider this the "best" solution, and consider everything else a hack. My point, which I'll say again and again, is that there are MANY dimensions with which to evaluate any solution to a problem, and flexibility is NOT paramount.

--

When most people learn objective languages, the first thing they do is go ape. I mean, they create superclasses that have one method, which is stubbed out, and twenty children classes, each of which varies by one line of code. They fall so in love with objects that they think everything needs to be its OWN TYPE of object.

Often this is done in the name of flexibility. "Look, I have this abstract superclass which currently does the drawing for all my buttons, but you could subclass it to, say, draw 3D text!"

There is a related ailment, which is the "complete class" syndrome. Many programmers, when they create a new class, add a ton of features to that class to make it "complete" -- that is, they try to anticipate everyone who may ever use this class, and they add methods that those hypothetical users may want.

Let's say, for instance, Apple didn't have an NSArray class. So, you write your own. Great! I support you. Now, in your program you need to add objects to the end of the array and remove them from the end of the array. Ok, write those methods. But, wait, you say. Maybe I should add some more methods? Get an object from any index? Insert at the beginning? Why don't I make this more flexible, you say? NO! NO NO NO!
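To make that concrete, here's a minimal sketch in C++ of what "write only those methods" could look like. The class name, the method names, and the choice of int elements are all made up for illustration, and the std::vector inside is just a stand-in for whatever storage you'd really manage yourself:

```cpp
#include <cassert>
#include <vector>

// Hypothetical application-private array. It offers exactly the two
// operations the app calls today and nothing else.
class TinyArray {
public:
    // Appends a value to the end of the array.
    void addLast(int value) { items_.push_back(value); }

    // Removes and returns the last value. Calling this on an empty
    // array is a programming error, so it asserts instead of guessing.
    int removeLast() {
        assert(!items_.empty());
        int value = items_.back();
        items_.pop_back();
        return value;
    }

private:
    std::vector<int> items_;  // stand-in storage, an assumption of this sketch
};
```

Note what isn't there: no indexed access, no insert-at-front, no iterators. If the app ever calls for one of those, that's the day it gets written.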

Now, you may be saying to yourself, "What's wrong with flexibility?" Strangely, I was about to tell you. The problem is YOU ARE NOT A LIBRARY PROGRAMMER. YOU WRITE APPLICATIONS. (Note to Ali Ozer: IGNORE THIS SECTION.)

If you find yourself writing a class for your "library," then:

(a) You're not writing your application, which is where you make your money,
(b) You're writing something that you're hoping Apple will someday replace, which is a sucker's game,
(c) You're writing code you are going to have to test SEPARATELY from your app, because BY DEFINITION you've added functionality you didn't need,
(d) You're never going to really know which methods in your library work and which ones don't (eg, which ones are used in shipping programs) because you don't have the user base that a company like Apple does (and witness how buggy even their under-used frameworks are),
(e) You're writing code that is going to need documenting (or some other way to comprehend it), so you're requiring yourself and everyone at your company to understand not JUST all of Apple's APIs (which are, at least, SOMETIMES documented) but also yours, and, possibly worst of all,
(f) You are attempting to predict how your application's needs will change in the future, and spending time NOW on your guess, instead of shipping the damn application, getting feedback, and THEN making changes.

Let's look more closely at (f). It's the same old thing again, isn't it? "Don't optimize your code until after you time it" becomes "Don't make your code more flexible until after you have a plan for what your app needs."

--

Here are some concrete rules I enforce at Delicious Monster, now:

- We don't add code to a class unless we actually are calling that code.

- We don't make a superclass of class 'a' until AFTER we write another class 'b' that shares code with 'a' AND WORKS. Eg, first you copy your code over, and get it working, THEN you look at what's common between 'a' and 'b', and THEN you can make an abstract superclass 'c' for both of them.

- We don't make a class flexible enough to be used multiple places in the program until AFTER we have another place we need to use it.

- We don't move a class into our company-wide "Shared" repository unless it's actually used by two programs.
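As a rough illustration of the superclass rule, here's a sketch in C++ of where you might end up at the last step. Every name in it is hypothetical. The assumption is that PlainTextReport (class 'a') and HtmlReport (class 'b') each started life with their own copy of the row loop, both worked, and only then did the identical part get hoisted into Report (class 'c'):

```cpp
#include <string>
#include <vector>

// Class 'c': extracted only AFTER both subclasses existed and worked
// with their own copied loop. All names here are hypothetical.
class Report {
public:
    virtual ~Report() = default;

    // The code that turned out to be identical in 'a' and 'b'.
    std::string render(const std::vector<std::string>& rows) const {
        std::string out = header();
        for (const std::string& row : rows) {
            out += formatRow(row);
        }
        return out + footer();
    }

protected:
    // The parts that actually differed between 'a' and 'b'.
    virtual std::string header() const = 0;
    virtual std::string formatRow(const std::string& row) const = 0;
    virtual std::string footer() const = 0;
};

// Class 'a': the original.
class PlainTextReport : public Report {
protected:
    std::string header() const override { return ""; }
    std::string formatRow(const std::string& row) const override { return row + "\n"; }
    std::string footer() const override { return ""; }
};

// Class 'b': written second, by copying 'a' and changing what differed.
class HtmlReport : public Report {
protected:
    std::string header() const override { return "<ul>"; }
    std::string formatRow(const std::string& row) const override { return "<li>" + row + "</li>"; }
    std::string footer() const override { return "</ul>"; }
};
```

The three virtual methods weren't guessed at up front; they fall out of comparing two working classes and seeing which lines differ.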

--

So, next time your boss tells you to "be more flexible," tell him Wil Shipley says you shouldn't. He'll probably give you a raise!


35 Comments:

Anonymous Isaiah said...

Actually, your boss won't give you a raise. He'll stay up late and rewrite your code to be more "flexible." In the process he'll break it beyond all hope of repair. And blame the schedule slip on your sorry ass.
When you refuse to fix it and abandon his shitty group you'll be put on the slave-driver's project -- which is great because it will be a successful project and life will seem good for a while. But you'll be so stressed out all the time that you won't sleep anymore.
In your insomniac stupor you'll start writing your own code in your own spare time. Pretty soon you'll release something that people actually dig(g) and the internet will back a "tube" up to your house and dump a big pile of money on your porch.
Then you'll be buying Lotuses and pimpin' code and all things that are good.
So listen to Wil, you might not get a raise, but maybe that's OK.

May 30, 2007 4:45 PM

 
Anonymous Anonymous said...

Pterodactyls are pterosaurs, NOT dinosaurs. Honestly, the quality of paleontology on code-related blogs has really gone downhill lately...

May 30, 2007 10:14 PM

 
Blogger Khooee said...

That is why you do things like TDD - in a sense, you model the use cases, which serve as tests. You can in turn derive the interface contracts from them and then write code that you use - which you can then measure code coverage (for both unit & integration testing).

May 30, 2007 10:34 PM

 
Blogger Wil Shipley said...

OMG none of those words mean anything to me.

I just write software.

May 31, 2007 12:35 AM

 
Blogger John Gustafsson said...

Wil, dude, you gotta write all of your coding wisdom down in a book and publish it. I constantly preach similar types of wisdom (although they are sometimes slightly less wisdommy, and more often in Swedish) and I would love to be able to push a book on people instead of having to repeat myself (I do tell them to read your blog though:)).

Oh, and btw, I do write libraries and not applications, but I have found that there is a magical balance you have to obtain to make a truly good library. Just throwing in functionality left and right is really not the best of choices, and most of what you say still holds true. Write what you need, first, is always a very very good advice. That way you have time to spare to add whatever else is necessary, or optimize it for whatever purpose you desire.

As in your example with the array: don't give the user *all* the permutations of methods you can think of; implement the most useful ones and then give the developer the tools to extend the class (inheritance, categories, etc). That way you keep things brief (not boxers!) and with a carefully chosen amount of flexibility.

May 31, 2007 4:15 AM

 
Blogger dave said...

Don't worry about it Wil... no one else really understands it either. But they'll pay a lot of $ to consultants so the consultants can tell them all the things they are doing wrong.

TDD is great... if you have the time and a well understood problem. Or at least one that has been spec'd out completely... Oops... there's that time axis again.

Seriously... TDD taken to its logical extreme results in very robust apps that take just shy of forever to write and are close to impossible to update once you realize what the program really needs to do (got to rewrite all those unit and integration test cases for each little change).

A sane approach does automated testing where it's easy to do and worthwhile. And leaves the rest to plain old monkey clicking the mouse. 100% code coverage is for suckers and consultants.

On flexibility... if you have not realized by your 2nd or 3rd project that writing code that is never used is a waste of time and a huge source of future bugs (when someone tries to use it) then I'm sorry to say you're kind of a sucky programmer.

May 31, 2007 6:29 AM

 
Anonymous Jonathan Grynspan said...

Gah! Buzzwords!

May 31, 2007 7:04 AM

 
Anonymous Anonymous said...

Yeah, as one of my friends says: "Infinitely flexible programs can't stand on their own."

May 31, 2007 7:57 AM

 
Anonymous Anonymous said...

Right on, Wil

May 31, 2007 9:15 AM

 
Anonymous Peter Maurer said...

Yay! I've always felt dirty for not preparing my code for each and every possible future development. Now I feel somewhat cleaner. And I'll be even more resistant to what I like to call hyper-abstraction.

Thanks Wil! :-D

May 31, 2007 9:50 AM

 
Anonymous Zach said...

I've been doing more in php than I care to. Much of it involves access to a database. I tried for a while to use DBI, or one of the other "database abstraction" solutions out there.

I ended up writing my own. It has one function, dbQuery(sql). I have a single file (sql.php) that I require_once whenever I need database access.

I always meant to go back and add the other functionality that db abstraction tools have, but never ran into the case where I actually needed that functionality.

Time to write: 10 minutes
Time making revisions since: < 10 minutes

And I've been using that small library for better than 3 years now.

Thank you for providing a well-respected opinion I can point to when someone tells me I need to flesh that script out more. :)

It also provides a good example showing that sometimes being small means reinventing the wheel, especially when the current wheel has spinners and LED stemcaps that would really look gaudy on your vehicle.

May 31, 2007 3:18 PM

 
Anonymous Anonymous said...

Yes, test driven design taken to its extreme will require a lot of effort, but as in many other regards, the 80-20 rule does the trick. The dosage makes the poison. Increase coverage in critical areas to give you confidence but don't go overboard aiming for 100%.

Test driven design in my experience guides you where Wil urges you to go: By writing the test first, you express what's needed (interface contract) and code what's required. It's a more formal approach to "Be inflexible", in the sense that you write code that serves as a test *and* makes sure you are as flexible as you need be.

Coverage will tell you where you've gone wild and written dead code.

June 01, 2007 1:07 AM

 
Anonymous Anonymous said...

"My point is, we'll have PLENTY of warning and PLENTY of resources when 1,000 John Smiths start to plague us. "

10 bucks says you forget.

June 01, 2007 3:16 AM

 
Blogger Wil Shipley said...

Well, I may forget... I guess I wasn't clear. I don't expect to REMEMBER the problem. I expect that as soon as we get close to having the problem, Mike or Lucas or Terry or whoever is doing support then will say, "Hey, some customers are complaining that there are too many entries in the friend matching popup," and I'll fix it and put out a point release for everyone.

Unless it doesn't become a problem before the 3.0 release, which I don't think it will. Let me stress that, for there to be, say, 1,000 Mike Lees means that we'd have made like $300 million. I don't actually think we'll do this with version 2.0.

-W

June 01, 2007 3:28 AM

 
Anonymous Chad said...

As one of my professors summed it up: "Get it working first, then optimize it."

I believe that is the direction Apple took with Mac OS X. With all of the delays they had on getting it out the door initially, they probably got to the point where they just wanted to get something out the door and would spend the next six months improving the original design.

I'm just hoping that's what Microsoft is doing with Vista. After over five years of development, they probably just decided to get it out. My hope is that Vista SP1 will show some speed gains so it won't take a $2000 workstation to run it even half-way decently.

June 01, 2007 1:57 PM

 
Anonymous Anonymous said...

Yeah, I like the idea of TDD but I'm just not smart enough to fully hash out the ideas before coding (especially things like method names or variable names). The testing and coding tend to happen in tandem. As I code I crack open the console and test (uh, rails development); then I copypaste that into my test files. Works for me. Same for bugs... track em down in the console, then write the tests to make sure it doesn't break again. Then I'm also not writing a bunch of tests that I don't need either. Tests are code too, and the more test code there is the more chance you have to break that too.

June 01, 2007 3:07 PM

 
Anonymous snoyes said...

From Dave's comment:
"100% code coverage is for suckers and consultants."

If only that was true. Depending on the environment/industry (I am not talking desktop applications here), some of us get to do 100% code and 100% multi-conditional decisional coverage.

I miss desktop application programming...

Wil: good article. You have no idea the number of times I have had to fight people on "but I want to make the code generic and re-useable. (doubling the size of code)" And then trying to convince them it will never be re-used.

June 01, 2007 5:04 PM

 
Anonymous Rouslan Grabar said...

>>- We don't make a superclass of class 'a' until AFTER we write another class 'b' that shares code with 'a' AND WORKS. Eg, first you copy your code over, and get it working, THEN you look at what's common between 'a' and 'b', and THEN you can make an abstract superclass 'c' for both of them.

unless classes 'a' and 'b' are 100% semantically related, it's better to have class 'b' to use class 'a' as a private member.

June 02, 2007 12:29 PM

 
Blogger dave said...

Snoyes sez: ----------------------------
From Dave's comment:
"100% code coverage is for suckers and consultants."

If only that was true. Depending on the environment/industry (I am not talking desktop applications here), some of us get to do 100% code and 100% multi-conditional decisional coverage.

I miss desktop application programming...

Wil: good article. You have no idea the number of times I have had to fight people on "but I want to make the code generic and re-useable. (doubling the size of code)" And then trying to convince them it will never be re-used.

----------------------------------------

But it IS true for 90%+ of the software out there. If you're building life critical apps then I too really hope you have tests covering everything possible. (I also hope you're using a realtime os and some language other than a C variant!)

The point is that to do this will take a LONG time and LOTS of money compared to most desktop/web type apps. And the comment from khooee was basically 'hey use TDD and fix all your problems'. Which is naive at best.

Your second point is exactly mine (and Wil's)... so we're all happy here...

June 04, 2007 12:56 PM

 
Anonymous Pierre Lebeaupin said...

Wil: Hmm… in general I agree, but surely there is some amount of flexibility that should be put in right from the start, right? Methods that are not actually used are of course wrong, but for instance it's probably a bad idea to hardcode some stuff (even with a symbolic constant), which would be better as parameters of the object (and hence stored as a member var, with accessor methods and perhaps their own initWithStuff:); since the code that uses the object sets up the parameter, there is no unused code.

For instance, I recently wrote some sort of controller for a checkbox that the checkbox can bind to, and which in turn binds to some model value. When the checkbox is set, the controller sets the model to a certain "on" value; when the checkbox is unset, the controller sets the model to another "off" value. The other way round, if the model value is modified by something else, then the controller sets the checkbox state to off if the model value is equal to the "off" value, to on if equal to the "on" value, and to mixed if neither of the above. Surely it is wrong to hardcode the "on" and "off" values in the controller code to the ones I use in my only current use of this controller, right?

By the way, I can't help but notice that the latest "Pimp My Code" sessions are not in the relevant section of the sidebar. Also, the "humor?" label doesn't quite work as it does not have the proper URL escape. And there's code of mine that still awaits pimping…

Dave: Depends. There's a whole bunch of stuff that's not life-critical (in the field of embedded software particularly) and that still gets 100% code coverage: there is actually a big grey area between desktop apps and nuclear control systems written in Ada. For instance, the Linux kernel is often used as the underlying OS in many so-called "soft real-time" applications, and some actual RT OSes like VxWorks have in fact support for C++. There are many instances, some of which are not too far off desktop application programming, where 100% code coverage makes sense.

lebpierreÀwanadooPOINTfr

June 05, 2007 11:11 AM

 
Blogger Andrew said...

Thank you wil for another great article and for always expressing your opinions so forthrightly.

June 05, 2007 2:48 PM

 
Anonymous drew said...

Zach:

Please tell me that your dbQuery function supports bind variables. Or at least uses some sort of intelligent input validation, and not PHP's attempt at gpc_add_slashes, or whatever that brain damage is called. Libraries like DB exist for many reasons beyond stroking the maintainer's ego, and one of those reasons is so that developers who lack foresight or knowledge in a particular area can benefit from those who do not.

Brevity is a very good thing, and should be encouraged whenever possible, but not at the expense of basic security tenets.

June 07, 2007 3:50 PM

 
Blogger Wil Shipley said...

Seriously, Zach, Drew is depending on you. Don't let him down. With that binding thing. It's important.

June 07, 2007 3:56 PM

 
Anonymous Håvard Pedersen said...

I took the consequence of this and made the world's tiniest PHP framework. :) http://www.pmeda.no/off

June 08, 2007 12:00 AM

 
Anonymous Basil Vandegriend said...

I think you're bang on about avoiding flexibility until you need it - particularly when it comes to design decisions like adding a superclass. I've found that trying to add flexibility can actually get someone stuck. I call it the reuse trap and wrote an article about it: The Reuse Trap in Software Design

June 09, 2007 1:49 PM

 
Blogger Puiz said...

I saw there were exactly 24 comments to at least two posts, so I just had to add this one.

Do you tend to agree with what's written here? I'd really love to know. Thanks.

June 11, 2007 8:48 AM

 
Anonymous DavidR said...

When I started programming, back in the days when the PDP-11 was still just a glint in Ken's eye, brevity was certainly top of the list.

In fact, the aptitude test which I had to pass when I applied for my first trainee programmer job consisted of solving problems in the minimum number of steps.

Even during the '80s, I remember reading advice for those programming home computers, recommending re-using single character variable names and removing all unnecessary spaces, line-breaks and comments.

Now we have processors 1000 times faster, and main memory 32768 times the size, and this has removed the need for exaggerated brevity.

I believe the top priority when writing code should be Clarity.

Have you never needed to understand someone else's code? Or your own, six months after you wrote it?

Never mind brevity: spread your code on to several lines, if that makes it easier to read. Use meaningful identifiers, if they make the purpose of your variables and procedures clearer. Add comments, if the operations are not self-explanatory.

You're only going to write the code once, but you (or your successor) may need to read it many times, so it's worth the extra effort.

June 12, 2007 8:52 AM

 
Blogger Wil Shipley said...

DavidR: I'm guessing you didn't start at the beginning of the series, where I make the case for brevity being the new clarity.

I'm not talking about single-letter variables, or about squishing as many operations into a line as possible, or about eliminating comments. I have, in fact, banned all one-letter variables from my company's code.

I'm saying, if you can do the same thing in two lines of code or in one, the one line is ALMOST ALWAYS clearer.

June 12, 2007 12:03 PM

 
Blogger corbin said...

The "(Note to Ali Ozer: IGNORE THIS SECTION.)" gave me a good laugh!

June 25, 2007 4:52 PM

 
Anonymous Dave Schudel said...

I miss the days when we used to be able to write apps whose sizes were measured in K. All these object-oriented languages seem to compile thousands of lines of code-that-should-work.

I agree with Wil - smaller is better.

July 11, 2007 4:49 AM

 
Anonymous Anonymous said...

I have, in fact, banned all one-letter variables from my company's code

Really? What, even i or n for loop counts? That's just verbose.

July 11, 2007 8:07 PM

 
Blogger Wil Shipley said...

No, no i or n index variables. There is autocompletion in Xcode that works really well, so there is no excuse for not using descriptive names.

Some would argue that 'i' is a standard, but it is a stupid standard. If you are looping over a set of rows, and inside that loop you loop over a set of columns, which is instantly clearer: using 'i' and 'j' or using 'rowIndex' and 'columnIndex'?

If you said "i and j" to be a smartass, then answer this followup -- which is the row and which is the column?

Even if you don't have nested loops using blahIndex is much more self-documenting -- every time you use it, inside the loop or outside, you remember what it was.

Honestly, your brain doesn't take that much longer to recognize a word than it does a letter. Single letter variables are an artifact from the days of punch cards. Let go.

July 11, 2007 8:44 PM

 
Anonymous Anonymous said...

I would argue that i and j are ok for the very rare cases where a for loop is more useful than a foreach loop. With caveats: i is always to be the outer loop and j for the inner loop; code which uses i and j should be very small (no more than 4 or 5 lines inside the inner loop, and one or 2 lines outside the inner loop). Use of i/j should encompass an idea that the for loop is a single block of finished code.

For example, consider creating an identity matrix in c++:

//create an identity matrix
for (int i=0; i<SIZE; ++i) {
  for (int j=0; j<SIZE; ++j) {
    matrix[i][j] = (i==j ? 1 : 0);
  }
}

(simply i==j should work there, but I feel pedantic)
There is no reasonable expectation that longer variable names will make that any more readable than the single comment I added to the first line. The use of single letter variables is not for ease of development, it is because considering their values is a distraction from the task (to create an identity matrix).

I would also expect to see x, y and z in code dealing with a cartesian system (mathematical formulas could be trivial to use and understand without having to translate from whatever one "expert" developer decides to name something).

July 16, 2007 4:37 PM

 
Blogger Wil Shipley said...

bzero(matrix, SIZE * SIZE * sizeof(int));
for (int rowAndColumnIndex=0; rowAndColumnIndex < SIZE; rowAndColumnIndex++)
  matrix[rowAndColumnIndex][rowAndColumnIndex] = 1;

July 16, 2007 6:36 PM

 
Blogger Adam said...

bazang

even though i dont understand c++, will's code is one line shorter and triangle of dots must be rad

thank you mr SHIPley, i have started and not finished two side projects for random people because i decided to make them 'more fun' to code, putting in lots of flexibility, ultimately making them more difficult to hang together...
what if i do this... damn another model change... two days later: crap i didnt think that it would affect this piece - while i'm here i might as well.... too hard, give up and have broken promise egg on my face

get it done, get the money you were after in the first place and then make it better.

good goodness, wise wiseness

August 13, 2007 2:08 PM

 
