Context Matters: Premature Optimization or Habits From Long Ago?

I’m at the Much Ado About Agile conference this week, in beautiful Vancouver. During lunch one day, one of the conference participants started talking about premature optimization of code.

Well, I know a few things about that. When I started to work professionally as a developer, I wrote in assembly language. We had 256 bytes per page. You had to leave a few bytes at the end so you could insert a branch (goto) in case you ever needed to patch the code and insert more functionality, or go to another page. Optimizing your code was a reasonable approach.

I later worked on process control and instrumentation systems, where the software footprint was bounded by the memory in the embedded system. That wasn’t much, although I no longer remember how many bytes. Since memory was measured in kilobytes then, not megabytes, it couldn’t have been many. Optimizing my code was necessary.

OK, so that was in the 70’s and early-to-mid 80’s. Once memory became less expensive and disk drives shrank from the size of washing machines to the size of thumb drives, we no longer had to optimize code in the same way.

But there are plenty of reasons that people still optimize. The first is habit. The second is belief that if you don’t build in the optimization at the beginning, you won’t be able to. There must be a third, but it’s too early in the morning for me to remember right now.

We bring our contexts with us, throughout our careers. Sometimes those contexts no longer fit.

When you notice something about the way a colleague is working, try curiosity before labeling that colleague as old-fashioned, or stupid, or anything else. My lunch colleague had never realized that people who were still alive (!) worked in assembler, or had to optimize their code, or that linking-loaders actually existed. He thought that was waaay in the past. Yes to the past, but not to the way past.

Our project decisions are based on our context. The challenge is to make sure we are using the context that fits.



  1. Ken Estes

    One of my favorite PhD theses of the late 1990’s was a reexamination of a bunch of optimizations from the 1970’s. These were a collection of “math tricks” which reduced the instruction count of the algorithms and made them run faster on 1970’s hardware.

    However, in the 1990’s all of these were DE-optimizations. While the instruction count was indeed lower, they were far more complex algorithms, and the CPUs that were modern at that time could not predict their branches correctly. The cycles lost to incorrect branch prediction were FAR greater than the cycles saved by the clever mathematics. So all the optimized algorithms ran much slower than their unoptimized (and very straightforward) versions.

    Again we see that context (in this case the type of architecture the code will run on) is very important. I would also argue that it shows that simple and straightforward code has a chance of being optimal for other contexts that have not been thought of yet, while a design optimized for one context only fits that one context.

  2. Hans Hartmann

    I encountered this situation in 1998. I was a tester and people were programming in Smalltalk. The footprint of the programs was 50 MB in RAM.
    My personal computer had 4 MB of RAM. For me it was unbelievable that people were so careless with RAM.
    “If necessary, we optimize later.”
    (Memory leaks were on the order of 2 MB per business case enacted.)
    The program was not accepted by the users because it needed 20 minutes to run a business case, compared with 1 minute for the old program.

    “Computers will be more efficient in the future. In two years we will not have a performance problem anymore.”

    A short computation: since 1978, when I worked with my first microprocessor, computer efficiency has increased roughly by a factor of a European billion (10^12), taking into account processor speed, memory, and size. Access times have improved at a slightly lower rate, but still fantastically.
    My download speed is now 500 kByte/s. That is not the fastest, because I still run over twisted-pair copper telephone wires. My colleagues who are connected via cable have 3 MByte/s.
    That is roughly 24,000,000 bits/s. Now we divide by 110 and obtain an increase by a factor of roughly 200,000.
    Still amazing, isn’t it? But it does not suffice. There is a mismatch with all the other increases.
    The latency of hard disks is still in the same range as it was 10 years ago.
    The bottom line: there still exist bottlenecks that call for considerate use of resources, especially when the resources have to be moved from one location to another.

  3. Scott McKay

    Furthermore, what some people think of as “optimization” might simply be careful thinking about good design and architecture. In a project I was working on, I was constantly told by my “youngers” that the effort I expended on trying to make the on-disk data model very efficient was a “premature optimization”. Guess what part of the system now has performance that drags everything down?

    It’s never premature to plan for great, scalable performance.

  4. John

    This applies to project management as well as software development. Many of our ideas about the “right” way to run a software project come from enormous defense projects from the 1960’s. The waterfall method made much more sense in that context.

    On the other hand, I am continually irritated by how unresponsive desktop software can be. Moore’s law gives and bloat takes away. For example, it takes just as much time to start Microsoft Word today as it did 20 years ago.

  5. George Dinwiddie

    I could tell memory and speed optimization stories, too, but I don’t think that’s the thrust of this post.

    When I’ve changed from mostly using one programming language to another, I find myself writing code idioms appropriate for the previous language for quite a while. I think most of us do this. I recently saw a 10-year-old Java system that was really C code written in Java.

    The same thing happens in our way of approaching or running a project. For example, in organizations that are trying to adopt Scrum or another team-based development model, often some manager tries to allocate people as “resources” fitted to the expected needs of the project. They’re bringing with them the “scarce resource” optimization strategies that helped them when they were the central hub parceling out work items to individuals. True teams, however, are expensive to build and get to gel. It’s better to bring a project, almost any project, to an existing highly-functioning team than to adjust the team makeup.

    It’s always a good idea to become more aware of your assumptions and test if they’re true in the present context.

  6. Donald Cox

    Some reflections occasioned by this post:

    If you were practicing TDD/BDD, wouldn’t that eliminate “premature” optimization? That assumes the practice means you only code what is needed to pass the tests. If there were a performance test, then you’d need to optimize.

    What’s wrong with looking at optimization as a kind of refactoring? If that’s a good way of thinking about it, wouldn’t agile approaches deliver just-in-time optimization (assuming all the upstream activities are working with adequate health)?

    What process metrics would help us become aware of outdated/ill-fitting assumptions? Not completing features in a timebox?

    I suppose it would be emergent in George Dinwiddie’s high-functioning team that they would help each other check assumptions and over time find out which ones were most impacting progress/productivity.

  7. Kent S.

    I too used to program embedded control systems in assembly language and C, long ago. Today, I still come across O(N^2) or worse algorithms in Java that work fine until you get into production with real data. Then it’s time for an O(N log N) or O(N) refactor if you can find it.
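
The kind of refactor Kent S. describes can be sketched with a small example. This is only an illustration, written in Python rather than Java for brevity, and the task (checking a list for duplicates) and both function names are assumptions chosen for the sketch, not anything from his system. The first version compares every pair and is fine on small test data; the second remembers what it has already seen and stays fast on production-sized data.

```python
from typing import Hashable, Iterable, List


def has_duplicates_quadratic(items: List[Hashable]) -> bool:
    """Compare every pair of items: O(N^2) comparisons in the worst case."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items: Iterable[Hashable]) -> bool:
    """Track already-seen items in a set: O(N) expected time."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both functions give the same answers, so the slower one can serve as a reference in tests while the faster one is introduced only once real data shows the need for it.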

