I read "Metrics with Impact" by Michiel van Genuchten and Les Hatton in the July/August IEEE Software, pp. 99-101, last week. They discuss a metric: Compound Annual Growth Rate (CAGR) for software.
CAGR is interesting to me, because I've actually measured it before. Here are some graphs (no numeric data) to describe what happens in different kinds of projects.
In a waterfall project (any project where you wait until the end to integrate and fix defects), where you only learn about problems and fix them at the end, code growth follows a typical S curve: you build a lot of code at the beginning, less code at the end. Now, note the dashed line.
If you give the project enough time at the end to fix the defects, you can reduce the code size by about a third of the code base. At least, that's what happened *in the projects I measured*. These were not large programs. I would describe them as small to medium size, with 3-8 teams. It was a long time ago, and I no longer have the raw data.
I've done assessments since then—and I don't have access to the raw data—where that dip never occurred. The code base never decreased in size.
In the assessments I've done, the code growth has followed the S-curve, except when people have copy/pasted portions of the code. In that case, the code grew exponentially. I recognized that because not only did the code base never shrink, but the Fault Feedback Ratio got so large that the developers could not make progress.
However, in agile projects, I don't think code growth follows the S-curve. But I don't have the data. And, that's where you can help.
I believe, because of refactoring, that agile projects and programs follow a different code growth pattern. That's only if the developers refactor. I believe they follow a pattern more like this one: because the features are small, the code base grows more slowly, and because of refactoring, the code base is as likely to shrink as it is to grow when the developers add features. Well, it does on my homegrown projects 🙂
Michiel and Les have just 10 data points on large programs, and none of those programs are agile (I think). Their measured Compound Annual Growth Rate (CAGR) is 1.16, which is growth of about 16 percent per year.
Knowing the CAGR is useful, because it allows you to predict hardware footprint and supports estimation. It's at least better than trying to take the unknown requirements and trying to estimate them!
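If you want to try this on your own code base, here is a minimal sketch of the arithmetic, assuming CAGR is expressed as a yearly growth factor (so 1.16 means 16 percent per year) and that code size is measured the same way at both points in time. The function names and the sample numbers are made up for illustration. They are not from the article.

```python
def cagr(start_size, end_size, years):
    """Return the compound annual growth factor between two size measurements.

    A result of 1.16 means the code base grew about 16 percent per year.
    Sizes can be in any unit (LOC, KLOC), as long as both use the same one.
    """
    if start_size <= 0 or end_size <= 0 or years <= 0:
        raise ValueError("sizes and years must be positive")
    return (end_size / start_size) ** (1.0 / years)


def project_size(current_size, growth_factor, years):
    """Project a future code size, assuming the growth factor stays constant."""
    return current_size * growth_factor ** years


# Hypothetical example: a code base that went from 100 KLOC to 135 KLOC in 2 years.
factor = cagr(100.0, 135.0, 2)
print(round(factor, 2))                       # 1.16

# Projecting 100 KLOC forward 3 years at a growth factor of 1.16.
print(round(project_size(100.0, 1.16, 3)))    # 156
```

The projection only holds if growth stays steady, which is exactly what we don't know yet for agile projects where refactoring may shrink the code base.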
If you are interested in providing me your data, I am willing to write the IEEE Software article. Here's what I need: how many people are on your project, what agile means to you (do you release every iteration, etc.), what your CAGR has been since you went agile, and your contact information. If you prefer to be a coauthor, we can discuss that too.
I'm interested in this, because this could be a potential measurement for large programs. Maybe.
Let me know if you want to help. Maybe discuss this in the comments. Maybe I'm asking for not-enough-data to write an article. All I know is this: I can't do this without you. If we want to know how agile software grows, we need data. And, in my opinion, we need data that separates different kinds of agile programs into these buckets: up to 3 agile teams, 4-8 agile teams, 9 and above agile teams. Or, something like that. Maybe just the number of people on the program will do.
Thanks for your help!
Hi Johanna,
I don’t have the data you seek, but am interested in the results you collate.
Agile & Waterfall projects both have 1 thing in common: people. The competency, collaboration & synergy of these people in the different project stages may influence key data measures.
Thanks for pulling together the research – I’m interested in seeing more.
Andrew, well, that makes at least two of us! I also wonder about the people. That’s why I wonder if I’m asking for enough data. We’ll see.
Has any reader been anywhere that has a CAGR of 0 +/- 0.2 ? What I mean is: retiring systems and replacing software, as well as adding them.
Bob, I have no idea. What a good question.
Hi Johanna,
I have data from our data warehouse system that may be useful. The system has been in production since 2006, initially developed using waterfall methods. From late 2011, we initiated a number of development method changes, including establishing agile practices. I have several measurement points between Sep-2011 and Jul-2013 against a set of our code. My original focus was to introduce code complexity measures to our SQL code base.
Chris, wow, I’ll be in touch by email. You actually have comparison data! Cool.
Dear all, since 2010 Les Hatton and I have run a series of columns in IEEE Software about the ‘Impact’ of sw on companies, industries and society. The columns are written by technical people and managers. One goal is to get a better quantitative understanding of the impact of sw, so we always ask people for the size and volume (# of users) of the sw product. We have done 20 so far, with contributions from Airbus, the Mars Rover, MS, Oracle, an open source university team, Hitachi, Tomtom and so on. The columns are 3-5 pages and contain real data.
Playing with the data, we found to our surprise that the typical growth rate of the sw was around 16 percent per year (100 KLOC in year 1, 116 KLOC in year 2, 135 KLOC in year 3, and so on) for many of the products, despite the huge differences between the products and the teams.
Some of the projects the data comes from are characterized as agile by the authors, like the YAWL open source project at the Uni of Queensland (column from 2011) and the navigation system from Tomtom (2011). Some others are not. It would be interesting to know what impact agile has on the growth rate over time, if any. We do not have all the answers; we are learning and are interested in more data and columns for 2014 and beyond.
Just let me know if you have more questions. Michiel