Cost and Value of Collecting Data

Collecting data isn't free. There's a cost to every piece of data. There's also value. How do we balance the cost and value of data?

As with all juicy questions, it depends. And, in general, the easier the data is to collect, the less value the data tends to have.

Examples of Useless and Cheap Data to Collect

Here are some examples of data that's cheap to collect, and just about useless:

  • Anything at a single point in time, not a trend. Some examples: lines of code, defect counts of any kind. Any point-in-time measure might be interesting. However, the value is in the trends. (For example, if you're not refactoring the codebase over time, the code only grows, instead of growing and shrinking.)
  • Anything by person, either a point in time or as a trend. Lines of code per person makes no sense. Neither does defects per person. I don't care what approach you use for your project—measuring either lines or defects by person asks for trouble.
  • Velocity (counting story points in some timebox) when the team members don't work together, or when they work on several projects at a time. (If you're wondering why, see Measure Cycle Time, Not Velocity.)

And, in my opinion, timesheets are the best example of useless information.

Why Timesheets are Invalid “Data”

Back when I was a young software developer, I worked on several projects “at a time.” I was supposed to spend various portions of my time on each project.

Back then, I was young and not-so-smart. I worked overtime to try to finish all of it. I filled out my timesheet, with the pretty-close-to-accurate times for each project. The number of hours total? Over 50. I think 55. Lots of overtime.

Later the next week, Finance returned my timesheet to me. “Your totals must add up to only 40 hours per week. Please correct this timesheet.”

I went to my boss and didn't quite say, WTF. I swore a lot less then. However, I did say, “I worked substantially over 40 hours last week so I could do all the work.”

“Thank you,” he said.

“I am supposed to ‘fix’ my timesheet so it only has 40 hours. Which projects do you want me to ignore?”

“You can figure out the proportions, and make it all come out to 40 hours,” he said.

“That's it?”

“Yup.”

I now knew what to do.
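
Here is a minimal sketch, in Python, of what “figure out the proportions” amounts to. The per-project split is hypothetical; only the roughly 55-hour total and the 40-hour cap come from the story. The function name scale_to_cap is also just an illustration.

```python
def scale_to_cap(hours_by_project, cap=40.0):
    """Scale reported hours proportionally so they sum to the cap."""
    total = sum(hours_by_project.values())
    if total <= cap:
        # Nothing to "fix" when the week is already at or under the cap.
        return dict(hours_by_project)
    factor = cap / total
    return {project: round(hours * factor, 2)
            for project, hours in hours_by_project.items()}


# Hypothetical split of a 55-hour week across four projects.
worked = {"Project 1": 22.0, "Project 2": 16.5, "Project 3": 11.0, "Project 4": 5.5}
reported = scale_to_cap(worked)

for project in worked:
    print(f"{project}: worked {worked[project]:.1f}, reported {reported[project]:.1f}")
```

With these made-up numbers, every project shows about 27% fewer hours than were actually worked. The timesheet adds up, and it no longer describes the real effort.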

Timesheets Are Not Accurate Unless You Count As You Proceed

I still encountered two big problems:

  • How could I possibly remember what I did at the end of the week?
  • My company didn't want to pay me overtime, so I could report only 40 hours.

I could fix the second problem. I worked only 40 hours. My boss wasn't happy. Why was I working fewer hours?

I explained that I didn't work for free. And, calculating percentages for projects? No way. Did he really want me to spend an hour on my timesheet or do more work?

I worked my 40 hours and I worked hard. I didn't realize at the time that overtime makes people stupid, so I lucked out.

However, I still had the first problem. How could I accurately remember what I had done this past week?

I waited until Friday noon to fill in my timesheet. The timesheet was due on Friday at 2pm. Wasn't 2 hours enough time to remember what I had done?

Often, no.

I remembered which days I worked until what time. I rarely remembered what I had worked on, for each project.

I was working in resource efficiency, so I had plenty of wait times. When I was blocked on one project, I picked up another, worked on it until I was blocked there, and then returned to the first one, or the third one.

My work time was random. So was my timesheet. The timesheet was not accurate, and the more overtime I worked, the less accurate it became.

(Note: As a manager in a company a decade or so later, I filled out timesheets for everyone in my group. A couple of finance people yelled at me. I asked them: did they want products done or timesheets done? Yeah, no contest there. The product was more important.)

Timesheets or Tickets Don't Work for Capitalization

In Capitalizing Software During an Agile Transformation, I explained the desire for capitalization. My company wanted to capitalize the software effort.

They thought using timesheets would give them accurate data.

The timesheets were easy to collect. The value of the data in the timesheet? Close to useless. Especially when I was supposed to use tasks in my project:

  • Project 1: Planning
  • Project 2: Coding
  • Project 3: Review for Rick
  • Project 4: Final test with hardware people

Those tasks were just some of the states I was supposed to track.

What about tickets, such as Jira tickets? (Sorry to pick on Jira. Most of my clients use Jira, even as collocated teams.)

Ticket Systems Try to Track State Changes

Jira tickets have the same problem as timesheets. People have to accurately report the state every single time the state changes.

Here's a question about state changes: What did you eat for lunch and dinner every day this week? My husband and I are unusual—we bring lunch almost every day and we have the same lunch every day. We're boring, and that's our problem. We do have different dinners each night.

Most people might remember what they had for lunch or dinner. They rarely remember both lunch and dinner. And, unless they measure their food, they don't remember how much of each thing they ate and drank. (Do you know if you eat six ounces of protein? Or eight? Or twelve? That's what a timesheet or ticket is supposed to track.)

Lunch and dinner are just two noticeable state changes each day. And, if you eat with other people, those state changes might have some importance to you.

Do you remember exactly what you ate, at those two relatively important state changes during each day this week?

The longer the time period between recording your state changes, the less accurately you remember. That's why ticketing systems don't reflect reality, either.

Can You Collect Accurate Data?

So, let's assume capitalization is important to you. How can you collect accurate data? You could work in a way that optimizes for capitalization:

  1. Create short deliverables. One- or two-day stories. Weekly or monthly releases if you're not using an agile approach.
  2. Stop multitasking. That allows you to use a team run rate.
  3. Work in flow efficiency. That means you spend the least time waiting for other people.
  4. Measure cycle time. You will know when you release this story/feature to your customer. You're not estimating capitalization.
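
As a rough illustration of step 4, cycle time is the elapsed time from when the team starts a story until the customer can use it. The sketch below assumes you record just two dates per story. The story names, dates, and the helper name cycle_time_days are all hypothetical.

```python
from datetime import date
from statistics import mean

# Hypothetical stories: (name, date work started, date released to the customer).
stories = [
    ("Story A", date(2024, 3, 4), date(2024, 3, 5)),
    ("Story B", date(2024, 3, 5), date(2024, 3, 7)),
    ("Story C", date(2024, 3, 6), date(2024, 3, 7)),
]


def cycle_time_days(started, released):
    """Elapsed calendar days from start of work to release."""
    return (released - started).days


cycle_times = [cycle_time_days(started, released) for _, started, released in stories]
average = mean(cycle_times)

for (name, _, _), days in zip(stories, cycle_times):
    print(f"{name}: {days} day(s)")
print(f"Average cycle time: {average:.2f} days")
```

Both dates are events the team can observe as they happen, so nobody has to reconstruct hours from memory at the end of the week.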

You can capitalize if you release something. You're not supposed to capitalize until some customer can use what you release.

Just because your data is easy to collect does not mean the data is valuable—or correct. Use your system-based thinking to reason about your system and see what you can measure that's valuable.
