Todd Little
Recent Activity
Axes: As clearly stated in the paper, the units of measure of the axes are durations for the Landmark data and cost for DeMarco's data.
Budgets: For the Landmark projects, budgets were set by annual product spend. This is a significantly different model from the project-based budgeting often found in cost-oriented organizations.
"Great": Ultimately, to me, great is mostly driven by market success.
Math: And your math is based on words? Now I'm curious.
Project Management, Performance Measures, and Statistical Decision Making (Part Duex)
There is a current rash of suggestions on how to improve the performance of software projects. Each of these, while well meaning, is missing the means to confirm its credibility. They many times end up being personal anecdotes from observations and local practices that may or may not be ...
Again, it is your conjecture that the projects went over budget. All that can be said is that most projects exceeded their initial aggressive target estimates. To equate that to a budget is incorrect.
Your math skills seem to be failing you as well. How you get to 90% is beyond me. The paper is 6.5 pages, and maybe 2 pages are about Figure 2.
If you have data showing how you were able to master the triple threat of great estimation, great high-performance delivery, and great market success, I'd love to see it.
Project Management, Performance Measures, and Statistical Decision Making (Part Duex)
There is a current rash of suggestions on how to improve the performance of software projects. Each of these, while well meaning, is missing the means to confirm its credibility. They many times end up being personal anecdotes from observations and local practices that may or may not be ...
Glen,
You use my data and make the conjecture that the projects "didn't turn out as needed." What is your basis for that claim? Quite the contrary: in the paper I claim the portfolio was highly successful during that time, based on customer satisfaction and market share growth, because the project teams were focused on maximizing value and less concerned about overrunning aggressive target estimates.
I realize that your business domain may have different drivers that emphasize cost.
Project Management, Performance Measures, and Statistical Decision Making (Part Duex)
There is a current rash of suggestions on how to improve the performance of software projects. Each of these, while well meaning, is missing the means to confirm its credibility. They many times end up being personal anecdotes from observations and local practices that may or may not be ...
Glen,
You have already acknowledged some of the errors of your critique in other postings (http://herdingcats.typepad.com/my_weblog/2014/06/how-to-lie-with-statistics.html), but I feel I must also call them out here since this posting has not been updated. Let’s review some of your specifics:
1. Data is self-selected since only 120 projects were used out of 570.
The paper that you are referring to was a preliminary version of my IEEE Software publication, “Schedule Estimation and Uncertainty Surrounding the Cone of Uncertainty” (http://toddlittleweb.com/Papers/Little%20Cone%20of%20Uncertainty.pdf). In the IEEE version I make it clear that the projects included the entire commercial portfolio. Even in this paper I make it clear: “Of these 570 projects, 120 projects were commercial releases for the general oil and gas market. The remainder included currently active projects, internal projects, and non-commercial releases. For the purpose of this study, only the 120 commercial releases were considered.” I don’t see this as self-selection, but rather as a form of Calibration. Obviously it would not make sense to include currently active projects or projects that started prior to the time window. I also excluded internal projects and non-commercial releases because I did not consider them to be comparable to commercial software releases.
2. Without Root Cause the data is meaningless
Totally disagree. The data tells us quite a bit about the patterns of the organization. There is a clear optimism bias. There is a large spread in the ratio between the Actual and the Initial estimate, with a P90/P10 ratio of 3.25. This tells us that we have good reason to suspect that future projects under future conditions are likely to fall into the same patterns. As a planner making decisions, this is valuable information. (A small numerical sketch of this spread calculation appears after point 5 below.)
3. The charts show Ordinal numbers
Sorry, but that is just flat wrong. The charts show Actual days versus Estimated days. These are not Ordinal numbers at all. Neither are they Cardinal, as Cardinal numbers are strictly integers.
4. It is wrong to draw the “Ideal” line.
Call it what you want. I used “Ideal” because I was comparing to other published data. Others have used “Perfect Information.” McConnell used “Perfect Accuracy.” It’s a valid comparison. We want to know how the scatter deviates from this line and whether there is a bias. The cumulative distribution functions that I show in the paper provide additional visualization of the data scatter.
5. The data is not Calibrated
This is just not true. The data that I presented was calibrated against data from Tom DeMarco and showed almost identical behavior. I also calculated the Estimation Quality Factor that DeMarco proposes and used it as an indicator to determine whether the estimates were in line with other data.
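For readers unfamiliar with the P90/P10 figure mentioned under point 2, here is a minimal sketch of how such a spread statistic can be computed from actual-to-initial-estimate ratios. The ratio values are invented for illustration; this is not the Landmark data or the paper's actual analysis code.

import numpy as np

# Hypothetical actual/initial-estimate ratios for a handful of projects
# (illustrative values only, not the Landmark portfolio data).
ratios = np.array([0.9, 1.1, 1.3, 1.5, 1.8, 2.2, 2.6, 3.1, 3.9, 4.5])

p10, p50, p90 = np.percentile(ratios, [10, 50, 90])
spread = p90 / p10  # the P90/P10 ratio, used as a measure of estimation uncertainty

print(f"P10 = {p10:.2f}, P50 = {p50:.2f}, P90 = {p90:.2f}, P90/P10 = {spread:.2f}")

A wide P90/P10 ratio says nothing about root cause, but it does quantify how much the ratio of actual to estimate varied across the portfolio, which is exactly the planning information referred to above.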
How to Fib With Statistics
Todd Little and Steve McConnell use a charting method that collects data from projects and then plots it in the following way. For Little's data it's the initial estimated duration versus the actual duration, and for McConnell's data it's the estimated completion date versus the actual completio...
Glen,
You have already acknowledged some of the errors of your critique in other postings, but I feel I must also call them out here since this posting has not been updated. Let’s review some of your specifics:
1. Data is self-selected.
This is not true, and you acknowledge it in this posting: http://herdingcats.typepad.com/my_weblog/2014/06/how-to-lie-with-statistics.html. The paper that you are referring to was peer reviewed in IEEE Software: “Schedule Estimation and Uncertainty Surrounding the Cone of Uncertainty” (http://toddlittleweb.com/Papers/Little%20Cone%20of%20Uncertainty.pdf). I make it clear that the projects included the entire commercial portfolio. The portfolio also included other projects that were started prior to the time window, currently active projects, internal projects, and non-commercial releases. I don’t see this as self-selection, but rather as a form of Calibration. Obviously it would not make sense to include currently active projects or projects that started prior to the time window. I also excluded internal projects and non-commercial releases because I did not consider them to be comparable to commercial software releases.
2. We all know that subject matter expertise is the least desired and past performance, calibrated for all the variables, is the best.
I don’t know that I agree with that as a fact. Magne Jørgensen has done quite a bit of research in this area, and I don’t know that he would agree either. What is the basis, and the solid statistical data, that backs up your claim? My view is that subject matter expertise and past performance should both be used to guide estimation and an understanding of the uncertainty bands of the estimate.
3. The Ideal line is not Calibrated.
Sorry, but there is no need to Calibrate the Ideal line. By definition it is the identity line, where Actual equals Estimate.
4. The initial estimate data is not Calibrated
This is also not true. The data that I presented was calibrated against data from Tom DeMarco, which showed almost identical behavior. I also calculated the Estimation Quality Factor that DeMarco proposes and used it as an indicator to determine whether the estimates were in line with other data. The conclusion from the EQF analysis was that the overall estimation process was better than average.
5. The second chart is a much better chart.
I agree it looks pretty. But you carelessly forget to mention that it is a log-log plot, while my chart uses linear axes. Lots of data samples look pretty on a log-log plot; it's a really nice way to “Lie with Statistics.” (A small plotting sketch below illustrates the effect.) You also carelessly forget to mention that it is comparing two known quantities: the Actual duration at the end of the project, and a calculation based on parameters that include the Actual LOC. This removes the largest unknowns from the early stages of the project, which are what we really care about. So instead of estimating the duration or effort of the project, we now have to estimate LOC. Have we really gained anything? And is COCOMO really that great? In the 7 data points that were supposedly used to validate the Cone of Uncertainty (http://csse.usc.edu/TECHRPTS/1982/usccse82-500/usccse82-500.pdf), which ironically came from 7 USC student teams implementing COCOMO, the estimates from COCOMO were so horrendous that they were not even included in the analysis. The excuse given in the paper was that COCOMO and other models were known to be inaccurate for small projects.
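To see why a log-log plot can make scatter look tighter than it really is, here is a small matplotlib sketch with synthetic data (not either author's dataset) that draws the same actual-versus-estimate scatter on linear axes and on log-log axes:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Synthetic estimates spanning two orders of magnitude, with actuals
# scattered around them by a multiplicative factor of roughly 0.5x to 3x.
estimates = rng.uniform(10, 1000, size=100)
actuals = estimates * rng.lognormal(mean=0.3, sigma=0.5, size=100)

fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))

for ax, title in ((ax_lin, "Linear axes"), (ax_log, "Log-log axes")):
    ax.scatter(estimates, actuals, s=10)
    ax.plot([10, 1000], [10, 1000], "k--", label="Ideal (actual = estimate)")
    ax.set_xlabel("Estimated duration")
    ax.set_ylabel("Actual duration")
    ax.set_title(title)
    ax.legend()

ax_log.set_xscale("log")
ax_log.set_yscale("log")
plt.tight_layout()
plt.show()

The right-hand panel shows exactly the same points as the left, yet the scatter appears far more modest because the log scale compresses the large deviations.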
How Not To Make Decisions Using Bad Estimates
The presentation Dealing with Estimation, Uncertainty, Risk, and Commitment: An Outside-In Look at Agility and Risk Management has become a popular message for those suggesting we can make decisions about software development in the absence of estimates. The core issue starts with the first chart. I...
Glen,
You are doing an excellent job of demonstrating “How to Lie with Statistics” through your own actions. You are making claims about my article which are absolutely false. The article (which can be found at http://www.toddlittleweb.com/Papers/Little%20Cone%20of%20Uncertainty.pdf) does in fact talk about some of the causes of the deviation from “perfect” or “ideal” estimates (the Top Ten List in the presentation is for entertainment):
■ optimistic assumptions about resource availability,
■ unanticipated requirements changes brought on by new market information,
■ underestimation of cross-product integration and dependency delays,
■ a corporate culture using targets as estimates, and
■ customer satisfaction prioritized over arbitrarily meeting a deadline.
You seem hung up on the use of “ideal” in the chart. The comparison of actual to estimate is what the study was about, and other studies that I was looking to benchmark against used the same comparison. “Ideal” represents perfect estimation. I agree that if this chart is taken out of context then it is subject to misuse. I would also contend that you are guilty of taking it out of context.
You also make the claim that my data is somehow flawed because it is “self-selected.” I do not know how you are coming to that conclusion. As stated in the paper, “The study reported here looked at three years of data from 1999 to 2002, during which Landmark collected data weekly about all 106 of its software development projects.” My analysis included all data in the commercial portfolio. The only selection criteria that I used in analyzing the data were that the project started and ended within the study time frame and that the project was delivered commercially. If you are contending it is flawed because it is from only one organization, then I do in fact acknowledge that in the paper. In order to determine the usefulness of the data, I did significant benchmarking against other published data to see how the data compares. What I found was that, based on DeMarco's EQF metric, our estimations over time were slightly better than the industry norm. I find EQF a much better indicator than the chart that you object to, as it effectively includes an integral of the estimation error over time (see the small sketch below). Nonetheless, I stand by the value of the chart when taken in context of the entire paper.
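For reference, here is a minimal sketch of an EQF-style calculation. It follows the commonly cited definition of DeMarco's Estimation Quality Factor, namely the area under the actual value over the project's duration divided by the area between the running estimate and the actual; the data points are invented for illustration and are not from the Landmark study.

import numpy as np

def eqf(times, estimates, actual):
    """Estimation Quality Factor in the spirit of DeMarco.

    times     -- points in time at which the running estimate was recorded,
                 ending at project completion
    estimates -- the estimate in force at each of those times
    actual    -- the final actual value of the estimated quantity

    Returns (actual * duration) / (area between the estimate curve and the
    actual), so a larger EQF means the estimates tracked the actual more closely.
    """
    times = np.asarray(times, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    error = np.abs(estimates - actual)
    # trapezoidal integration of |estimate - actual| over time
    error_area = np.sum((error[1:] + error[:-1]) / 2.0 * np.diff(times))
    duration = times[-1] - times[0]
    return actual * duration / error_area

# Illustrative only: a project estimated at 200 days up front, re-estimated
# upward over time, and finishing with an actual duration of 300 days.
t = [0, 60, 120, 180, 240, 300]       # days elapsed when each estimate was recorded
est = [200, 220, 250, 280, 295, 300]  # running estimate of total duration (days)
print(f"EQF = {eqf(t, est, actual=300):.1f}")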
Realize that this was a study of the Cone of Uncertainty. As I mentioned in my previous comment, the original publication by Barry Boehm had ZERO empirical data to support it. It was properly acknowledged to be subjective. It somehow then became empirically validated by a whopping total of 12 seriously questionable data points, which I consider to be the epitome of “self-selected.”
There are valid aspects to the cone which I do in fact acknowledge in the paper. However, I have also seen the cone misunderstood and misused far more often than I have seen it used properly.
How to "Lie" with Statistics
The book How To Lie With Statistics, Darrell Huff, 1954, should be on the bookshelf of everyone who spends other people's money for a very simple reason. Everything on every project is part of an underlying statistical process. Those expecting that any number associated with any project in any do...
Glen,
Thanks for the excellent example of “How to Lie Without Statistics.” You present my graph and take it entirely out of context by saying that it shows no indication of root cause analysis. Of course it does not, because that is not what my research was about at all. I was taking a fresh look at the Cone of Uncertainty to see what some real empirical data from a commercial software organization would show. As you should be well aware, the Cone of Uncertainty is a visual representation of how estimates or forecasts converge toward the actual over time, which is precisely why I started my investigation with an analysis of actual versus estimate (see the small sketch below). I was also replicating a chart by Tom DeMarco in “Controlling Software Projects.” You will also note that another chart of actual versus initial estimate is presented in Steve McConnell’s “Software Estimation” book. All three of those charts show nearly the same pattern.
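As an illustration of that kind of convergence view (a sketch with invented numbers, not a reproduction of the paper's charts), the snippet below plots the ratio of the running estimate to the final actual against the fraction of the project elapsed, which is the sort of picture the cone is meant to summarize:

import matplotlib.pyplot as plt

# Invented running estimates for three hypothetical projects, each recorded at
# 0%, 25%, 50%, 75%, and 100% of the project's actual elapsed time.
elapsed_fraction = [0.0, 0.25, 0.5, 0.75, 1.0]
projects = {
    "Project A": [0.6, 0.7, 0.85, 0.95, 1.0],   # started optimistic
    "Project B": [1.5, 1.3, 1.15, 1.05, 1.0],   # started pessimistic
    "Project C": [0.4, 0.6, 0.8, 0.9, 1.0],     # very optimistic early on
}

for name, ratios in projects.items():
    plt.plot(elapsed_fraction, ratios, marker="o", label=name)

plt.axhline(1.0, color="k", linestyle="--", label="Actual")
plt.xlabel("Fraction of actual project time elapsed")
plt.ylabel("Estimate / actual")
plt.title("Estimates converging toward the actual over time (illustrative)")
plt.legend()
plt.show()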
Of course the Cone of Uncertainty itself is another prime example of “How to Lie Without Statistics.” The original graph presented by Barry Boehm was self-proclaimed to be subjective, i.e. not empirically derived, i.e. without statistics. A later version of it presented a bit of data: a grand total of 12 data points from 12 projects. But a deeper dive into those 12 projects reveals even more. There were actually only 2 projects. The first was a student project at the University of Southern California to develop a version of the COCOMO estimation tool, performed by 7 teams. The estimates made for these projects were themselves done using COCOMO, and the results were so terrible that they were not used in the analysis. As near as I can tell, the data plotted on the cone chart was the scatter of the actuals for the 7 teams normalized by the mean of the 7 projects. The other project was for the US Air Force and represented 5 estimates, again likely normalized by the mean. There is no indication that any of the 5 teams that estimated actually delivered anything.
There are lots of great examples of “How to Lie Without Statistics” as well as “How to Lie With Statistics.” I stand by my chart and my paper. I used real data from a real commercial software development organization. The chart is a valid comparison relative to a baseline. If you have real empirical data to support your conjectures I would love to see it.
How to "Lie" with Statistics
The book How To Lie With Statistics, Darrell Huff, 1954, should be on the bookshelf of everyone who spends other people's money for a very simple reason. Everything on every project is part of an underlying statistical process. Those expecting that any number associated with any project in any do...