Applications of Software Measurement and Software Management 1999 Conference Notes
I used to attend Agile and quality related conferences years ago and have saved notes from many of them. While some topics have become overtaken by events since then, there still seem to be some useful ideas that I’d like to pass along.
I have attended three prior ASM Conferences (’94, ’95, and ’96), missing the last two before this one. They tend to attract many well-known figures in the software measurement/metrics field. The Software Management Conference has apparently done the same in its shorter history. It was decided to combine the two this year since the latter (management) has (or should have, in the view of many involved) a considerable dependence on the former (measurement and metrics). Based on the experience in preparing for the event and the response shown by the combined pre-registration (about 800), it was decided to meet jointly again next year (March 6–10) and in the same location.
I would hope to be able to present either a tutorial or session (if not both) next year as well as, perhaps, become involved in organizing the event, i.e., running one of the technical tracks. Since I know many of the people involved and since I had not seen some of them for almost two years, there was some discussion about whether presentations or committee work would be best or if I could do both (as a couple others had). The value of working on the planning and presentation of the event is the broader exposure to the presenters and speakers which this affords as well as the chance to affect what topics/trends are emphasized.
Attendance at the event, more than tutorials and sessions, is valuable due to the access to speakers and presenters which even the otherwise uninvolved attendee has. Unlike some conferences, ASM (and SM) require speakers to stick around to the fullest extent possible and mingle with the audience at lunches, breaks, and during defined “Meet the Experts” sessions at the end of several days during the Conference. There is usually an “experts” panel of one sort or another where considerable audience interaction occurs. Thus, I always find it a very good opportunity to “consult,” as it were, with some of the key figures in measurement and metrics.
I did not choose to attend any of the tutorial presentations this time. Much of the material was relatively familiar and, in the one case where it was not, there were also to be a few presentations on the same subject area. In this case, that area was orthogonal defect classification, an alternative to the traditional root cause analysis meeting approach usually taken for defect causal analysis. I also happen to have known one of the speakers well since he is in the Applied Research Area at Bellcore now and was a colleague of mine in Bellcore’s development area when we both did technology transfer work several years ago. (There’s more on this topic below, though.)
I would have expected some Atlanta area companies to send a person; however, it seems to have just been the Air Force and me present.
I’ll describe the rest of the Conference on a day-by-day, session-by-session basis and try to highlight what I think were overall Conference key points in the process. There were several keynote addresses and some were very interesting. I found many of the sessions I attended less so; however, as I have learned is often the case, the speakers had some very interesting things to say once you got them off their prepared materials. This, of course, was how I often had to operate when I was at Bellcore given restrictions on printed material that had to be reviewed by the corporate attorneys. They actually did not seem to mind that I would say some things which they did not want in print on a viewgraph or submitted paper. It was then just my word against 40–50 other people in the room which did not seem to bother them!
[I have the proceedings of the Conference which has most of the presentations in it. None of the formal papers are in the printed proceedings, though, nor are the tutorials which were an extra cost on Monday and Tuesday. I wanted to be present on Tuesday, though, to meet with a few people not otherwise involved in the tutorials, but running the program.]
Keynote Address by Steve McConnell on “Software Project Survival”
McConnell is relatively well-known for several books he has had published by Microsoft Press (e.g., Code Complete, Rapid Development, Software Project Survival Guide) and is considered an advocate of “practical” approaches to software development management/process. His presentation provided a variety of industry surveys regarding the percentages of projects that fail or fail to deliver and the reasons given for this. One such survey (Standish Group) claims that 25% of cancelled projects are cancelled while in the integrate-debug-fix cycle, with 100% of schedule/budget already spent, due to the perception that quality problems are just not going to get solved acceptably. I think his main point, though, was that many projects which do complete do so with reduced functionality from what was originally expected (only 25–50%). Again, it is integration problems which begin to surface the quality concerns. (This fits well with comments from Boris Beizer, a testing “expert,” in his ASQ Software Quality Professional premier issue article last month, by the way, where he, too, identifies that it is not the sorts of things Unit Testing is likely to find, but design integration issues, which will begin to cause a project problems.) McConnell noted that schedule gets a lot of attention in most projects because it is the easiest to measure and hardest to hide: time marches on quite visibly compared to unpaid overtime (budget) and functional reductions. Project managers often press on, though, perhaps due to ignorance of the limits which measurement data can document or the heroic belief that their project is “different” (i.e., better able to succeed) compared to others that have not done well. [It’s too long to repeat here, but remind me to tell you McConnell’s story of the pilot, hunters, and moose as a parable in project optimism.]
McConnell’s final main point was to note that Microsoft (as anti formal process as most anyone might imagine) and NASA Goddard’s SEL (S/W Engineering Lab) (which won an IEEE Software process prize in 1997), both say that having “senior” (i.e., experienced) personnel as the core of any project staff is _the_ critical factor.
Pamela Geriner (Lockheed-Martin) on “Statistical Process Control for S/W”
I know Pamela a bit from the SEI SEPG Conferences and her colleague, Beth Layman, is a former independent consultant and member of the QAI staff in Florida. I’ve also heard others talk on SPC applied to software, so I wanted Pamela’s perspective as a statistician. Her basic message was to use tremendous caution in applying traditional statistical methods to S/W, i.e., it is possible, just use care. Some of her reasons were that traditional methods, which use regression analysis, don’t handle time series data well when it is known that a declining curve is expected, as in defect discovery and repair. Also, there are data homogeneity assumptions in traditional SPC which often do not hold for software — again, defect data, since not all defects are equal in complexity, cost, etc. Thus, stratification of data is an important matter in software data analysis if the results are to be used to drive any serious decision-making.
ASM Keynote by Norman Fenton on “New Directions in S/W Metrics”
Fenton is a researcher from the UK (Centre for Software Reliability) and a well-known/respected metrics authority. I will want to look into what he had to say in more depth since it is an area in which I have little or no background: Bayesian Belief Networks. Fenton states that Microsoft uses a tool based on this (even supplying it to R&D companies, not commercial ones). But there seems to be a commercial version available as well. Fundamentally, Fenton has problems with defect density measures because they are not tempered by usage profiling which would allow confidence that the data shows the density to be a reliable representation of defects actually present in the software. With no usage profiles, it is not possible to tell how many latent errors really exist that would matter. Fenton claimed that his research suggests a very small % of defects have any serious impact in the field and that 35% of all defects have a mean time to failure of hundreds, if not thousands, of years!
Vicki Walter (Tellabs) “Total Cost of Ownership”
Tellabs is a telecommunications H/W and S/W vendor and Walter’s presentation described their total cost of S/W product quality program. It was all predicated on realizing that long-term cost can be measured and estimated and used to make better buying decisions than a low price/bid approach. Tellabs, like many other telecommunications industry vendors, is driven by telephone company models which do just this sort of calculation to determine what vendors are most desirable long-term partners.
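The arithmetic behind the point is simple; a sketch with invented numbers shows why a low bid and a low total cost of ownership can point at different vendors (this is my own illustration, not the Tellabs model):

```python
# Hypothetical total-cost-of-ownership comparison (all numbers invented):
# the low-bid vendor need not be the low-cost vendor once multi-year
# support/maintenance costs are included.

def total_cost(purchase, annual_support, years):
    """Purchase price plus support costs over the ownership period."""
    return purchase + annual_support * years

bid_a = total_cost(purchase=100_000, annual_support=40_000, years=5)  # lowest bid
bid_b = total_cost(purchase=150_000, annual_support=20_000, years=5)  # higher bid

# bid_a totals 300,000 against bid_b's 250,000: over five years, the
# "cheaper" bid costs more to own.
```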
Keynote Address by Tom DeMarco on “Management (The Stuff They Don’t Teach You Anywhere)”
DeMarco, along with a few other speakers and authors, like Tim Lister and Gerry Weinberg, usually has something memorable to say every time he speaks. One need not always agree with everything to recognize that such a speaker/writer has a good way of making clear, at least, what they are trying to say. DeMarco’s main point in this talk was that, unlike technical folks, who are introduced to a project team and often given some defined chunk of a project to work on to learn a new skill or system, managers get promoted into their jobs, are given a Microsoft Project management course, and sent on their merry way with little or no mentoring or true management “team,” i.e., one that truly shares responsibility for success among managers on the team. DeMarco says that management training is woefully inadequate, defining “training” as “doing something slowly that an expert does quickly.” The management environment for learning the job of management rarely has this slow introduction to the effort. DeMarco claims this is disastrous since management’s real job is not addressing problems of scheduling and resource balancing, but forming and supporting effective people and teams. (DeMarco and Lister are known for their book, Peopleware, which addresses the environment needed to support and develop people and how companies, bolstered by accounting and Furniture Police rules, get in the way of such things occurring.) Among the skills managers need for fostering such an environment are effective hiring practices, staff retention, conflict resolution, praising and thanking, listening, and trusting which, as DeMarco’s talk’s title implies, is “stuff they don’t teach you anywhere.”
Keynote Address by Bob Grady (H-P retired) on “Insights Into S/W Project Management”
Grady was at H-P for many years and has written three well-regarded books on the measurement program put in place at H-P and how it was used over the years for process and product improvement. Much of his material was based on his books and on some “rules of thumb” measurements that projects at H-P could use to “kick-start” their project data estimation and collection efforts based on corporate historical averages. For example, a testing model used was that 25% of all (pre-release) defects are found at the rate of 2 hours per defect, 50% at 5 hours/defect, 20% at 10 hours, 4% at 20 hours, and 1% at 50 hours. A rough defect history data collection effort can be translated into a rough defect repair effort calculation and, correspondingly, schedule impact. H-P has also used such data to establish initial targets for Time To Market release decisions and has used the data to get marketing and sales folks more actively involved in quantifying the sources of their demands for schedule and prices. As Grady suggested, it wasn’t the H-P model per se that should be of interest, but that any company could likely produce something similar and develop it into a more sophisticated form over time. [Quoted several times during the Conference was George Box’s comment that “All models are wrong; some models are useful.”]
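Grady’s rule of thumb converts directly into a back-of-the-envelope effort estimate. A minimal sketch using the H-P percentages quoted above (any real program would substitute its own historical averages):

```python
# Grady's H-P testing model as quoted in the talk: each fraction of
# pre-release defects is found/repaired at a different hours-per-defect
# rate. These figures are H-P corporate averages, not universal constants.

FIND_RATE = [  # (fraction of defects, hours per defect)
    (0.25, 2),
    (0.50, 5),
    (0.20, 10),
    (0.04, 20),
    (0.01, 50),
]

def estimated_repair_hours(total_defects):
    """Rough repair-effort estimate for a predicted pre-release defect count."""
    return sum(frac * total_defects * hours for frac, hours in FIND_RATE)

# e.g., a project expecting 200 pre-release defects:
# 200 * (0.25*2 + 0.50*5 + 0.20*10 + 0.04*20 + 0.01*50) = 200 * 6.3 = 1260 hours,
# which can then be turned into a rough schedule impact.
```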
The Orthogonal Defect Classification Sessions
This is probably as good a place to summarize the ODC approach as any since it was the occasion of the first talk on the subject. Very simply, rather than take the traditional, and labor-intensive, approach of root cause meetings and defect-by-defect identification of the root causes, ODC uses a more measurement-driven approach, making initial cuts at the data using comparisons between similar projects. (What “similar” means has a lot to do with the individual organization and the level of data collected, of course.) Assuming one’s defect tracking system has categories for problems which developers and others who report defects are expected to fill in, this data, on a group by group, or product by product, basis is summed, turned into percentages, and compared to the data from “similar” projects to see if immediate anomalies appear. Anomalies would be percentages of defects in categories that differ by some “significant” amount from other “similar” projects. This might begin to show how different products or development groups might be having difficulty with certain technology or methodology issues. (Groups can, of course, be compared to their own history, but the ODC research seems to suggest comparison to other “similar” projects reveals more since a group that has history together is likely, time after time, to “look like” itself, of course, revealing few, or no, anomalies.) The main ODC advantage is that much of this analysis can be done without involvement of any development group personnel, bringing them in to validate/discuss the findings after much of the initial effort has been completed through relatively low cost summarization and analysis efforts which can often be automated from the corporate tracking systems. Bellcore, who spoke on their experiences using ODC, generates the output summary data and presents it through web pages that the groups can view and discuss, even remotely, with the corporate data analysis staff.
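That first analysis cut (category counts, turned into percentages, compared against a “similar” project) is easy to sketch. This is my own illustration, not IBM’s or Bellcore’s actual tooling; the category names and the 10-point threshold are hypothetical:

```python
# ODC-style first cut: convert per-category defect counts to percentages
# and flag categories that differ from a "similar" baseline project by
# more than a chosen number of percentage points. Categories, counts,
# and the 10-point threshold are all invented for illustration.

def percentages(counts):
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}

def anomalies(project, baseline, threshold=10.0):
    """Categories whose defect percentage differs from the baseline
    project by more than `threshold` percentage points."""
    p, b = percentages(project), percentages(baseline)
    return {cat: (p[cat], b.get(cat, 0.0))
            for cat in p
            if abs(p[cat] - b.get(cat, 0.0)) > threshold}

project_a = {"function": 40, "interface": 35, "timing": 5, "assignment": 20}
project_b = {"function": 42, "interface": 15, "timing": 8, "assignment": 35}

flagged = anomalies(project_a, project_b)
# "interface" and "assignment" stand out; a discussion with the groups
# would then validate (or refute) the finding -- no root cause meetings
# needed for the initial screening.
```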
[Should there be interest in this, IBM, who originated this method, has a web site on the subject, and I can pursue some of the speakers for more details.]
James Robertson on “Managing Customer Requirements”
Robertson is a UK colleague of Lister and DeMarco and tries to focus on the process of eliciting accurate requirements through identifying business “events” and what they mean to the system being proposed. His approach precedes use-case analysis and other formal requirements definition methods. One of the focal points of his approach is to begin with identifying who stands to gain power and lose power because of the new system. Both groups should be stakeholders in the requirements elicitation/definition process and formally recognized as such in carrying out requirements elicitation and analysis. Those who otherwise “gain” or “lose” should also be part of the process — beside those identified through formal life cycle methodology expectations. Robertson also discussed “Fit Criteria” which he identified as a “quantification of a requirement that makes it testable.” Through the Atlantic Systems Guild web site, he offers a requirements elicitation/definition template called Volere which is available free for downloading.
Tim Lister on “Measuring Benefit” (rather than quality or productivity)
Lister suggests that a lot of projects default to schedules as the driver for their work because they have no identifiable (or really strong) benefits to offer. Hence, Time To Market (the schedule-pressure-in-disguise measure for many) becomes the default “measure” of the project’s “value.” Where budget is held up as a key driver, Lister suggests lots of people do the cost/benefit analysis without really bothering to do the latter. [One area where this is done routinely in business is in planning work environments: the cost, measured in various utility and construction savings, is calculated without determining how the resulting workplace will affect worker behavior/productivity. This is a key theme of Lister and DeMarco’s Peopleware.]
ASM Keynote by David Card (S/W Productivity Consortium) on “SPC for Software Engineering”
Card, formerly of Computer Sciences Corporation and the NASA S/W Engineering Lab, is also part of the US TAG Task Group (heads it, in fact) which is developing software measurement program standards. His main point was that, like SPC for manufacturing, SPC for software is about trying to reduce process variation by establishing limits for various process measures, within which management does not intervene in the process, but, outside of which, management looks into what might be happening. Many years ago Deming identified this as important to avoid any “tinkering” by management with projects, i.e., responding too rapidly and at too fine a level of detail, which simply introduces, rather than removes, variation. Formal SPC assumes process stability, which software projects might never achieve, especially if they are working at the edges of technology the way many do. However, SPC techniques have been successfully applied by many non-manufacturing disciplines, including medicine, advertising, education, and utility service provisioning.
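The mechanics Card described are essentially those of an XmR (individuals and moving-range) chart. A minimal sketch, with made-up inspection data; the 2.66 factor is the standard XmR constant:

```python
# XmR control-limit sketch of the idea Card described: derive limits from
# the process's own variation and investigate only points falling outside
# them. The per-inspection defect-find rates below are invented.

def xmr_limits(values):
    """Natural process limits for an individuals (X) chart."""
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

rates = [12, 15, 11, 14, 13, 16, 12, 40, 13, 14]
lcl, center, ucl = xmr_limits(rates)
out_of_limits = [r for r in rates if r < lcl or r > ucl]
# Only the 40 falls outside the limits; management looks into that point
# and otherwise leaves the process alone. (A negative lower limit simply
# means there is no effective lower limit for this data.)
```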
Panel on “The Software Triangle: Technology, Process, People”
This “triangle” is a favorite graphic in much literature and, perhaps, first appeared in some of the Software Engineering Institute’s early Capability Maturity Model work. The discussion/debate between Tom DeMarco, Bob Grady, Steve McMenamin (S. Calif. Edison), and David Card was focused on how valid/useful this “model” of dependencies between s/w development factors was. After 2–3 minutes of opening commentary by each panelist, the floor was opened to the audience for questions. Card suggested it was more of a chain, adding business focus as a fourth link, which pulls a project up the risk/benefit slope and tries to keep it out of the “pit” of chaos. McMenamin suggested “confusion,” more than anything else, kills a project, not technology, process, or people matters. By this, he meant failure to achieve clarity of purpose, decisions, and assignments within the organization. During the course of the discussion, McMenamin suggested some questions to ask about project clarity: What decisions have been made? What decisions have yet to be made? Who will make these decisions? He also stated that project “urgency” should not be measured only as time/schedule pressure, but the importance (Lister’s “value”) of the work, i.e., who _really_ wants to see the work get done and how badly?
SM Keynote by Brian Lawrence on “Tales of An Expert Witness”
This talk was about Lawrence’s experiences in court as an expert witness in lawsuits over software breaches of contract and fraud situations. His role was to inform the court, often paid by one or the other side, of what is considered “common industry practice.” His work has always been as a consultant known to the court and both sides, and all the data he is permitted to see is data that both sides must be able to see (so his own clients often do not show him everything since they would have to show the adversarial side the same information). His main point was that, right now, “malpractice” is not a likely scenario in such cases since there are no widely accepted legal standards for software. Texas has begun to institute certification of software professionals, so this may not be far off, though. Lawrence did say that “negligence” is possible since it means “reasonable standards being ignored.” Lawrence’s opinions in court help establish what is considered “reasonable.” Lawrence is not a lawyer, but a lawyer who works in software situations was present, Cem Kaner, and he supported what Lawrence was saying. He also noted that the Better Business Bureau had recently stated that software had overtaken used cars as the largest source of customer dissatisfaction. One of Kaner’s activities for several years has been to be a consumer advocate in the face of software industry efforts to change the classification under which over the counter software is sold from a product to a leased item, formalizing the shrink-wrap licenses which are now meaningless as a legal standard. However, if the revision to the UCC (Uniform Commercial Code) goes through (voted state by state), it will be very hard for anyone to make a claim against over the counter software for its failure to perform. [I asked Cem about actual over the counter cases since it seemed there were few or no reports of such legal actions.
He said this was true because companies tended to settle out of court once things looked bad for them since the law is against them right now. What keeps most people from pursuing problems is the high cost of litigation against wealthy software firms. He did say that Microsoft and Gateway had been involved in lawsuits and settled out of court.]
David Herron on “Trials and Tribulations of Measurement”
This was really a session about how software (measurement) professionals should always think of themselves as consultants within their own companies and act accordingly, considering who their customers are and how they can best present their work. His presentation talked about: defining your space (what is your area (or areas) of expertise), creating a vision (of what you want to project as a professional), performing effective planning (of your work and what value it will deliver to your customer(s)), listening effectively, communicating your expertise (in a way management can see positively affects their interests), and collaborating effectively (especially with your peers as a team for mutual benefit).
After the formal end of the Conferences, there was a Friday afternoon manager’s “seminar” open to all that discussed making use of the material presented during the week. One of the folks working on the Conference committee (an independent consultant) presented the seminar. I have a feeling it is something that could be even more useful if it had had fewer up front assumptions about the course the discussion would take. But it was designed to suggest the kinds of metrics and data collection activities that would be worthwhile and that could be “sold” to one’s own management as well as to development staff, making the cases for “What’s In It For Me” for both groups. The discussion was relatively calm, but interesting. I felt there should have been a session like this to prepare new folks for the week, priming them to ask questions of speakers and tutorial presenters. I was able to bring the discussion around to a couple of my measurement hot buttons: leaving developers alone as much as possible and using the data you already collect. Both ideas seemed to resonate with many of the managers present who had hit walls trying to do developer-based metrics and “invent” new data collection efforts when they already had schedule/milestone, defect tracking, and test effectiveness data. These three can be the basis for significant initial measurement success, no developer has to do anything new or different, and it avoids the LOC versus Function Points battle.
That about covers the conference. I have the Proceedings and would be glad to talk to anyone in further depth about any sessions as well as pursue speakers for more information and their web site addresses (many of which I know are buried in the Proceedings material).
McConnell (firstname.lastname@example.org) talk: How much are problems due to ignorance of problems and history versus belief that “we can do it (better) (than they did (could))”?
Panel on Software “Triangle” (people, technology, process)
As one of the participants, David Card offered this diagram:
McMenamin offered this diagram:
And said one should ask:
· What decisions have been made?
· What decisions have to be made?
· Who will make these decisions?
Noting that “urgency” is not just time/schedule pressure.
Ask developers how much of their time is “wasted” by rework/other’s errors/meetings/etc.
Idea for future conferences (not just this one)
How about a “What can I get out of this week?” session at the start of the program rather than a “What should I have gotten out of the week?” session at the end as Robin Goldfarb did?
Each speaker (keynote or track) should state, in one sentence, what an attendee ought to get out of their session (in the abstract, or at least at the outset of the session). Then the presentation should make that point (or points).