ASQ Software Division 12th ICSQ (2002)

Scott Duncan
Mar 5, 2022

--

I used to attend Agile and quality-related conferences years ago and have saved notes from many of them. While some topics have been overtaken by events since then, there still seem to be some useful ideas that I’d like to pass along.

DeMarco Keynote, “Secrets of the Agile Organization”

Tom DeMarco delivered the Conference’s opening Keynote address which, despite its title, was about allowing slack time rather than striving for 100% productivity, i.e., everyone busy 100% of the time, especially on projects not worth spending any time on at all. He defined “slack” as “the degree of freedom (in time and budget and manpower and space, etc.) necessary to make change possible.” DeMarco advocated slack, in particular, to allow people time (and resources) for professional growth, which he feels is a primary way to avoid losing talented people and to position an organization for rapid change. An organization that cannot change quickly is characterized by over-adaptation to the status quo (i.e., being very good at doing things one way), fear of change (i.e., because people fear being expendable), and extreme busyness (i.e., being resistant to change and unable to change because everyone is “so busy”). Indeed, he called “busy-ness” a “pathology” largely due to an inability to appropriately prioritize work. (Later, Tim Lister, a partner of DeMarco’s in the Atlantic Systems Guild, would talk about needing a process to determine what projects are worth doing as well as one to do projects.)

DeMarco felt that the problem organizations, especially software development ones, have with prioritization is that it usually means some loss of control. That is, if they don’t do almost all projects for all customers, they fear losing that work to other sources and, hence, losing some level of control over those customers. But DeMarco claims that, if you want to get fast, you have to do less, i.e., allow for slack so you can respond to new, and more critical, work. (Of course, one can take on everything, prioritize, and then keep pushing projects to the bottom of the pile, which does retain the control, but at the cost of the organization’s credibility as customers grow dissatisfied with promises (implied or direct) made and not met.) The process side of “slack” and the ability to react to change comes in the new, “agile” methodologies focused on skill-building for the staff and flexibility in using processes when and where needed, i.e., a tailorable methodology which provides guidance and opportunities for projects to select from a set of processes to form workflows that match project size, complexity, etc. (DeMarco, however, pointed out that this meant processes that were easy to understand and apply, yet still defined and documented, though not with a “Victorian novel”; he advocated instead a 1-page-per-process approach.)

DeMarco offered an example of adapting process (and having, in fact, a process for removing processes) using code inspections. Since he feels the more significant problems a project has come from inadequate requirements and design errors, DeMarco suggested eliminating code inspections in favor of design reviews. (A speaker later in the program suggested eliminating unit testing instead, since peer review seemed a more effective approach to removing defects than unit testing (at least the way unit testing is done, or not done, in many organizations).) DeMarco noted, however, that the problem organizations have moving to design reviews and away from code inspections is that they don’t do a very deliberate job of creating designs until after the coding is done. What this does, usually in the rush to meet a project schedule, is ask programmers to code from requirements, which is almost always a mistake since they then, individually, make up their own designs, often differently for the same things. Then one runs into interface problems between parts of the system that must be corrected through rework, sometimes after being caught by the customer.

Having finished the topic of making change faster, DeMarco returned to the theme of losing power to discuss why projects are harder today than they used to be. His first point is, obviously, that all the easy stuff has already been done (though it keeps getting redone because it seems easy, even though a “buy not build” approach would be even easier if an organization had a good requirements and prioritization process in place). DeMarco’s point about how loss of power affects projects was that all automated systems, once put into operation, result in someone losing and someone else gaining power. Hence, during the development of such systems, conflict resolution becomes a key talent. Large organizations, almost by definition, have conflict built into them. [It has been said by others that women often make better project managers than men because they are used to raising teenage boys and that’s a lot of what managing projects is like!]

Finally, DeMarco discussed the impact of demographics on organizations and their human resources. The IT industry has been used to getting its supply of new people over the last two decades from three sources: the baby boom, women coming into the workforce, and increased educational levels of the post-WWII generation. All three have begun to dry up as

· the baby boomers are now over 50 years of age and behind them is a trough of fewer 30–40 year olds,

· the new (20–30 year old) workers have been raised in an era of rapid change, short-term goals, lowered formal educational expectations, etc.,

· the IT field opened up early to women and women are looking elsewhere now,

· college students, in general, are not looking to technical/engineering careers and look more, for example, to careers in other professions (e.g., law),

· many companies are having to look for IT help from abroad where technology growth is occurring as it did many years ago in the USA,

· many companies will also have to look, not at 4-yr or even 2-yr colleges, but at technical and trade schools for new IT workers.

DeMarco claims that all of these trends suggest a higher organizational investment in people’s training since they will not be coming into the workforce with the skills needed. However, that will mean companies will need to figure out ways to retain the workers once they are trained. With a tight employment economy, turnover figures may look good for many companies now. However, DeMarco stated that, when the economy picks up, skilled and knowledgeable people will go elsewhere. He called this “latent loss,” i.e., people ready to go as soon as the chance presents itself. (Many such people may not be doing an organization a lot of good even while they are there, especially if they see their careers largely in their own hands and companies not willing to invest in them.)

Abran Keynote, “Emerging Consensus on the SW Engineering Body of Knowledge”

Alain Abran is a Director of the School of Advanced Technology at the University of Quebec and also holds the (international) Secretariat position in the ISO/IEC JTC1/SC7 standards organization. In the former role, he has been actively involved with the IEEE and ACM in development of the Software Engineering Body of Knowledge (www.swebok.org), which the IEEE has used as the basis for its new Certified Software Development Professional (CSDP) program. The SWEBOK effort, in which I was involved as a reviewer for the early editions of the document, is now in a period of trial application by various colleges and universities throughout the world and has been presented to the international standards community as a guidance document for development of a true software engineering profession. It is expected that, by 2004, a final version of the SWEBOK will exist.

While much of Abran’s talk (given to close the first day) described the history of the SWEBOK, the SWEBOK structure, and trial adoption by various educational institutions, another part discussed the characteristics of a profession. This latter point is something organizations like the IEEE are attempting to promote and some companies are pursuing using existing professional certifications (like IEEE’s CSDP and ASQ’s CSQE). Abran noted that a profession is characterized by the existence of:

· a professional society (e.g., IEEE, ACM, PMI, ASQ),

· a code of ethics (e.g., IEEE/ACM have one as does the ASQ),

· accreditation of educational programs,

· recognized periods of skill development (i.e., experience working in the field), and

· certification and/or licensing of professionals.

One is recognized as having full professional status, then, when one has been through, subscribed to, and been recognized as a part of all of the above steps.

While Abran did not state this, it is also recognized that not all practitioners in certain fields need to have full, formal professional status. There are craft positions associated with construction, for example, that do not require professional engineering status. Hence, not all people who, for example, design, develop, and test software would have to become licensed “engineers” in software or software quality any more than everyone working in the medical field is a licensed physician or specialist. However, certain certification and licensing is done at other levels within other domains and software would, logically, follow the same pattern should professional certification and/or licensing come to the software product and service domain.

Musa Keynote, “Software Reliability Engineering”

John Musa spent most of his career at AT&T Bell Labs and I first heard him speak during the early years after the 1984 divestiture of the Bell System. He was the first person I recall having made the distinction, at least in software, between a fault and a failure: the former being an individual defect in a system which causes the system to fail; the latter being an instance of departure in system behavior from user needs. The important distinction is that a fault can cause many failures to occur until the fault is repaired. Hence, a system may fail many times though the source of failure is but one defect.

Historically, software measures have focused on defects and used defect density (i.e., number of defects per some unit of code size) as a quality metric. Even ignoring the arguments over what size measure is preferred, the fact is that quality, if measured through defect density, can appear to be improved even when it has not or, worse, even when it has deteriorated. Simply increasing the delivered size of a software system can make the defect density decrease whether fewer, the same, or even more defects exist in what is delivered. What the customer may be seeing is an increase in failures while the defect density is dropping. Musa has been instrumental in addressing the issue of software reliability and basing it on failures per unit of system operational time or some “natural” unit related to system processing (e.g., pages of output, transactions, jobs/jobsteps run, queries, phone calls, etc.). System availability, though related to reliability, is the average proportion of time (over some period) that a system or system capability is functional.
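To make the defect density point concrete, here is a small sketch of my own (not from Musa's talk). The two releases and all of their numbers are invented purely for illustration, and the function names are hypothetical. It shows defect density improving between releases while failures per operating hour get worse.

```python
def defect_density(defects: int, kloc: float) -> float:
    """Defects per thousand lines of code (KLOC)."""
    return defects / kloc


def failure_intensity(failures: int, operating_hours: float) -> float:
    """Failures per hour of system operation (a reliability-style measure)."""
    return failures / operating_hours


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Average fraction of time the system was functional."""
    return uptime_hours / (uptime_hours + downtime_hours)


# Hypothetical data: release 2 doubles in size and has MORE defects,
# yet its defect density looks better than release 1.
releases = {
    "release 1": {"defects": 50, "kloc": 100.0, "failures": 20, "hours": 1000.0},
    "release 2": {"defects": 60, "kloc": 200.0, "failures": 45, "hours": 1000.0},
}

for name, r in releases.items():
    density = defect_density(r["defects"], r["kloc"])
    intensity = failure_intensity(r["failures"], r["hours"])
    print(f"{name}: {density:.2f} defects/KLOC, {intensity:.3f} failures/hour")
```

With these made-up numbers the density falls from 0.50 to 0.30 defects/KLOC while the failure intensity more than doubles, which is the customer's view of quality.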

Musa, as the opening keynote speaker for the last day of the Conference, noted that determining how best to test system reliability and availability requires creating some form of operational profile within which testing can be performed. Without such profiling, one must assume all scenarios of operation are equally important and equally likely to produce failures critical to reliability and availability. Since most users execute a large part of the functionality of many systems infrequently, conducting effective software reliability engineering requires understanding what functionality in a system is most critical to achieving high availability.

Musa defined an operational profile as the “complete set of operations whose probabilities of occurrence add up to 100%.” That is, you describe all the operations to be performed in the system and determine each one’s probability of occurrence during system use. When you have done this, the total of those probabilities is 100%. This would be true since the list of operations, presumably, describes everything that can occur in the system (at some currently acceptable level of detail) and how frequently each occurs. As you add operations, the profile will change, but the probabilities should always add up to 100%. Musa acknowledged that developing such profiles can be expensive and “may be impractical for small components (involving perhaps less than 2 staff months of effort), unless used in a large number of products.” Musa also noted that new features and operations, of course, would have no field data to base a profile upon. However, this is where business and technical staff need to work together to estimate profile information using the best understanding of user needs available. This approach, until better usage data is available, will always be preferable to making no profile assumptions and, hence, having no formal way to judge how best to apply test resources to new system features relative to the system as a whole.
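As a rough illustration of the idea (my own sketch, not something Musa presented), an operational profile can be held as a mapping from operation to probability, checked to sum to 100%, and then used to spread a test budget in proportion to usage. The operation names, the probabilities, and the proportional allocation rule are all assumptions made for the example.

```python
import math


def validate_profile(profile: dict[str, float]) -> None:
    """Raise if the operation probabilities do not add up to 1.0 (100%)."""
    total = sum(profile.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"profile sums to {total:.4f}, expected 1.0")


def allocate_tests(profile: dict[str, float], total_tests: int) -> dict[str, int]:
    """Split total_tests across operations in proportion to their probability.

    Uses largest-remainder rounding so the counts add up to total_tests.
    """
    validate_profile(profile)
    raw = {op: p * total_tests for op, p in profile.items()}
    counts = {op: math.floor(x) for op, x in raw.items()}
    leftover = total_tests - sum(counts.values())
    # Give any remaining tests to the operations with the largest fractions.
    by_fraction = sorted(raw, key=lambda op: raw[op] - counts[op], reverse=True)
    for op in by_fraction[:leftover]:
        counts[op] += 1
    return counts


# Hypothetical profile for an order-handling system.
profile = {
    "process_order": 0.60,
    "query_status": 0.30,
    "update_account": 0.08,
    "generate_report": 0.02,
}

plan = allocate_tests(profile, total_tests=500)
print(plan)
```

Pure proportional allocation is only one possible policy. A rarely used but critical operation might deserve more tests than its probability alone suggests, which is where the business and technical judgment mentioned above comes in.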

What Musa believes profiles give you is a better way to provide “Just Right Reliability,” i.e., enough to meet major user expectations and do so within cost and schedule constraints. In closing, Musa noted a free Windows-based product for performing software reliability profiling called CASRE.

Lister Keynote, “Risk Management and Value Assessment”

Tim Lister closed the Conference by speaking on the general subject of determining what the “right” projects to do are and how to manage them to ensure, to the greatest degree possible, on-time and on-budget performance by employing a risk and value based approach. He opened the talk, however, by noting that most software industry data shows how “nothing goes according to plan” and that we, therefore, have “no right to be surprised when it does not.” (I was reminded of Claude Rains in Casablanca being “shocked, shocked to find gambling going on.”) What Lister says he has seen happening over the years is an inability to terminate “totally bankrupt” projects, i.e., ones with no results, wasted resources, and continual eating of resources until someone from the outside kills them.

Lister defined risk as “any variable on a project that, within its normal distribution of possible values, could take on a value that is detrimental, even fatal, to your project,” i.e., “a potential problem.” However, we should not be dismayed about this because “all projects with benefit but no risk were completed long ago” and that “avoiding all risk usually lowers the value of a product.” The most common risk, Lister said, was that “we may not have enough time to build the entire product,” i.e., deliver everything the customer wants in the time frame given. And how do we get into this situation? Lister calls it “The Core Release Comedy” since not all parts of a system are of equal value, yet everything is a #1 priority to somebody. Lister suggests we “build so that when time is up we can say … we built as much of the most important stuff as we could.” [Otherwise, we end up doing things, not with urgency because they are important, but in a rush because they are late.]

Lister’s “Risk Ritual” consists of:

· Identifying risks (using some industry information to develop a guidance document and being willing to “talk about really bad things,” i.e., “institutionalize the Devil’s Advocate,” risk management “behind closed doors between consenting adults”),

· Assessing risk exposure (the probability of a risk becoming a problem and the cost/effort impact if it does; if multiple people give very different percentages, they may not really be looking at the same risk),

· Determining which risks to manage (especially by building a resource contingency into the project plan capable of handling 50% of the cost of all risks occurring),

· Forming action plans for direct risks (sometimes by finding someone who can handle the risk at lower cost or probability because they can apply higher skill dealing with the situation),

· Forming mitigation plans for indirect risks (by putting “trip-wires” into the project plan, i.e., if by certain milestones not all designs are done, then certain actions are taken),

· Iterating risk consideration throughout the project (since there is “no reason to believe that you can identify all risk in one go”).
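The exposure and contingency steps above can be sketched with a little arithmetic. This is my own illustration, not Lister's material: the risks and figures are invented, the names are hypothetical, and reading the “50% of the cost of all risks occurring” guidance as half of the summed impact costs is an assumption on my part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # chance (0..1) that the risk becomes a problem
    impact_cost: float  # cost if it does become a problem

    @property
    def exposure(self) -> float:
        """Risk exposure: probability times impact."""
        return self.probability * self.impact_cost


def rank_by_exposure(risks: list[Risk]) -> list[Risk]:
    """Order risks from highest to lowest exposure to decide which to manage."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


def contingency(risks: list[Risk], fraction: float = 0.5) -> float:
    """Assumed reading: reserve a fraction of the cost of ALL risks occurring."""
    return fraction * sum(r.impact_cost for r in risks)


# Hypothetical risk list for a project.
risks = [
    Risk("key designer leaves", probability=0.25, impact_cost=80_000),
    Risk("requirements churn", probability=0.5, impact_cost=60_000),
    Risk("vendor library late", probability=0.125, impact_cost=40_000),
]

for r in rank_by_exposure(risks):
    print(f"{r.name}: exposure {r.exposure:,.0f}")
print(f"contingency reserve: {contingency(risks):,.0f}")
```

A different reading of the guidance (for example, reserving the sum of the exposures instead) would give a smaller number, so whichever rule a team uses is worth stating explicitly in the project plan.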

Lister closed saying that “life is too short to be on projects that don’t matter.”

Other Talks

During those times when I was not delivering one of my own talks (or at the plenary keynote presentations), I attended other talks that addressed:

§ “The Application of SPICE to Quality Management Systems,” a talk on using the ISO 15504 Technical Report version to conduct ISO 9001 audits and the improved information about process effectiveness such an approach has been shown to provide,

§ Overview presentations on the CMMI (Capability Maturity Model Integration from the SEI) and its continuous (ISO 15504-like) architecture,

§ “Certification: A Win-Win Investment for Employees and Employers,” on the value of CSQE and related professional certification programs.

I also have the handouts from a few other presentations that I could not attend, but which seemed to have potential interest:

“Software Quality in Web Application Transactions” (addressing use-case based testing),

“Software Defect Prediction Techniques” (using various established estimation and quality attribute models),

“Requirements Management: A Quality-Centric Approach” (addressing elicitation, change management, and traceability),

“An Integrated Graphical Assessment for Managing Software Product Quality” (using standard Quality Function Deployment techniques),

“I’ve Been Asked to Review this Specification: Now What Do I Do?” (on how to approach doing (non-code) deliverable reviews),

“Demystifying Networking — Communication Skills for Project Success” (a double-session run as a ‘seminar’).
