Excerpted from Systems Architecting, Creating and Building Complex Systems, Eberhardt Rechtin, Prentice Hall, 1991.
ARCHITECTURAL RESPONSE I: ULTRAQUALITY WATERFALLS
Process architecture is usually thought of in the context of a manufacturing system - machines, layout, procedures, product mix, flexibility, and the like. But, like any system, manufacturing is a part of a still larger system, the waterfall. If an ultraquality product is to be produced, not only the manufacturing step but the waterfall as a whole must be of ultraquality, excellent beyond measure.
A study of present-day waterfalls would soon reveal each element being carried out by a different group within its own culture striving for individualized objectives. Each group would have its own perception of the waterfall and its position in it. Each would have its own language and its own computer aids (computer-aided design, computer-aided manufacturing, etc.). Computer communications between groups would be rare and inefficient.
If ultraquality is taken to mean a near-zero error rate, most waterfalls could not qualify. There are too many possibilities for error.
The first step to ultraquality waterfalls is the same as the one already given for manufacturing - perfection at every step. The client and architect must be sure that there are no misunderstandings of system objectives nor of the rationale for trade-offs already made. Research and development must search out possible failure modes through careful experiment and analysis. Engineering should progressively redesign the product and process, designing out single-point failure modes, simplifying the product for excellent manufacture. Each element in the waterfall must pass along as perfect a product—as complete and free of discrepancies and potential failures—as possible.
In several major government agencies, systems are conceived and justified at a very high level. Analyses are done that compare alternate approaches, some of which are relatively close to each other in benefits and costs. A high-level consensus is reached and, in due course, the system is approved, perhaps with modification. Instructions are then sent to subordinate levels, but without including the prior analyses or the reasoning behind the modifications and trade-offs. The subordinate organization reinterprets the instructions to its subordinates in its own "language," again without explanation. Requests for proposals are sent out to contractors, stating but not prioritizing the specifications. Contractors respond within the specifications, no doubt wondering why, given the realities of the manufacturing world, a somewhat different approach was not requested. The process is vulnerable to conflicting objectives, easy misunderstanding, stifled feedback, and ill-founded trade-offs.
Prototyping, as discussed earlier, is usually carried out under relaxed rules, as it should be. Unfortunately, the transitions to production and later operation are notorious for cost and schedule overrun. The developers too often assume their responsibilities end with successful demonstrations. Going back, reviewing, and documenting the appropriate difficulties and solutions so that others further down the waterfall are as informed as possible—much less suggesting how a redesign should be done—is rare.
One of the most common failings of research and development is the passing along of immature technology to manufacturing. It is the greatest single cause of overruns in space systems, integrated circuits, and other high-tech products.
A new rocket engine was needed in order to launch larger and much more expensive payloads to orbit. Reliability requirements were correspondingly much higher than for the earlier engines. The contract was undertaken by a firm that had built successful engines for earlier, less demanding missions. But the development phase was not upgraded accordingly. Funding was short and false confidence in earlier successes was high. A major developmental omission, therefore, was a test program to determine the thermal characteristics of the new engine. The design instead was certified "by similarity" with other designs. When an engine later failed during flight, the cause was inexplicable. Eighteen months of investigation followed in an effort to find the true cause, excluding all others. The cause was a flawed thermal-barrier design unique to that engine. Had the engine been thermally characterized in the beginning, the problem would have been identified and solved, and an 18-month delay of, and risk to, several hundred million dollars of spacecraft could have been avoided. The high-quality objective of the system had not been matched by high-quality development.
The second step, an architect's speciality, is to improve the interfaces between the elements. System-critical changes in one element need to be communicated quickly and accurately to all other elements so that the consequences throughout the system can be determined and accommodated or debated in a timely fashion.
One of the promising approaches to interface improvement is to integrate the computer aids that each waterfall group has developed. At present, the separate computer aids are generally incompatible. There are few common databases. Software and hardware are different. Entry, control, and updating are major problems. One possibility, involving internal changes in the elements, is to integrate all the aids into a single master aid. But success in this has proven to be difficult. Everyone in each field would have to be reeducated to a new language; configuration control would be very difficult. Another possibility is to use computer aids specifically designed to bridge between the separate aids: a transfer model or an interpreter analogous to instant language interpretation in a multilingual conference.
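The second possibility, a transfer model that interprets between otherwise separate aids, can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the TransferModel class, the "design" and "manufacturing" aids, the field names, and the shared neutral record are assumptions, not anything specified in the text. The sketch assumes each aid supplies one converter into a shared neutral form and one converter out of it, so that no aid has to learn another aid's language.

```python
from typing import Callable, Dict

Record = Dict[str, object]
Converter = Callable[[Record], Record]


class TransferModel:
    """Hypothetical interpreter that bridges separate computer aids.

    Each aid keeps its own record format and registers two converters:
    one into a shared neutral form and one back out of it.
    """

    def __init__(self) -> None:
        self._to_neutral: Dict[str, Converter] = {}
        self._from_neutral: Dict[str, Converter] = {}

    def register(self, aid: str, to_neutral: Converter,
                 from_neutral: Converter) -> None:
        """Add an aid to the bridge without changing the aid itself."""
        self._to_neutral[aid] = to_neutral
        self._from_neutral[aid] = from_neutral

    def translate(self, source: str, target: str, record: Record) -> Record:
        """Convert a record from the source aid's format to the target's."""
        if source not in self._to_neutral:
            raise KeyError(f"unknown source aid: {source}")
        if target not in self._from_neutral:
            raise KeyError(f"unknown target aid: {target}")
        neutral = self._to_neutral[source](record)
        return self._from_neutral[target](neutral)


MM_PER_INCH = 25.4

model = TransferModel()

# Assumed design aid: names parts "part" and measures length in inches.
model.register(
    "design",
    to_neutral=lambda r: {"name": r["part"],
                          "length_mm": r["length_in"] * MM_PER_INCH},
    from_neutral=lambda n: {"part": n["name"],
                            "length_in": n["length_mm"] / MM_PER_INCH},
)

# Assumed manufacturing aid: names parts "item" and measures in millimetres.
model.register(
    "manufacturing",
    to_neutral=lambda r: {"name": r["item"], "length_mm": r["length_mm"]},
    from_neutral=lambda n: {"item": n["name"], "length_mm": n["length_mm"]},
)

design_record = {"part": "bracket", "length_in": 2.0}
shop_record = model.translate("design", "manufacturing", design_record)
print(shop_record)
```

Under these assumptions, adding a further aid requires only its own pair of converters, whereas direct aid-to-aid translation would require a new translator for every existing aid in each direction. The hard part, agreeing on what the neutral record must contain, is not something the code settles; it depends on the groups reaching agreement among themselves.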
Attempts have been made along these lines. Progress has been slow and frustrating. But the cause is not the lack of separate computer aids. It is the present lack of integration of the waterfall steps into end-to-end process architectures. Computer integration requires mutually agreed upon answers to questions that had been treated separately in the different groups. The waterfall itself may have to be redesigned several times over in an effort to eliminate intergroup errors. For example:
What documents and data that flow between groups are most beneficial to the parties concerned? What information is critical? What information adds value to the product and which does not? How are differences resolved and documented?
Who should be responsible for keeping the waterfall up to date? Is information entered and displayed at the right locations? Does the same information have to be entered more than once? Is information entry a bottleneck on the production line?
What is the most efficient top-level computer language for the whole waterfall?
The principal benefits of a computer-integrated waterfall may be as much in the integration process as with the end result.