Published in J. System Safety, 36:1, First Quarter 2000.
SAFETY STATISTICS = BUFFALO CHIPS
By Ira J. Rimson, P.E.
Statistically, if an effect (dependent variable) results every time that a precursor cause (independent variable) occurs, we may infer that the probability of that confluence is 1.0, and be 100% confident that it will happen. Confidence limits of statistical analyses depend on the number of cases available to test the likelihood of the normalized cause initiating the effect. [Being natural skeptics we allow for chance non-happenings and rarely commit to either probability or confidence level greater than 0.9999e (n).]
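How heavily confidence depends on the number of cases can be put in numbers. The sketch below is an illustration, not part of the original argument: it assumes independent, identically distributed cases (the very assumption questioned in the next paragraph) and uses the standard exact one-sided binomial bound for the situation where the effect followed the cause in every one of n cases. The function name and the 95% default are choices made for this example.

```python
def lower_confidence_bound(n_cases: int, confidence: float = 0.95) -> float:
    """One-sided lower bound on the probability p of the effect, given that
    the effect followed the cause in all n_cases observed cases.

    With n successes in n independent trials, the exact (Clopper-Pearson)
    lower bound solves p**n = 1 - confidence, so p = (1 - confidence)**(1/n).
    Assumption for illustration: the cases are independent and identical.
    """
    if n_cases < 1:
        raise ValueError("need at least one observed case")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n_cases)


if __name__ == "__main__":
    # Few cases support only a weak claim; the bound creeps toward 1.0
    # as cases accumulate but never reaches it.
    for n in (10, 100, 1000, 10000):
        print(f"{n:>6} cases: p >= {lower_confidence_bound(n):.5f}")
```

Even a perfect record never earns a bound of 1.0, and a handful of cases earns very little, which is one reason a skeptic declines to commit to certainty.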
Two problems arise immediately when we attempt to apply statistics to "safety" data derived from mishap reports: (1) the mishaps which generate the data are, statistically, "rare events"; they do not occur with sufficient frequency to generate a reliable population of happenings on which to hang a confident hat; and (2) in most cases it is impossible to assure equivalence of the events' independent variables from the data generated by investigators, with the result that we have no assurance that any two independent variables are, in fact, identical. (More on this later.)
If we assume that "safety" is a condition characterized by absence (or complete control) of risk, and "accident" is a condition characterized by consummation of risk, then "safety" statistics should be the inverse of "accident" statistics. Likewise, if "safety" equates to successful, and "accident" to unsuccessful risk avoidance, then the risk itself may be considered to be the independent variable. Avoiding the statistical [apple=orange] trap demands that we assure meticulously that the risks we compare are truly comparable.
For example, a commonly accepted comparator statistic among aircraft models is "accident rate". Let's take the USAF's F-117 stealth fighter and its T-3A trainer as examples for comparing "accident" (or "safety" or "mishap") rates. The F-117 is a high-performance fighter operated by a solo crew, in long-range air-refueled over-water missions up to 14 hours long, deployment flights, weapons delivery and combat. The T-3A is used solely for "screening" ab initio flight students in dual flights (i.e., with an instructor) of 1-to-2 hours' duration. Is it possible to compare the risks encountered in the airplanes' operations? I think not. Whatever denominator you choose in attempting to normalize is merely an artifice which cannot withstand logical scrutiny. Rates cannot be compared fairly without weighing comparative risks. Operational differences will subvert attempted statistical comparisons among mismatched variables.
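The arbitrariness of the denominator is easy to demonstrate with arithmetic. The following sketch uses entirely hypothetical fleets and invented counts (they are not F-117 or T-3A data, and the names are made up for this example) to show that the same mishap records can rank two fleets in opposite order depending on whether the rate is computed per flight hour or per sortie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetRecord:
    """Hypothetical exposure data for one fleet (all numbers invented)."""
    name: str
    mishaps: int
    flight_hours: float
    sorties: int


def rate_per_100k(mishaps: int, exposure: float) -> float:
    """Mishaps per 100,000 units of whatever exposure measure is chosen."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return mishaps / exposure * 100_000


# Assumed figures for illustration only: one fleet flies long missions,
# the other flies many short ones.
long_mission = FleetRecord("long-mission fleet", mishaps=2,
                           flight_hours=40_000, sorties=8_000)
short_mission = FleetRecord("short-mission fleet", mishaps=3,
                            flight_hours=30_000, sorties=20_000)

if __name__ == "__main__":
    for fleet in (long_mission, short_mission):
        per_hour = rate_per_100k(fleet.mishaps, fleet.flight_hours)
        per_sortie = rate_per_100k(fleet.mishaps, fleet.sorties)
        print(f"{fleet.name}: {per_hour:.1f} per 100k hours, "
              f"{per_sortie:.1f} per 100k sorties")
```

With these invented numbers the long-mission fleet looks twice as "safe" per flight hour and markedly less "safe" per sortie. Neither figure says anything about the risks actually encountered on either kind of mission.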
Statistical manipulators have tried many measures in attempting to quantify risk comparability. The Flight Safety Foundation posits that the most mathematically valid normalizing category in aviation is "takeoffs" (equating to flight segments), which supposedly optimizes the variability of leg length, flight time, passenger-miles, etc. Nonetheless, I must suspend disbelief to accept that the risks which might be encountered during four segments between LAX and JFK, anytime, are identical to those associated with a four-segment commuter flight; say, Denver-to-Vail-to-Crested Butte-to-Rock Springs-to-Cheyenne, at night, in January.
Vested interests labor mightily to generate statistical safety data and sell the public their marginally valid results. Using data from reports of events that have already occurred, they generate predictions of what will happen, when and where, to create a perception of prevention. Their predictions don't work for several reasons. First, current investigation methodologies rarely establish all the interactions which comprise What Happened during mishaps, with sufficient rigor to identify specific behaviors which must be modified to effect prevention. Second, investigators frequently collect their "facts" from informants several levels abstracted from primary data sources, thus encouraging filtration and perception inaccuracies. Third, investigators and analysts alike are seduced by notions of probabilities rather than treating actual occurrences deterministically; i.e., once an outcome occurs, the probability of its having happened is unity, and the probability of anything else having happened is zero. The probabilities associated with the interactions which occurred during its evolution are also binary. Whether the a priori probability was calculated to be 1x10^-2 or 1x10^-9 is irrelevant to the evolutionary process. Once it has happened, P=1 for each interaction required to produce the outcome.
Finally, all mishaps occur as a result of human behavior. As Dr. Bob Besco puts it: "Some humans are caused by accidents, but all accidents are caused by humans." Inanimate objects cannot initiate circumstantial mistakes. Human behavior interposes a statistical hurdle incapable of supervention: infinite variance. Structural failure may result from underdesign (deficient human designer judgment), overstress (deficient human operator judgment), or inadequate maintenance (deficient human support judgment). Helmreich et al. have attempted to overcome human variability among flight crews' performance with personality-oriented Crew Resource Management (CRM) initiatives. They are really attempting to overcome human variance by training all crewmembers to hew to a common paradigm. It might be easier to use robots as crewmembers, but I suspect that both the public and the pilots' union would object.
We can only reduce mishaps by changing the human behavior which contributes to their genesis. Ludi Benner and I suggested some years ago that we change the current outcome-dependent definitions of "accident" and "incident" to account for the complete generational process. We must acknowledge that a mishap is the outcome of an unplanned interactional process which leads to an undesired outcome, before we can identify those points within the process where intervention could have effected prevention. Categorizing by outcome (degree of damage or injury) deliberately ignores the many interactions at which different behavior might have avoided the mishap. What happened to cause departure from the intended plan establishes the base line for subsequent interactions which led either to a successful unplanned outcome (safe landing with one engine out) or an unsuccessful one (crash with the same engine out). Competent investigations and analyses identify the human participants' successful and unsuccessful interactions within the process, and treat each occurrence as unique until points of similarity with other mishaps are firmly substantiated.
System Safety was built on this fundament: if a mishap can occur, it will. It might be a while awaiting the proper confluence of stars and planets, but eventually humans will make it so. Conversely, if it can't happen, it won't. Therein lies the goal of prevention: make "can't happen" the reality. Rigorous investigation and analysis can reveal the deficiencies of our attempts at prevention. Statistics, on the other hand, contribute neither knowledge nor understanding.
But they sure can mess up your boots.