The Investigation Process Research Resource Site
A Pro Bono site with hundreds of resources for Investigation Investigators

INVESTIGATING INVESTIGATIONS: to advance the state of the art of investigations through investigation process research.



GUIDELINES FOR INVESTIGATING CHEMICAL PROCESS INCIDENTS
Center for Chemical Process Safety, American Institute of Chemical Engineers, 345 East 47th Street, New York, NY, 1992
Review prepared by Ludwig Benner, 11/93 (edited 11/96)
The book was revised in 1999; a review of the revised edition has not yet been scheduled.




Book Review

Overview of Guidelines

This publication holds itself out to present "techniques for investigating incidents of a serious nature, whether they result in accidents or not, whether they have in-plant or off-plant consequences, or whether they are characterized by actual or potential loss of life and/or property or damage to the environment. Guidance is also presented for the initial establishment of an investigation team and establishment and evaluation of a management system for incident investigation. Lastly, an annotated bibliography is included for safety professionals who may wish to refer to the many books available on incident investigation."

The book says the major components in the investigation of an incident are

  • Identify the root causes
  • Determine recommendations necessary to prevent a recurrence
  • Ensure that action is taken on those recommendations


"This book is based on the new approaches and covers the state-of-the-art technology and practices in use for thoroughly investigating an incident." (p xii)

The stated objective of the guidebook is to provide a technical foundation for systematic investigations in order to prevent recurrences of incidents. Investigations take place within the context of a Process Safety Management system, providing the "feedback loop" in that system. (p1)

The stated objective of incident investigation is to prevent a recurrence. This is to be accomplished by establishing a management system that identifies and evaluates causes (root causes and contributing causes), identifies and evaluates recommended preventive measures that act to reduce the probability and/or consequence, and ensures effective follow-up action to complete and/or review all recommendations. (p1)

The basic framework for a process safety incident investigation (PSII) is claimed to be represented by a well-defined basic incident terminology, a recognized incident theory, and an incident classification scheme. Chapters include an Introduction, Basic Investigation Techniques, Investigating Process Safety Incidents, Practical Investigation Considerations: Gathering Evidence, Multiple Cause Determination, Recommendations and Follow-through, Formal Reports and Communications Issues, and Development and Implementation. Appendices present examples.

Comments about contents


Strengths
The book provides many useful guidelines for the establishment of investigation capabilities, techniques available for investigations, recommendation practices and many literature citations. It has many strengths, including its

  • recognition of investigation planning needs
  • framework for fitting process incident investigations into a safety management system
  • recognition of the complexity of incidents and their investigation
  • recognition of levels of investigations
  • awareness of investigation methodology options
  • investigation staffing discussions
  • discussion of facility restart needs for investigators
  • calls for the review of prior hazard analyses during investigations

These elements are often lacking in other investigation publications.

Shortcomings
Unfortunately, the Guidelines perpetuate traditional investigation program shortcomings, and add a few of their own, such as

  1. muddled incident theory
  2. misdirected investigation objectives
  3. flawed terminology
  4. reliance on subjective judgment calls
  5. inappropriate use of techniques
  6. investigation task omissions
  7. legal-oriented investigation report framework

Each creates significant problems if the Guidelines are followed. These shortcomings matter because, while the Guidelines may result in better investigations than at present, the potential value and effectiveness of investigations within organizations will not be achieved efficiently, consistently and verifiably by following this guidance.

The cliché highlighted on p 69, "process safety incidents are the result of management system failures," will not exactly make managers eager to jump through hoops to support the recommended investigation program, particularly with the other weaknesses described below. As investigators intensify their search for management system failures, managers would be well advised to scrutinize investigation system failures - as indicated by continuing incidents and accidents - equally thoroughly.

Muddled incident theory


The incident theory discussion is confused. For example, pages 14-20 present several theories, including an anatomy of the process-related incident represented as a branched, sequenced chain-of-events model with arbitrarily designated phases. The basis, purpose, relationships or value of the incident phases in the anatomy are not shown, and interpreting them requires more judgment calls. The model does not provide a basis to help determine the scope of what should be investigated. Elsewhere (page 16+) theories of accident causation are presented. The Guidelines claim that systems theory and multiple-cause determination are the most widely accepted and adopted incident theory, without substantiation except for a 25-year-old paper.

A summary of incident theories (page 17) confuses (a) theories about the nature of the accident or incident phenomenon with (b) theories of accident or incident causation. By glossing over these differences, the Guidelines bounce between developing descriptions of what happened and determining causes. This confuses the guidance, as on page 165, where a diagram of what happened is presented as events and conditions that led to the major incident. Causes depend on how data from the case are interpreted, which in turn depends on the experiences of the team. Other examples of the resultant confusion are found throughout the Guidelines.

The Guidelines describe in summary detail the many investigation techniques considered, but there is no evidence that they were applied competitively to determine the relative merits of each in the context of the systems theory of incidents. Without evidence showing that the differences in the investigation work products and results achieved were compared, it is impossible to verify that the methodology selected, which just happens to be one used by one of the members of the committee that was responsible for the Guidelines, is more meritorious than any of the others. The summaries of the techniques are not related to the incident model; Table 2-1 in some cases misdescribes the techniques and their application, and the text miscategorizes them.

Unfortunately, in its discussion of incident theories or causation theories, the text does not describe

  • how to define or describe, for investigation purposes, the system that experienced the incident,
  • how to identify the beginning input or ending output of the accident process, or
  • how to apply the input/operation/output/feedback system model familiar to system analysts to investigation processes.


Thus neither the incident model nor the investigation process model seems to conform to the systems theory espoused, casting a shadow on the claim that the PSII is represented by a recognized incident theory.
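
For illustration only - this is my own sketch, not a model presented in the Guidelines - the input/operation/output/feedback view of an investigation process can be reduced to a few lines of Python; the class and method names (InvestigationLoop, operate, feed_back) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InvestigationLoop:
    """Hypothetical sketch of an input/operation/output/feedback view of an
    investigation process; illustrative only, not drawn from the Guidelines."""
    inputs: List[str] = field(default_factory=list)    # observations, records, witness data
    outputs: List[str] = field(default_factory=list)   # tested description, recommendations

    def operate(self) -> None:
        # The "operation": transform raw inputs into tested, documented outputs.
        self.outputs = [f"tested description element from {item}" for item in self.inputs]

    def feed_back(self, monitored_result: str) -> None:
        # Feedback: monitored recommendation results re-enter as new inputs,
        # closing the loop within the process safety management system.
        self.inputs.append(monitored_result)

# Usage sketch: run one cycle, then feed a monitored result back in.
loop = InvestigationLoop(inputs=["operator statement", "control-room log"])
loop.operate()
loop.feed_back("recommendation 3 implemented; alarm floods reduced")
```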

Misdirected investigation objectives.


Doing investigations with the goal of determining causes and preventing similar accidents builds problems into investigations. First, contrary to the caveat on page 5 about viewing incidents as opportunities to improve management systems, cause is synonymous with blame and fault in our society, whether we like it or not. Attribution of cause is a judgment call, not subject to rigorous logic and validation testing. That leads to unneeded social and legal problems for an organization, which should be construed as investigation failures. An example of this kind of investigation failure is the former view that many aircraft landing approach accidents were caused by pilot error - widely reported before better investigations developed an understanding of the wind shear phenomenon. Enlightened organizations, relying on the theory that an incident or accident is a process, now ask first for a timely, objective and validated description and explanation of what happened, rather than calling for causes. The investigation output must describe interactions during the incident process and show logically why they happened, while acknowledging unknowns. These process descriptions show internal cause-effect relationships among interactions, without calling them causes, making observations about unrelated problems, or offering recommendations.

Most differences and controversies arise over the subjectively determined root or prime or contributing or multiple or immediate or proximate or other causes, and over subjective judgments or opinions about problems and recommendations. The main deficiency of root cause concepts, for example, is cited on p 129 of the text, but the cure is subjective and dependent on defining and selecting still another type of cause - the prime cause of the incident. Causation theory and cause-based thinking lead to such a tangled web of ambiguity, abstractions and opinions!

Second, investigations cost money. If the objective is to prevent only similar accidents or a recurrence (p 127), you limit the potential for the broader risk reduction and performance improvement you should expect from your investment in any investigation. The root cause analysis fad only feeds the silver-bullet mindset. Maximizing the cost/benefit ratio by expanding the benefits and uses of investigation outputs is not well served by RCA. Descriptive rather than abstract outputs are needed to aid in improving longer-term process design or operations, efficiencies, costs, training, personnel selection, and more (page 6) - plus a way to track these results. They should be demanded for any investigation program. These programs should pay their way in a demonstrable way, but unlike other aspects of a business, demonstrations of investigation payouts are rarely demanded or monitored.

Third, there seems to be no provision for investigating near-miss disruptions or incidents to discover how and why they were successfully managed to prevent even greater losses. Such incidents can be viewed as successes, in that larger losses did not occur. The search for actions that successfully aborted the accident process and prevented larger losses seems to be ignored in the Guidelines, yet these data can provide insights into timely process improvements by building on the successes observed.

Flawed terminology.


Though touted as well defined, the Guidelines' terminology has fundamental flaws. Terms like cause (discussed earlier), event, fact, evidence and others are either not defined, defined ambiguously, or defined one way and used in other ways. For example, the abuse of the term "event" in the book leads to ambiguities and confusion, and frustrates imposition of objective quality control procedures on the investigation process. Page 8 defines an incident and an accident as an event. Page 21 says "a component ...appears as an event in the causal tree." On page 28 we find an aggregated passive description of multiple occurrences, "210 persons killed," presented as an event on a chart. Thus under the Guidelines, an event can be an action, a condition, an aggregated description in the passive voice, or a component. Consistent definition and use of the term "event" is imperative to the objective application of any of the logic, charting or analysis techniques described in any investigation guidelines. Left undefined, the term makes objective data organization, testing and quality control impossible.
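
To make the point concrete, here is a minimal sketch - my own construction, not a structure the Guidelines prescribe - of a consistently defined "event" record, in which an event is one actor performing one action at a known time; the Event name and its fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """One actor performing one action, anchored in time.
    A hypothetical illustration of a consistently defined 'event';
    not a structure prescribed by the Guidelines."""
    actor: str     # who or what acted, e.g. "Operator A" or "Relief valve RV-101" (hypothetical)
    action: str    # one active-voice verb phrase, e.g. "opened bypass valve"
    begin: float   # begin time on any consistent clock
    end: float     # end time; equal to begin for an effectively instantaneous action

    def __post_init__(self):
        # Reject entries that defeat objective logic testing: missing actors,
        # missing actions, or impossible timing.
        if not self.actor or not self.action:
            raise ValueError("an event needs both an actor and an action")
        if self.end < self.begin:
            raise ValueError("an event cannot end before it begins")

# "Operator A opened bypass valve" qualifies; an aggregate such as
# "210 persons killed" does not, because it names no actor and describes
# an outcome rather than an action.
e = Event(actor="Operator A", action="opened bypass valve", begin=120.0, end=125.0)
```

With records of this form, whether two boxes on a chart hold comparable data becomes a checkable question rather than a judgment call.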

Other terminology deficiencies include pejorative and abstract words which reflect the conceptual weaknesses. Human or operator error is an example of a commonly used but pejorative and abstract term. It is a term resulting from a subjective investigator conclusion - imputing blame and fault (as in the case study on page 19). From the perspective of the person who is charged with erring in situations similar to the wind shear experiences encountered by pilots, this is not constructive. It results from what is called the investigator's retrospective fallacy, so well described by Diane Vaughan in The Challenger Launch Decision. Similarly, an ambiguously defined "failure" - without a clear description of what everything and everyone else involved actually did and was supposed to do - is pejorative to the person(s) who designed, used, made or maintained a failed device. An investigator cannot determine whose or what behavior to change until the involved actors and the behaviors that produced the outcome are known.

The gravest risk is that use of these terms allows investigators to mask a lack of understanding of what exactly happened and why it happened, arising from inadequate investigation methods, knowledge or skills.

The detail in which a "failure" is ultimately described drives the remedial actions. Recommendations for future actions require someone to do something differently. The Guidelines do not specifically address how to find the specific behaviors to be changed, the rationale for describing them as problems, or what behavior should be substituted. The Guidelines also use the passive voice frequently in describing the sequence of events, as in the model application in Appendix F, for example.

Failure, human error and the passive voice are typically used by novices who don't know any better, or by arrogant and undisciplined investigators who should know better. When used by experienced investigators they typically hide uncertainties or incomplete or sloppy investigation practices. Advanced investigation programs do not tolerate these kinds of pejorative words, abstractions or ambiguities.

Another problem is that in a section on recommendation development (page 171) the Guidelines advocate identifying some failures as causes, and addressing them with a recommended preventive action item or comment. This implies that only some failures will qualify as root causes. How can an investigator select and logically justify the "right" failures to promote as root causes?

Subjective judgment calls.


The Guidelines describe three alternative approaches to PSII, which are actually differing levels of investigation effort. The approaches and tool kit predominantly call for individual experience-based judgments and team meetings to arrive at a consensus about what happened and about when the incident is adequately understood. Advanced investigation programs minimize judgments, because differences in each individual's experiences, self-interests, perceptions and sensitivity can lead to inconsistencies in interpretations, differences in conclusions and conflicts during investigations, including eventual litigation. Advanced technical methods should be used to reduce the need for judgment calls, to promote the use of objective logic tests, and to avoid the biases introduced by consensus-building methods.

One vital area of judgment calls during investigations that is completely overlooked in the Guidelines is the judgments made in selecting how investigators' observations will be documented. The transformation of an observation into a data item for the investigation is ignored, yet it is an essential knowledge and skill requirement for investigators. This further compounds the lack of definitions of evidence and facts, and thus ignores the need for rigorously tested conclusions. A glimpse at the diverse nature of the contents of boxes in the illustrative charts discloses the inconsistencies in the data used, and the need for a good investigation program to address them.

Misuse of techniques.


One of the major problems with the Guidelines is the reliance on the logic tree-based methods recommended for use during investigations. Logic trees have one valuable purpose during investigations: to deductively develop hypotheses for bridging gaps in the scenario constructed from the data available as the investigation progresses. The main problems with logic trees in investigations are

  1. their inability to show time relationships among events constituting the incident process,
  2. their inability to efficiently handle data as data are acquired,
  3. the difficulties in selecting consistent top events, levels of detail or tree content, and
  4. the inefficiency of their data capture, organizing, validating, linking, and testing procedures as the investigation progresses.

The format simply does not accommodate all the interactions that occur during an incident or accident, and it compels investigators to apply a linear thinking process model. Additionally, even all-AND-gated trees pose difficulties in defining systems for problem and recommendation analyses. Properly disciplined flow charts describing who or what did what when are able to overcome these difficulties, but they are not detailed in the book.
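
As a rough illustration of what such a flow chart rests on - again my own sketch rather than a method the book details - events of the kind defined in the earlier Event sketch can be grouped by actor and ordered in time to form the rows of a who-did-what-when chart; the function name actor_timelines is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

def actor_timelines(events: List["Event"]) -> Dict[str, List["Event"]]:
    """Group events by actor and sort each actor's row by begin time,
    yielding the rows of a who-did-what-when chart.
    Assumes the hypothetical Event record sketched earlier."""
    rows: Dict[str, List["Event"]] = defaultdict(list)
    for event in events:
        rows[event.actor].append(event)
    for actor in rows:
        rows[actor].sort(key=lambda ev: ev.begin)
    return dict(rows)

# Each actor keeps its own row, so parallel activity by different actors stays
# visible instead of being forced into one serial chain or one tree branch.
```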

A fault tree used to illustrate the handling of data shows OR logic gates leading to some of the blocks. OR gates point to uncompleted investigation tasks and thus unsubstantiated hypotheses. Yet the Guidelines leap to conclusions about management system breakdowns as root causes, without defining and verifying what actions broke down, and why they did so. This example shows clearly how techniques can be used inappropriately during investigations.

Time lines for recording events are illustrated, but the data used in the time lines are not disciplined for consistency of form or content in the examples. Timing relationships among parallel events are not accommodated. The events are presented in a serial format, which cannot support rigorous sequential and necessary/sufficient logic testing to establish unambiguous, verifiable cause-effect relationships.
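
A small sketch of the kind of mechanical test this implies - mine, not the book's - is shown below: before two events are linked as cause and effect, the claimed cause must at least begin no later than its claimed effect, leaving the necessary/sufficient questions as explicit, documented judgments. It reuses the hypothetical Event record sketched earlier, and the function names are likewise hypothetical.

```python
from typing import List, Tuple

def temporally_plausible(cause: "Event", effect: "Event") -> bool:
    """A claimed cause must begin no later than its claimed effect begins.
    Failing the test rules the link out; passing it does not prove the link,
    it only clears the way for the necessary/sufficient questions."""
    return cause.begin <= effect.begin

def check_links(links: List[Tuple["Event", "Event"]]) -> List[str]:
    """Return a problem note for every proposed cause-effect link that fails
    the temporal ordering test. Assumes the hypothetical Event record above."""
    problems = []
    for cause, effect in links:
        if not temporally_plausible(cause, effect):
            problems.append(
                f"'{cause.actor} {cause.action}' began after "
                f"'{effect.actor} {effect.action}' and cannot stand as its cause."
            )
    return problems
```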

Investigation task omissions


The Guidelines address many significant investigation tasks to a greater degree than most other similar publications. To their credit, they contain useful guidance for investigations, particularly the Recommendations and Follow-through sections. Deficiencies flowing from subjective judgments and conclusions about cause aside, task descriptions cover many needs and techniques. However, the Guidelines overlook or gloss over several important task aspects of investigations. These omissions or deficiencies include, among others,

  • a general discussion of the concepts and procedures for framing questions and directing observations to recognize, find and define unknowns and unknown unknowns or unk-unks as the investigation progresses.
  • guidance for transforming observations into documented data; is every observation a fact to be listed? If not, how is the discard decision made?
  • organizing known data to define specific additional data needs as the investigation progresses; what drives the efficient search for the remaining data? Hypothesis generation is addressed only within a fault tree framework.
  • a framework and guidance for data acquisition from witnesses and setting question priorities during interviews; in what sequence should one ask what kinds of questions, and then test the data in real time during an interview, for example?
  • a framework and guidance for acquiring physical data from objects of all kinds; how do you translate observations of objects into documented data?
  • the need for and guidance in developing test plans before any testing or simulation is initiated; how does one ensure before tests begin that all the needed data are extracted from objects during actual destructive testing?
  • problem definition, assessment and selection decision tasks; how does one discover, define and assess objectively specific concrete problems disclosed by investigations?
  • identifying options and predicting remedial action effectiveness; how does one "invent" remedial action options, and develop the data needed by the action-selection trade-off process?
  • designing and implementing recommendation effectiveness monitoring plans; how does one combine system descriptions with the investigation process descriptions to define an implementation effectiveness monitoring plan (not just whether the recommendations were implemented, but whether they produced the desired results)?


A second general area of omissions is the meager discussion of investigation work product and investigation process quality assurance procedures. Combined with the emphasis on subjective judgments and the use of logic tree constructs, this oversight undermines the possibility of applying objective quality controls, and of achieving consistent, replicable and effective investigation outputs. Clarity and accuracy criteria are too subjective and abstract to be used for objectively assuring the quality of written reports (page 196). A checklist addresses quality control of the form of a report, to the near exclusion of assuring the quality of the substance. Nowhere in the report quality assurance section is logic testing of the incident description mentioned.

The lack of comprehensive discussion of the process by which investigators discover, define, evaluate and select problems to be addressed by recommendations is a significant omission. The Guidelines assume failures will point to root causes. Theoretically, all interactions required to produce the observed output in an incident or accident scenario must have occurred. The Guidelines propose looking for a recommendation for each "cause" (or underlying cause? root cause?), which is a subjective and usually ambiguous problem definition. What about a look at other interactions that were not selected to be called a failure or cause? No mention is made of more efficient ways to define and rank the problems indicated by unwanted interactions in terms of the vulnerability or efficiency of future operations if they are not fixed, and to determine whether they should be addressed with recommendations. The weighting, weighing and trade-off elements of this decision process are not addressed.

Investigation report framework


The reporting format is traditional, following the legal framework construct adopted by leading federal agencies, rather than a framework better suited to organizations' operational improvement and management needs. In advanced investigation programs, those needs are addressed by a tested description of the incident or accident process, showing validated interactions in sufficient detail to convincingly explain why it happened the way it did. References point to validating data in file documents or objects, should supporting information, corroboration or proof (evidence) be required. The case for a separate report containing recommendations based on past incidents and projected operations is being recognized in a number of organizations. Thus the Guidelines reflect traditional practices, rather than advanced, state-of-the-art practices.

The difficulties resulting from the uncritical acceptance of subjective experience-based conclusions about evidence, facts, cause, and failures should be recognized and remedied.

Summary


On balance, I recommend the book for serious researchers and students of investigation processes, to provide them with valuable insights into and an understanding of the kinds of investigations likely to be encountered in the process industries. The Guidelines contain good thought starters, and the scope of the coverage, with the exceptions noted, is comprehensive and stimulating.

Unfortunately, the deficiencies detract from the potential effectiveness of the guidance when it is applied. If effective, efficient, consistent, verifiable and lasting organization-wide improvements are being sought from investigations, in my opinion they are not likely to be achieved by implementing the Guidelines without changes. At a minimum, the changes necessary to remedy the shortcomings described above should be made. Other needs for refinement of the Guidelines will become apparent as objective, logical and replicable investigation and work product quality assurance procedures are instituted.


The views expressed represent those of the author only.

