Organizational Threat Management (OGHFA BN)
From SKYbrary Wiki
Content source: Flight Safety Foundation
Operator's Guide to Human Factors in Aviation
1 Background and Introduction
Today’s aviation world is characterised by growing integration of all professions involved, including flight, cabin and ground operations, air traffic control (ATC) and maintenance. This coexistence increases the need for a systemic view of safety management. Various types of hazards, precursors or threats contribute to operational risk, which is the product of hazard probability and severity.
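The definition of operational risk as the product of hazard probability and severity can be illustrated with a minimal sketch. The hazard names, probabilities and the 1-to-5 severity scale below are hypothetical values chosen only for illustration; they do not come from any safety database.

```python
def operational_risk(probability: float, severity: float) -> float:
    """Return risk as the product of hazard probability and severity."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if severity < 0:
        raise ValueError("severity must be non-negative")
    return probability * severity


# Hypothetical hazards: (probability per flight, severity on an assumed 1-5 scale).
hazards = {
    "wind shear on approach": (0.002, 5),
    "similar call signs": (0.02, 2),
}

# Rank hazards from highest to lowest risk.
ranked = sorted(
    hazards,
    key=lambda name: operational_risk(*hazards[name]),
    reverse=True,
)
```

In this made-up example the frequent, low-severity hazard ranks above the rare, high-severity one, which shows why both factors have to be considered together.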
The term threat in flight operations is defined as anything external that complicates pilots’ duties and requires attention and management to keep proper margins of safety. The European Working Group on Occurrences Data Analysis developed the following definitions:
- A hazard is a condition that has the potential for causing damage to people, property or the environment.
- A precursor is an occurrence that remained an incident but that might recur in different conditions and become an accident.
- A threat is a more generic term for a condition likely to cause damage to people, property or the environment, with hazards generated by catalysts or triggering factors.
Investigations often show that events with similar scenarios had already occurred before an accident. Such earlier incidents can then genuinely be called precursors. Precursors can be detected both statistically and clinically to develop effective countermeasures. Precursors should be sufficiently visible to generate operational defences for pilots and prevent accidents.
The term threat management is increasingly used to describe the techniques known as countermeasures. These are essential for avoiding and coping with possible internal or external threats to a flight crew; one example is regaining control after a loss of situational awareness.
The objective of this briefing note is to familiarise the reader with key concepts of threats and their manifestations and to review various ways to focus on threat management.
2 Defining Threats
2.1 Basic definitions
Threats in normal operations are defined as external situations, events or errors that occur outside the influence of the flight crew and must be managed. As such, these threats can be considered as expected. Such events increase the operational complexity of the flight and pose a safety risk to the flight at some level if not properly handled.
Threat anticipation is a key ingredient in crew briefings and implies discussing:
- What have we been doing right?
- What are our strengths?
- What are our weaknesses?
In unfamiliar or abnormal operations, a threat can also be considered as a surprising or unexpected combination of events for which the crew is not prepared or even trained. There also are expected threats that the crew is supposed to routinely handle based on their training to face abnormal situations.
But even if a crew is trained to handle certain events, there may be no clear course of action for a combination of events. The outcome might be uncertain, and there could be a lack of established procedures to cope with developments. The crew may need more resources to master the situation safely.
When situational awareness is lost, resources might become insufficient and workload so high that the crew may become unable to monitor the developing situation. From this point of view, threat management is directly linked with loss of situational awareness.
2.2 Data: What threats need to be managed?
Active threats, whether expected or unexpected, are observable. They are linked with unsafe acts or inadequate defences and are mediated by interactions with local events. Typical threats include local triggering factors that help release the hazard and local escalation factors that prevent effective control of the threat. Such factors include:
- Environmental threats such as: adverse weather (e.g., heavy rain, thunderstorms, wind gusts, wind shear and turbulence); air traffic control (ATC) problems such as challenging clearances, runway changes, language difficulties, controller errors, similar call signs, heavy traffic and radio congestion; terrain, including high ground, topography, facilities and lack of references; and airport conditions such as snow, reduced visibility, poor braking action or runway construction.
- Operational threats such as: airline operational pressure, including crew scheduling problems, delays, time pressures or diversion to an unfamiliar airport; aircraft problems such as loss of control and false warnings; cabin situations such as door security and sick or disruptive passengers; maintenance factors such as incorrect repairs and errors; ground-related events, including deicing, ground crew errors and fuel spills; and dispatch-related errors.
Latent threats are not directly tied to a specific observable threat, as they are usually not seen in actual operations. Their primary origins are the fallible decisions taken by senior and line managers.
These threats are then further transmitted via other intervening precursors to the point where system defenses may be breached in daily situations and activities. For example, the time before duty either at home or on a stopover can be crucial for crew performance in flight.
Latent threats or failures concern aspects of the system that predispose the commission of errors or that can lead to undesired aircraft states. They are usually discovered when analysing aggregate data such as confidential incident reports. Causes include:
- ATC practices;
- Air traffic system design;
- Organisational, national or professional culture;
- Regulatory practices and oversight;
- Training philosophy and practices;
- Qualification standards;
- Aircraft characteristics;
- Equipment design issues;
- Flawed procedures;
- Preparation for duty;
- Scheduling and rostering practices; and,
- Personal unfitness due to stress, preoccupation or illness.
3 Statistical Data on Threat and Error Management
3.1 Line operations safety audits
The idea of identifying threats stems from the military, but one should clearly bear in mind that threat and error management (TEM) is a safety strategy that should not be confused with other initiatives that have generated other lists of safety issues.
The University of Texas (UTX) at Austin developed the TEM model as a conceptual framework to interpret data from observing crewmembers in both normal and abnormal operations.
A crew is expected to manage a threat so it becomes inconsequential. If the crew fails to do so, they may make a mistake. Depending on the crew’s response to that error, even more mistakes may follow that place the aircraft in an undesirable state that compromises safety.
A mismanaged threat is one that is linked with or induces flight crew error. As such, the TEM model helps to point out contributory factors that crews can recognise and manage.
The UTX line operations safety audit (LOSA) and the Airbus line operations assessment system (LOAS) are distinct programs that use expert observers to collect data about crew behaviour and situational factors during normal flights.
3.2 Data from other safety initiatives
Current safety initiatives are based on two distinct and complementary approaches:
- An historic/reactive process that reviews mishaps from eight focus areas: controlled flight into terrain, approach and landing, loss of control, aircraft design, weather, occupant safety and survivability, runway incursions, and turbulence.
- A prognostic/proactive process concerned with future hazards in two focus areas: crew reliance on automation and new concepts for airspace management.
The European Strategic Safety Initiative (ESSI) closely cooperates with the Commercial Aviation Safety Team (CAST), which runs a similar program for the U.S. Federal Aviation Administration (FAA) and other interested parties. Both programs learn from each other, and address the historic and prognostic approaches.
CAST’s mission was to develop and implement a data-driven strategy to reduce the accident rate by 80 percent by 2007. The list of standard events was set up by the CAST/International Civil Aviation Organisation (ICAO) Common Taxonomy Team to generate safety-performance indicators that help produce incident rates.
CAST members originally elected to selectively pursue highly leveraged safety intervention strategies that would maximise the safety benefit to the flying public by chartering Joint Safety Analysis Teams (JSATs) in the following six areas: controlled flight into terrain, approach and landing, loss of control, runway incursions, weather, and uncontained engine failure.
JSATs use a problem-statement methodology that CAST recommends for identifying safety problems in any setting and for developing programmes to solve them.
Some 300 problem statements were sorted into 10 groups.
JSATs provide CAST with a list of prioritised safety intervention strategies to reduce fatality risks. CAST and JSSI teams recommend safety intervention strategies and also produce contributing factors, grouped and prioritised in a set of standard problem statements. If these high-leverage problems and contributing factors could be eliminated, the risk of death or serious injury would be reduced by an estimated 73 percent.
4 Threat Management Approaches
An operator’s organisation must gear itself to implement prevention strategies through appropriate ab-initio and recurrent training, as well as through operational recommendations to provide pilots with the necessary skills, knowledge and attitudes to help them understand the challenges they face.
TEM can provide a conceptual framework for flight crew training and serve as a crucial component of human factors and CRM training. In particular, it can lead to a template for assessing threats during operations. The following approaches may help.
4.1 Crew-observation techniques
Situational factors in the environmental, organisational and technical fields combine to create threats that arise outside the cockpit but have to be managed by the flight crew. LOSA is predicated on the UTX threat and error safety model, which is based on these five error categories:
- Intentional noncompliance error;
- Procedural error;
- Communication error;
- Proficiency error; and,
- Operational decision error.
There are three possible responses to these error types:
- Trap actively after detection and manage to an inconsequential outcome.
- Exacerbate and cause an additional error or undesirable aircraft state.
- Fail to respond because the error was ignored or undetected.
There are likewise three possible responses to an undesirable aircraft state:
- Mitigate actively by returning to safe flight.
- Exacerbate and induce an additional error or further degraded state.
- Fail to respond because the undesirable state was ignored or undetected.
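The five error categories and the two sets of three responses listed above could be coded for an observation record roughly as follows. This is only a sketch: the class names, the `Observation` record and the `was_managed` helper are hypothetical and are not the actual LOSA coding scheme or data format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    INTENTIONAL_NONCOMPLIANCE = "intentional noncompliance"
    PROCEDURAL = "procedural"
    COMMUNICATION = "communication"
    PROFICIENCY = "proficiency"
    OPERATIONAL_DECISION = "operational decision"


class ErrorResponse(Enum):
    TRAP = "trap"
    EXACERBATE = "exacerbate"
    FAIL_TO_RESPOND = "fail to respond"


class StateResponse(Enum):
    MITIGATE = "mitigate"
    EXACERBATE = "exacerbate"
    FAIL_TO_RESPOND = "fail to respond"


@dataclass
class Observation:
    """One observed crew error (hypothetical record layout)."""
    error_type: ErrorType
    error_response: ErrorResponse
    # Only set when the error led to an undesirable aircraft state.
    state_response: Optional[StateResponse] = None


def was_managed(obs: Observation) -> bool:
    """True if the error was trapped, or the resulting state was mitigated."""
    if obs.error_response is ErrorResponse.TRAP:
        return True
    return obs.state_response is StateResponse.MITIGATE


trapped = Observation(ErrorType.PROCEDURAL, ErrorResponse.TRAP)
recovered = Observation(
    ErrorType.COMMUNICATION, ErrorResponse.FAIL_TO_RESPOND, StateResponse.MITIGATE
)
unmanaged = Observation(
    ErrorType.PROFICIENCY, ErrorResponse.EXACERBATE, StateResponse.FAIL_TO_RESPOND
)
```

Keeping the error response and the state response as separate fields mirrors the two-stage structure of the lists: a crew first responds to the error and, only if an undesirable aircraft state follows, then responds to that state.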
TEM places in context system factors, organizational characteristics and pilot culture.
4.2 Integrated threat analysis
The International Air Transport Association (IATA) Human Factors Working Group launched integrated threat analysis. The goal of this approach is to establish a correlation between threats, errors and undesirable aircraft states in normal operations, incidents and accidents to determine some scenarios in which safety can be compromised and to develop crew prevention strategies to properly manage these situations. The first integrated threat analysis focused on runway excursions.
IATA performed parallel analyses of UTX’s “Archie” database (normal operations), incident narratives from the IATA Safety Trend Evaluation, Analysis and Data Exchange System (STEADES), and ICAO’s accident/incident reporting (ADREP) database.
A link was established between long and off-center landings. Weather (e.g., heavy rain, thunderstorms, wind gusts, tailwinds), aircraft malfunctions, rejected takeoffs, night operations, and proficiency issues remain important threats in incident and accident analysis.
Experience shows that ATC, operational pressures and procedural issues are less well-documented threats with risks that need to be managed both organisationally by the airlines and tactically by flight crews.
These three strands of safety-data management complement each other. They show that we can learn from well-managed threats as well as from mismanaged ones.
4.3 Airline data collection
Incidents and accidents are daunting events, but they can be prevented by setting up defences. Operational situations can expose systemic weaknesses, breakdowns or failures that correspond to threat events: precursory, isolated occurrences that might recur in different conditions and then become an accident.
- Precursory operational situations involve failing defences (which are not necessarily precursory themselves), together with practices and operational contexts that must be clearly identified.
- A practice is a way of working adopted by the crew or by an organisation such as maintenance or ATC. It is characterised by how an action, error, procedural deviation or latent condition occurs.
- The operational context takes into account several elements both in the environment and pilots’ workload and stress.
- Failures in defences will manifest themselves in the execution of practices and in specific contexts.
The visibility of precursors may be poor, and cross-checking several safety data sources may help reveal the elements that characterise a precursor: its frequency, characteristics and criticality. This suggests that sources other than LOSA are necessary.
LOSA alone will therefore not capture latent conditions, which can only be grasped through precursors, incidents and accidents. Countermeasures can be identified only if the three specific aspects of an operational situation (defences, practices and context) are clearly identified.
4.4 Safety data management
Airbus performs safety data management (SDM) to support its product safety process. SDM collates operational incident and precursor data from a variety of sources and codifies these events by means of operational and human factors markers through its operational events analysis process (OEAP). The OEAP studies produce data for design, training and operations.
This process attempts to codify threats as:
- Anticipated (briefed);
- Avoided (prevented);
- Managed (recovered);
- Identified, assessed and managed late;
- Outcome mitigated;
- Not identified; and,
- Outcome inconsequential.
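As an illustration only, the threat codes listed above could be represented and tallied as shown below. The enum, the event identifiers and the assumption that a single event may carry more than one marker are hypothetical; the source does not describe how the OEAP stores or counts these codes.

```python
from collections import Counter
from enum import Enum


class ThreatMarker(Enum):
    ANTICIPATED = "anticipated (briefed)"
    AVOIDED = "avoided (prevented)"
    MANAGED = "managed (recovered)"
    MANAGED_LATE = "identified, assessed and managed late"
    OUTCOME_MITIGATED = "outcome mitigated"
    NOT_IDENTIFIED = "not identified"
    OUTCOME_INCONSEQUENTIAL = "outcome inconsequential"


def tally_markers(events):
    """Count how many events carry each marker (each marker once per event)."""
    counts = Counter()
    for markers in events.values():
        counts.update(set(markers))
    return counts


# Hypothetical coded events; the identifiers and codings are invented.
events = {
    "EVT-001": [ThreatMarker.ANTICIPATED, ThreatMarker.OUTCOME_INCONSEQUENTIAL],
    "EVT-002": [ThreatMarker.MANAGED_LATE, ThreatMarker.OUTCOME_MITIGATED],
    "EVT-003": [ThreatMarker.NOT_IDENTIFIED],
    "EVT-004": [ThreatMarker.ANTICIPATED],
}
counts = tally_markers(events)
```

A tally of this kind makes it easy to see, for instance, what share of coded events involved a threat that was never identified.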
This activity is performed in conjunction with a review and codification of:
- Situation recognition and diagnosis — alerts, cockpit-cabin interaction, crew diagnosis;
- Crew actions — flight guidance; systems use; excessive control inputs; untimely, inadvertent or delayed action; no action despite repeated alerts; late takeover;
- Procedures — type, access, contents;
- Crew performance — procedure execution, actions, error management, threat management, aircraft attitude and flight path control, coordination; and,
- Environment and circumstances — operational, weather conditions, aircraft design, organisational factors, crew factors.
The Occurrence Data Analysis Working Group of the JSSI drafted an Analysis Capability Specifications document to recommend guidelines for precursor detection. Precursors, or forerunners, are events that precede a similar, more serious event and lead to or influence its development. There are two methods for precursor detection: scenario-based and hazard-based.
Scenario-based precursor detection consists of diagnosing incidents related to specific threats and identifying various human, procedural, technical and environmental factors, including warnings and alarms. This identification consists of:
- Verifying the scenario’s efficiency, which is confirmed if acting on it prevents the incident from occurring or degenerating into an accident.
- Verifying its reliability, which is confirmed when an incident recurs in exactly the same conditions.
- Verifying its universality, which is confirmed if the incident recurs in slightly different conditions, such as another airline, airport or aircraft type.
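The distinction between reliability and universality in the list above can be sketched as a comparison of the conditions of two occurrences. The condition keys (`airline`, `airport`, `aircraft_type`) follow the examples given in the text; the function name and the dictionary layout are hypothetical, and a real analysis would rest on expert judgment rather than a field-by-field comparison.

```python
CONDITION_KEYS = ("airline", "airport", "aircraft_type")


def classify_recurrence(original: dict, recurrence: dict) -> str:
    """Say which property of a scenario a recurring incident supports.

    Identical conditions support reliability; any differing condition
    supports universality.
    """
    differing = [
        key for key in CONDITION_KEYS if original.get(key) != recurrence.get(key)
    ]
    return "reliability" if not differing else "universality"


# Hypothetical occurrences, invented for illustration.
first = {"airline": "A", "airport": "XXX", "aircraft_type": "T1"}
same_conditions = {"airline": "A", "airport": "XXX", "aircraft_type": "T1"}
other_airport = {"airline": "A", "airport": "YYY", "aircraft_type": "T1"}
```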
Hazard-based precursor detection involves either a statistical approach by tracking the list of hazards in the CAST and JSSI initiatives, or a proactive approach during design and certification based on JSSI and CAST data. This analysis can be done in two steps:
- Upon receipt of a report, the analyst identifies which hazard might be involved.
- The analyst then either flags the report in the database for later retrieval or specifies that all occurrences related to that hazard should be reprocessed.
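The two steps above (identify the hazard, then flag the report for later retrieval) could be sketched as a small in-memory index. Everything here is an assumption made for illustration: the keyword lists, the class and function names and the report identifiers are hypothetical, and keyword matching merely stands in for the analyst's judgment.

```python
from collections import defaultdict

# Hypothetical keyword lists; a real analyst relies on expertise, not string matching.
HAZARD_KEYWORDS = {
    "runway incursion": ("incursion", "crossed the runway"),
    "loss of control": ("upset", "stall"),
    "weather": ("wind shear", "thunderstorm"),
}


def identify_hazards(narrative: str) -> list[str]:
    """Step 1: decide which hazards a report narrative might involve."""
    text = narrative.lower()
    return [
        hazard
        for hazard, keywords in HAZARD_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


class OccurrenceIndex:
    """Step 2: flag reports by hazard so they can be retrieved later."""

    def __init__(self) -> None:
        self._by_hazard = defaultdict(list)

    def flag(self, report_id: str, narrative: str) -> list[str]:
        hazards = identify_hazards(narrative)
        for hazard in hazards:
            self._by_hazard[hazard].append(report_id)
        return hazards

    def retrieve(self, hazard: str) -> list[str]:
        # Return a copy so callers cannot alter the index by accident.
        return list(self._by_hazard.get(hazard, []))


index = OccurrenceIndex()
flagged = index.flag("R1", "Wind shear on final approach, stall warning sounded")
index.flag("R2", "Aircraft crossed the runway without clearance")
```

Retrieving all reports for one hazard, for example `index.retrieve("weather")`, then gives the expert team the set of occurrences to categorise and synthesise.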
Occurrence retrieval for any given hazard involves categorising events and making a synthesis to validate the hypotheses of the safety model, maintain risk awareness and identify needs for safety actions. This analysis is based on the expertise and judgment of a dedicated expert team since there is no exact guidance.
5 Key Points
Technical information is not the best way to prepare pilots to master expected and unexpected threats while flying. It is nevertheless necessary to feed back information on actual experiences to the authors of such technical information.
It is one thing to develop organisational schemes based on safety data from well-documented events. Yet, it is quite another to learn to master such events in a cockpit environment with a real situation unraveling in the face of incomplete awareness or surprise.
Risk management aims to make events visible by creating proper attitudes for TEM, not by mobilising resources out of context but by prompting crewmembers’ situational awareness when needed. An example is when a traffic alert and collision avoidance system (TCAS) generates a traffic advisory ahead of a resolution advisory to get the pilots’ attention and prompt a timely and appropriate response.
It would be in vain to require optimal performance at all times to be able to deal with ongoing threats. What is necessary instead is to have sufficient built-in risk tolerance and the proper attitudes to face any threats. This is achieved with a pragmatic approach to training and operational strategies.
6 Associated OGHFA Material
- Fuel Leak and Confirmation Bias
- In-flight Pilot Incapacitation
- Rejected Takeoff
- TCAS Occurrence
- Threat and Error Management Preventing CFIT
7 Additional Reading Material
- ICAO. Accident Prevention Manual (Doc 9422).
- ICAO. Human Factors Training Manual (Doc 9683).
- ICAO. Line Operations Safety Audit (Doc 9803 AN/761).
- ESSAI. Enhanced Safety Through Situation Awareness Integration in Training. Final report, Feb. 4, 2003.
- IATA. Safety Report 2004. April 2005.
- JSSI. Analysis Capabilities Specification Document MOR. Occurrences Data Analysis Working Group.
- ICAO/IATA. Proceedings of the First ICAO-IATA LOSA and TEM Conference, Dublin, Ireland, November 2003.
- Maurino, Daniel; Reason, James; Johnston, Neil; Lee, Rob. Beyond Aviation Human Factors. Avebury Aviation, 1995.