While en route from Sydney, New South Wales, to Kuala Lumpur, Malaysia, the oil pressure pump for the right engine (engine 2) of an AirAsia X Airbus A330 experienced a shaft failure. That shaft failure resulted in the oil pressure in engine 2 dropping rapidly to 0 psi. The aircraft’s electronic centralised aircraft monitor (ECAM) detected the drop in oil pressure and notified the flight crew through the ENG 2 OIL LO PR failure alert. In response to the alert, the flight crew commenced, but did not complete, the associated procedure. In accordance with the procedure the flight crew reduced the engine’s thrust to idle, but then elected to monitor the engine instead of shutting it down. After about 4 minutes, the flight crew returned engine 2 to normal operations. Shortly thereafter, the engine surged a number of times and eventually failed. The flight crew completed the engine failure procedure, shutting the engine down, and initiated a diversion to Melbourne. During the diversion, the flight crew attempted to relight engine 2 twice, the first shortly after the engine failure, and the second just prior to descending into Melbourne.
What the ATSB found
The ENG 2 OIL LO PR failure, a level 3 alert, was the result of a shaft failure of the right engine oil pressure pump. A level 3 alert required immediate crew action because the failure may affect the safety of the flight. The ECAM procedure required the flight crew to reduce the engine thrust to idle and, ‘if [the] warning persists’, then shut the engine down. The flight crew probably interpreted this as a temporal requirement and not a continuation of the condition, as intended by Airbus. As a result, the flight crew continued to troubleshoot the failure. After monitoring the engine, they formed the belief that the warning was the result of a gauge failure. With a stated intent of further troubleshooting, the flight crew then increased the engine’s thrust. This led to the first engine stall and ultimately the engine failure.
Despite available guidance and cumulative evidence to the contrary, the flight crew determined that the right engine was not damaged and could be restarted. Consequently, and contrary to the operator’s procedures, the flight crew made two attempts to restart this engine. Both restart attempts failed.
Also contrary to the operator’s procedures, the flight crew elected to divert to Melbourne following the engine failure, bypassing closer suitable aerodromes. This increased the time that the aircraft was operating in an elevated risk environment.
What's been done as a result
The operator restated the operational requirements concerning engine restarts and diversion decision making to flight crew. The operator also used the occurrence as the basis of a training package on response to engine failures, restarting failed engines and diversion decision making.
This occurrence demonstrates the importance of crews adhering to standard operating procedures. It also identifies the need for clarity in the construction of procedures and, where there is no need for an immediate response, for crews to consider all of the available contextual information before deciding on a plan of action.
At 2137 Eastern Standard Time on 16 August 2016, an AirAsia X Airbus A330-343X, registered 9M-XXD, departed from Sydney, New South Wales. The aircraft was performing the scheduled passenger service XAX221 to Kuala Lumpur, Malaysia. The flight crew consisted of the aircraft captain, who was the pilot monitoring (PM), and the first officer, who was the pilot flying (PF).
The flight’s operational flight plan stated that the flight was an extended range operations (ETOPS) flight, with a ‘maximum diversion time [in the event that one engine failed] in still air limited to 120 minutes’. The operational flight plan listed Alice Springs and Darwin as the only planned Australian ETOPS alternate aerodromes. The first ETOPS operating area was about 400 NM (740 km) outbound from Alice Springs, Northern Territory.
As the aircraft approached Alice Springs (Figure 1), the right engine’s (engine 2) oil pressure pump failed, and shortly thereafter, the engine failed. The aircraft descended and diverted to Melbourne. During the diversion, the flight crew attempted two restarts of the failed engine. The aircraft landed at Melbourne at 0159 on 17 August 2016.
Source: Google Earth, modified by ATSB.
The following description of the occurrence will focus on the oil pressure pump failure and subsequent engine failure, the flight crew’s actions, including the diversion decision, and the two restart attempts. Information contained therein is derived from data recorded by the full authority digital engine control systems, data from the flight data recorder, air traffic control recordings and flight crew interviews.
The oil pressure pump failure and subsequent engine failure
As the aircraft was in cruise at flight level (FL) 380 and tracking towards Alice Springs, the engine 2 oil pressure pump failed as a result of the pump’s drive shaft failing, resulting in a rapid and total loss of engine oil pressure. The shaft failure occurred at 2343:20, while the aircraft was 240 NM (445 km) south-east of Alice Springs. Over the subsequent 7 seconds, the aircraft’s flight data recorder (FDR) recorded the engine 2 oil pressure drop from 90 to 0 psi ((1) and green line on Figure 2).
Legend from bottom: ‘Oil P’ oil pressure; ‘Oil Q’ oil quantity units; ‘Oil T’ oil temperature; ‘TLA’ thrust lever angle; ‘N1’, ‘N2’, ‘N3’ refer to low-pressure, intermediate-pressure and high-pressure (respectively) engine fan speeds.
While no oil was being fed into the engine as a result of the oil pressure pump shaft failure, the air pressure in the bearing chambers would have forced the residual oil in the chambers down the scavenge lines and back to the tank. This resulted in the indicated oil quantity increasing by about one third ((2) and purple line on Figure 2).
ENG 2 OIL LO PR message
The loss of oil pressure was detected by the electronic centralised aircraft monitor (ECAM). At 2343:33, the ECAM alerted the flight crew by:
- triggering the ‘master warning’ lights, located on the glareshield panel (Figure 3), and associated aural warning alert
- displaying the level 3 red warning alert ‘ENG 2 OIL LO PR’ message and associated emergency procedure on the engine/warning display (Figure 3)
- displaying the engine schematics on the system display (Figure 3).
Included within the engine schematics on the system display were the oil system parameters.
In response to the ECAM alert, the captain took over duty as the PF, and at 2343:47, the engine 2 thrust lever was retarded to idle ((3) and blue line on Figure 2). The flight crew reported that, after retarding the thrust lever, the emergency procedure required the flight crew to monitor the engine and to shut the engine down if the problem persisted. While the warning persisted after the thrust lever was retarded, the flight crew stated that all other engine indications were normal. Specifically, while the crew could see that the oil pressure was indicating zero, there was still oil quantity. The flight crew recalled that, as this was the only abnormal indication, they were reluctant to shut engine 2 down. This also led them to believe that the fault might be a false warning from the oil pressure indicator.
The first engine stall
About 3.5 minutes after retarding the thrust lever to idle, at 2347:14, the flight crew advanced the thrust lever for engine 2 to the CL position (see (4) and blue line at Figure 2). The flight crew stated that this was done with the intent of checking/troubleshooting the engine. Approximately 40 seconds later, at 2347:55, engine 2 stalled and began to run down ((5) on Figure 2). In an immediate response to the stall, the engine’s full authority digital engine control (FADEC) briefly cut the fuel flow to the engine, enabling the engine’s airflow to return to normal. The stall was accompanied by a significant spike in recorded engine vibration.
The ECAM detected the stall at 2347:57, and alerted the flight crew by:
- triggering the ‘master caution’ lights and associated caution aural alert
- displaying the level 2 amber ECAM message ‘ENG 2 STALL’ with its associated abnormal procedure on the engine/warning display
- displaying the engine schematics on the system display.
The flight crew responded to the ECAM alert by retarding the thrust lever to idle ((6) and blue line on Figure 2). The engine parameters stabilised at an idle setting.
Shortly thereafter, at 2348:02, the flight crew declared a PAN PAN to air traffic control (ATC), stating that they had experienced an engine stall and requesting descent to FL 250. In communications with ATC over the following 30 seconds, the flight crew stated that they were ‘breaking off the airway doing a left turn’ and declared a probable intention to divert to Melbourne.
The second engine stall
At 2348:37, 35 seconds after the first engine stall, the engine stalled again and ran down further ((7) on Figure 2). The second engine stall was also accompanied by a significant spike in the recorded engine vibration. The FADEC again responded by briefly cutting the fuel flow; however, this time the engine did not recover. The engine continued to run down, and failed. The ECAM detected the engine failure and, at 2348:42, alerted the flight crew by:
- again, triggering the master caution lights and associated caution aural alert
- displaying the level 2 amber ‘ENG 2 FAIL’ ECAM message, with its associated abnormal procedure on the engine/warning display
- displaying the engine schematics on the system display.
The flight crew responded to the ENG 2 FAIL ECAM alert by commencing the displayed procedure. That procedure included a decision about whether the engine was damaged. The flight crew stated that they consulted the flight manuals and determined that the engine was not damaged. In accordance with the required procedure, the engine master switch was selected to off at 2348:50, shutting the engine down ((8) on Figure 2).
Flight crew’s recollection of the engine stall/failure
The flight crew later reported that, coincident with the stall, they heard a slight bang from engine 2. The flight crew reported that, at the same time, they observed the ENG 2 STALL ECAM message, which was almost immediately replaced by the ENG 2 FAIL message. The ENG 2 FAIL message was coincident with the engine failing.
Actions following the engine failure
The following communications between XAX221 and Melbourne ATC immediately following the second engine stall and subsequent failure were relevant:
- At 2353 the flight crew advised ATC that the intention was to divert to Melbourne.
- At 2355 ATC re-cleared the aircraft to track direct to position ARBEY and then to Melbourne, and to descend to FL 250.
- At 2356 the flight crew called the operator’s maintenance support using Satcom, reporting that engine 2 had been shut down due to low oil pressure followed by an engine stall. The flight crew requested advice on whether Adelaide or Melbourne was the preferred diversion destination. Maintenance support advised of a preference for Melbourne due to technical support concerns with Adelaide, but that the decision was the aircraft captain’s.
- At 2358 ATC requested the flight crew confirm the nature of the situation. The flight crew responded, stating:
…the situation now is we have the number two engine oil, that pressure is zero then [unintelligible]. Then after that is engine stall which we shut down the engine. Then at the moment we are flying on single engine before we are able to double check for engine start, then our decision is to proceed to Melbourne sir.
- At 0019 ATC requested the flight crew advise if there was visible damage or evidence of fire. The flight crew reported that there was no damage, just that low oil indication led to an engine stall. In subsequent communications, the flight crew requested advice on the runway in use at Adelaide, and whether Adelaide had a curfew. ATC advised that the curfew in Adelaide was in force, but that if the flight crew declared an emergency the curfew would be waived. The flight crew responded that their intention was to continue to Melbourne. There were no further communications between the flight crew and ATC about Adelaide.
The diversion decision
Relevant company nominated alternate aerodromes available for diversion at the time that the flight crew declared the PAN PAN, and the distance to those aerodromes, were:
- Alice Springs, about 205 NM (380 km)
- Adelaide, about 545 NM (1,009 km)
- Melbourne, about 815 NM (1,509 km).
The captain stated that the initial diversion decision was to go to Melbourne—Alice Springs was closest but discounted as the emergency was considered to be controlled. This decision was based on the understanding that the track took the aircraft close to Adelaide, which would allow for a diversion to Adelaide if conditions deteriorated.
The captain stated that the flight crew then reviewed the decision using the company’s integrated decision-making model. As part of that process, the captain reported that:
- in terms of safety, the emergency was controlled and the engine secured
- weather at both Adelaide and Melbourne was good, although Melbourne had an indication of moderate turbulence
- Adelaide was subject to curfew, but ATC later advised it was ready to accept the aircraft
- rescue and fire fighting services at Adelaide were below that required for the company’s operations
- from a passenger wellbeing perspective, Melbourne was preferred
- the company had a station in Melbourne, which would enable easier aircraft recovery.
The first officer later stated that Alice Springs was not considered due to it being an uncontrolled airfield that used pilot activated lighting. The captain reported that the flight crew calculated an equal time point between Adelaide and Melbourne. When the aircraft arrived at this point, they decided to continue to Melbourne for better recovery of the aircraft and passengers. The flight crew also reported there was concern regarding the Melbourne weather forecast due to the turbulence, although this concern was alleviated when later weather updates identified the Melbourne weather as good.
Attempted engine restarts
Two attempts to restart (relight) the failed engine were conducted during the diversion to Melbourne. The flight crew later reported that the intent to relight the failed engine was based on the ‘engine fail’ (ENG FAIL) procedure, which instructed the flight crew to consider a relight provided the engine was not damaged. The flight crew reported that, after working through the quick reference handbook and flight crew operating manual, they determined that the engine was not damaged.
At 0002:14 on 17 August, about 13 minutes after shutting the engine down, the flight crew attempted to relight engine 2. The following engine parameters were recorded immediately preceding the relight attempt:
- N1 was indicating a stable 23 per cent rotation
- N2 was indicating a stable 7 per cent rotation
- N3 was indicating 0 per cent rotation.
The relight attempt commenced when the engine master switch was selected to ON at 0002:15, and ceased at 0003:38 when the switch was selected to OFF. During the relight attempt, the aircraft was slowly descending from FL 239 to FL 232 with the airspeed slowly increasing from 258 kt to 280 kt. The attempted relight was unsuccessful. During the relight attempt, the ‘ENG 2 START FAULT’ ECAM message was displayed.
At 0132:00, just before commencing descent into Melbourne, a second relight was attempted by the flight crew. During this relight attempt, the aircraft was at FL 190 and the airspeed about 315 kt. The following engine parameters were recorded immediately before the relight attempt commenced:
- N1 was stable at about 25 per cent rotation
- N2 was stable at about 6 per cent rotation
- N3 was stable at 0 per cent rotation.
The flight data identified that both relight attempts were starter motor assisted relights. For the second relight, there was a 13 second delay from the initiation of the relight until the first indication of rotation of N3. A further 12 seconds later, at 0132:25, N3 achieved sufficient rotation for fuel to be introduced into the engine. At this time, N1 remained at 25 per cent, N2 had increased to 12 per cent and N3 had increased to 25 per cent. At 0132:36 a successful relight occurred; however, N1 remained at 25 per cent, N2 had increased to about 22 per cent, and N3 had stabilised at about 43 per cent. The flight crew later reported that, during the relight, vibrations were felt from the engine. As a result of this vibration, at 0132:46, the flight crew ceased the relight attempt and shut down engine 2.
- Eastern Standard Time (EST): Coordinated Universal Time (UTC) + 10 hours.
- An Airbus A330 is a twin-engine aircraft.
- Pilot Flying (PF) and Pilot Monitoring (PM): procedurally assigned roles with specifically assigned duties at specific stages of a flight. The PF does most of the flying, except in defined circumstances, such as planning for descent, approach and landing. The PM carries out support duties and monitors the PF’s actions and the aircraft’s flight path.
- Many of the documents and source material used in this report use the terms ETOPS and EDTO (extended diversion time operations) interchangeably. As most of the regulatory sources applicable to this operation used ETOPS, this report will use ETOPS when describing principles that apply to either ETOPS or EDTO.
- See Aerodrome information for definition of an alternate aerodrome.
- The aircraft engines’ FADEC systems record a significant amount of engine data that is not recorded by the aircraft flight data recorder.
- Flight level: at altitudes above 10,000 ft in Australia, an aircraft’s height above mean sea level is referred to as a flight level (FL). FL 380 equates to 38,000 ft.
- This position sets the thrust limit for the engine electronic control.
- A stall in a turbine engine refers to a compressor stall. It is abnormal airflow resulting from the aerodynamic stall of aerofoils (compressor blades) within the compressor. Steady flow through the stages of a compressor occurs within a relatively narrow band of conditions. If the conditions inside a compressor go outside of this band due to an operating condition or a disturbance, the flow around the blades can break down in a manner known as a stall. In this instance, the blades would no longer effectively compress the air. If the breakdown of flow in a compressor stall is significant enough, the pressure change within the engine could be sufficient to reverse the flow through the compressor in a phenomenon known as a ‘surge’. A surge is often associated with a loud bang, or series of bangs, that can be heard in the aircraft.
- For further information concerning the FADEC system, see The Trent 700 engine.
- An internationally recognised radio call announcing an urgency condition, which concerns the safety of an aircraft or its occupants, but where the flight crew does not require immediate assistance.
- ARBEY was a navigation point at the commencement of the standard arrival route into Melbourne from the north.
- Pilot activated lighting is a runway and taxiway lighting system that is activated by a series of timed transmissions made by flight crew using the aircraft’s very high frequency radio on a designated frequency. Alice Springs did not have pilot activated lighting, but instead operated with the runway lights on at night.
- A point along track where the time to fly to Adelaide is equal to the time to fly to Melbourne.
- See The Trent 700 engine.
AirAsia X is a Malaysian company based in Kuala Lumpur, Malaysia, operating under an Air Operator’s certificate issued by the Department of Civil Aviation Malaysia. AirAsia X operates long haul air transportation services throughout the Asia-Pacific region and the Middle East.
The captain held an Air Transport Pilot (Aeroplane) Licence (ATP(A)L), a current Class 1 medical certificate, and was certified in English proficiency at level 5. The captain had accumulated about 8,700 hours of aeronautical experience. Of these, approximately 2,540 hours were on Airbus A330 type aircraft. In the 90 days preceding the occurrence, the captain had logged 244 hours, all of which were on Airbus A330 aircraft.
About 4 months prior to the occurrence, the captain had a recurrent training session in an A330 simulator, and about 10 months prior to the occurrence an annual line check, both of which were completed to a satisfactory standard.
The first officer held an Air Transport Pilot (Aeroplane) Licence (ATP(A)L), a current Class 1 medical certificate, and was certified in English proficiency at level 5. The first officer had accumulated about 3,265 hours of aeronautical experience. About 4 months prior to the occurrence, the first officer had a recurrent training session in an A330 simulator, and about 7 months prior to the occurrence an annual line check, both of which were completed to a satisfactory standard.
The A330 is a twin turbofan engine, medium to long range, wide-body passenger aircraft. Manufactured by Airbus, 9M-XXD was fitted with two Rolls-Royce Trent 700 series engines. The following discussion will cover the:
- electronic centralised aircraft monitor (ECAM)
- Trent 700 engine, including the engine oil system and starter system.
The electronic centralised aircraft monitor
The ECAM monitors the various aircraft systems and displays information about those systems, including the aircraft’s engines, to the flight crew. The components of the ECAM include the:
- flight warning computers (FWC)
- engine/warning and the system display units (Figure 3).
The figure shows the location and exploded view of the engine/warning display and system display, and the master warning/caution lights.
Source: Airbus, modified by ATSB.
When the FWCs detect a system failure, they automatically trigger the appropriate ECAM alert level. That ECAM alert level will result in the display of the ECAM message associated with the condition, the triggering of the alert level’s aural and visual attention-getters, and the display of the required emergency/abnormal procedure as well as the relevant system display.
There were three ECAM alert levels. From a systems perspective:
- The most serious, a level 3 red safety priority alert, denoted a system failure that alters flight safety and required immediate flight crew action.
- A level 2 amber abnormal priority alert denoted a system failure that does not have a direct consequence on flight safety, but required crew awareness. Action in response to a level 2 alert should be taken without delay, time and situation permitting. The required action is displayed as a procedure on the lower left section of the engine/warning display (Figure 3).
- A level 1 amber degradation priority alert required crew awareness and then monitoring.
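The three alert levels can be represented as a small ordered data structure. The following is a minimal illustrative sketch only, not Airbus software; the names `AlertLevel` and `EcamAlert` are hypothetical, and the level assigned to each message is taken from this report.

```python
from dataclasses import dataclass
from enum import IntEnum


class AlertLevel(IntEnum):
    """ECAM alert levels as described in this report (higher is more serious)."""
    DEGRADATION = 1  # amber: crew awareness, then monitoring
    ABNORMAL = 2     # amber: crew awareness, action without delay if time permits
    SAFETY = 3       # red: immediate flight crew action


@dataclass(frozen=True)
class EcamAlert:
    """Hypothetical container pairing an ECAM message with its alert level."""
    message: str
    level: AlertLevel

    @property
    def colour(self) -> str:
        # Per the report, only level 3 alerts are red; levels 1 and 2 are amber.
        return "red" if self.level == AlertLevel.SAFETY else "amber"


# The three ECAM alerts triggered for engine 2 during the occurrence.
occurrence_alerts = [
    EcamAlert("ENG 2 OIL LO PR", AlertLevel.SAFETY),
    EcamAlert("ENG 2 STALL", AlertLevel.ABNORMAL),
    EcamAlert("ENG 2 FAIL", AlertLevel.ABNORMAL),
]

# The most serious alert in the sequence was the first one presented.
most_serious = max(occurrence_alerts, key=lambda alert: alert.level)
```

In this sketch, `most_serious` is the ENG 2 OIL LO PR alert, the only red (level 3) alert in the sequence.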
The Trent 700 engine
The Rolls-Royce Trent 700 engine has three compressor/turbine assemblies, identified as the low-pressure, intermediate-pressure and high-pressure assemblies (Figure 4). The measure of the rotation of these assemblies is displayed as the N1, N2 and N3 values respectively (see the engine/warning and system displays at Figure 3). The high-pressure assembly has an accessory gearbox attached. That gearbox includes components used to start the engine. A full authority digital engine control (FADEC) system controls and manages the engine, and provides engine parameters to the ECAM system.
The cutaway identifies the low-pressure assembly (blue), the intermediate-pressure assembly (yellow), the high-pressure assembly (orange) and the combustion chamber (red).
Source: Rolls-Royce, modified by ATSB.
The engine oil system
The engine’s oil system provides lubrication for engine components. A pressure pump module takes oil from the oil tank and supplies that oil to the engine components at the required pressure. That oil is then returned from the engine components to the oil tank using a scavenge pump module. The oil system uses three sensors to monitor oil pressure: two oil pressure transducers and an oil pressure switch. The transducers provide oil pressure parameters to the ECAM system, which in turn displays them as oil system parameters on the relevant engine systems display. Those transducer parameters are also sent to the FWC. The oil pressure switch provides a signal directly to the FWC in the event of loss of oil pressure. The FWC will generate a low oil pressure message in the event that any two of the three oil pressure sensors indicate low oil pressure.
The Flight Crew Operating Manual (FCOM) contained a section that detailed the limitations contained within the manufacturer’s Aircraft Flight Manual. Limitations attached to the aircraft’s power plant included a minimum oil pressure of 25 psi.
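The two-of-three sensor voting described above can be sketched as follows. This is an illustrative sketch only and not the actual FWC logic: the function and parameter names are hypothetical, and it assumes that each transducer is treated as indicating 'low' when its reading is below the 25 psi limitation.

```python
LOW_OIL_PRESSURE_PSI = 25.0  # minimum oil pressure limitation quoted in the report


def low_oil_pressure_message(
    transducer_a_psi: float,
    transducer_b_psi: float,
    pressure_switch_low: bool,
    threshold_psi: float = LOW_OIL_PRESSURE_PSI,
) -> bool:
    """Return True when at least two of the three sensors indicate low pressure.

    Assumption: a transducer 'indicates low' when its reading is below
    threshold_psi. The real sensor thresholds are not stated in the report.
    """
    votes = [
        transducer_a_psi < threshold_psi,
        transducer_b_psi < threshold_psi,
        bool(pressure_switch_low),
    ]
    return sum(votes) >= 2


# Genuine loss of pressure: all three sensors agree.
genuine_loss = low_oil_pressure_message(0.0, 0.0, True)

# A single faulty transducer reading zero, with the other sensors normal.
single_faulty_sensor = low_oil_pressure_message(0.0, 90.0, False)
```

Under this voting scheme, a single failed sensor would not by itself generate the low oil pressure message, because at least two of the three sensors must agree.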
In a Flight Operations Briefing Notes (FOBN) titled Supplementary Techniques – Handling Engine Malfunctions, Airbus provided basic guidelines to identify engine malfunctions and typical operational recommendations in case of engine malfunctions. The FOBN included the following with respect to low oil pressure and low oil level:
In service experience shows that some rejected takeoffs and in-flight shutdowns have been commanded because of a low oil level. However, a low oil level alone is not a symptom of an engine malfunction.
On the other hand, a low oil pressure is the sign of an imminent engine failure. Therefore, the published procedure must be applied.
The engine starter system
The engine starter system comprises an air turbine starter (starter motor), a starter valve to provide high pressure bleed air to the starter motor, and engine start controls on the flight deck panels. The FADEC controls the system components and the start sequence.
For an inflight relight of the engine, the FADEC will identify whether the relight will require the assistance of the starter motor, or whether the flight conditions are sufficient to enable a windmilling relight. Rolls-Royce advised that if airspeed is greater than 280 kt and N3 is greater than 7 per cent, then a windmill relight will be performed. If either of these two conditions is not met, the FADEC will perform a starter assisted relight. During an inflight relight, the FADEC will:
- provide fuel to the engine when the high-pressure turbine has achieved greater than 10 per cent rotation
- if the starter motor has been engaged, disengage the starter motor when the high-pressure turbine has achieved greater than 50 per cent rotation.
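The relight logic described above can be sketched as follows. This is a simplified illustration of the thresholds advised by Rolls-Royce as quoted in this report, not the FADEC implementation; the function names are hypothetical, and N3 is used as the measure of high-pressure assembly rotation.

```python
WINDMILL_MIN_AIRSPEED_KT = 280.0  # airspeed must exceed this for a windmill relight
WINDMILL_MIN_N3_PCT = 7.0         # N3 must exceed this for a windmill relight
FUEL_ON_N3_PCT = 10.0             # fuel is provided above this N3
STARTER_CUTOUT_N3_PCT = 50.0      # starter motor is disengaged above this N3


def relight_mode(airspeed_kt: float, n3_pct: float) -> str:
    """Select the relight type from airspeed and N3 at the start of the attempt."""
    if airspeed_kt > WINDMILL_MIN_AIRSPEED_KT and n3_pct > WINDMILL_MIN_N3_PCT:
        return "windmill"
    return "starter_assisted"


def relight_state(n3_pct: float, starter_engaged: bool) -> dict:
    """Return the fuel and starter status for a given N3 during a relight."""
    return {
        "fuel_on": n3_pct > FUEL_ON_N3_PCT,
        "starter_engaged": starter_engaged and n3_pct <= STARTER_CUTOUT_N3_PCT,
    }


# Conditions recorded at the start of each relight attempt in this occurrence.
first_attempt = relight_mode(airspeed_kt=258.0, n3_pct=0.0)
second_attempt = relight_mode(airspeed_kt=315.0, n3_pct=0.0)

# Second attempt at 0132:25, when N3 had reached 25 per cent.
state_at_fuel_on = relight_state(n3_pct=25.0, starter_engaged=True)
```

With N3 recorded at 0 per cent before both attempts, the sketch selects a starter assisted relight in each case, including the second attempt where the airspeed exceeded 280 kt. This is consistent with the flight data finding that both attempts were starter motor assisted.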
Rolls-Royce also advised that there was no published data on engine indications for a windmilling engine. The expected N1, N2 and N3 indications would depend on the aircraft’s speed and altitude; however, test data for conditions similar to the first relight attempt suggest that these parameters would be about N1 of 18 per cent, N2 of 6 per cent and N3 of 4 per cent.
Emergency and abnormal procedures
The triggering of an ECAM alert will result in the procedure, which the flight crew are required to complete, being presented on the engine/warning display. Some actions required by the procedures may depend on a precondition. These preconditions are identified by a preceding dot. Procedures can also conclude with associated procedures. Associated procedures are additional non-normal procedures required to be completed as a result of a change in the aircraft’s configuration or status.
The procedures discussion will examine the Airbus ECAM procedures for the ENG OIL LO PR, ENG STALL, ENG FAIL, and the related ENG SHUT DOWN messages. This will be followed by discussion on the LAND ASAP message, engine restart in-flight procedures and the ENG START FAULT alert.
The AirAsia X Flight Crew Operating Manual (FCOM) contained explanations on these procedures. The intent was to explain actions for which the reason is not self-evident, and to provide additional background. As far as practicable, the procedural presentation in the FCOM was identical to that presented by ECAM. The following discussion on specific emergency/abnormal procedures is based on what is displayed to the flight crew on the ECAM, and any other relevant information that is contained in the FCOM.
The AirAsia X Flight Crew Training Manual contained general guidance on the conduct of engine malfunctions. That guidance stated:
Most engine malfunctions are taken into account by one or several ECAM alerts that warn the flight crew and provide the flight crew with the actions to perform. However, some engine malfunctions require some knowledge and the analysis of the flight crew, so that the flight crew can recognize, understand, and manage these engine malfunctions.
When the flight crew identifies an abnormal parameter, the flight crew should use all the information available to analyze the engine malfunction. The flight crew should not consider only this abnormal parameter to perform their analysis.
If possible, the flight crew should keep the engine running in flight. Except if a procedure requires an engine shutdown, it is usually preferable to keep the engine running. Even at idle, the engine powers the hydraulic, electric, and bleed systems.
In addition, if the flight crew is not sure which engine has a malfunction, the flight crew should keep the engines running. If really damaged, the affected engine will eventually fail.
The ENG OIL LO PR ECAM procedures
If engine oil pressure drops below 25 psi, the ECAM will trigger the ENG 1(2) OIL LO PR (engine oil low pressure) level 3 alert, with an associated aural alert. The displayed ECAM procedure was:
THR LEVER (AFFECTED ENGINE).................................................IDLE
● IF WARNING PERSISTS:
ENG MASTER (AFFECTED ENGINE)..............................................OFF
Selecting the ENG MASTER switch to OFF will shut the engine down. If the flight crew select the ENG MASTER to OFF, the ECAM identified the ENG 1(2) SHUT DOWN (engine shutdown) procedure as an associated procedure.
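The structure of this procedure can be sketched as a condition-based sequence. This is an illustration only, with a hypothetical function name; it reflects the reading that this report attributes to Airbus, in which 'if warning persists' refers to the warning condition still being present after the thrust lever is set to idle, not to a period of elapsed time.

```python
from typing import List


def eng_oil_lo_pr_actions(warning_active_after_idle: bool) -> List[str]:
    """Return the ordered actions for the ENG OIL LO PR procedure.

    warning_active_after_idle: whether the low oil pressure warning is still
    displayed once the thrust lever has been retarded to idle. No timer is
    involved in this reading of the precondition.
    """
    actions = ["THR LEVER (AFFECTED ENGINE): IDLE"]
    if warning_active_after_idle:
        actions.append("ENG MASTER (AFFECTED ENGINE): OFF")
        actions.append("ASSOCIATED PROCEDURE: ENG SHUT DOWN")
    return actions


# In this occurrence the warning remained displayed after the lever was retarded.
actions_for_occurrence = eng_oil_lo_pr_actions(warning_active_after_idle=True)
```

In this sketch the shutdown step follows directly from the persisting warning, with no monitoring period between the two actions.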
The flight crew delayed completing the procedure while they analysed the oil system parameters and the likely reason for the alert. After about 3.5 minutes, the flight crew advanced the thrust lever, and about 40 seconds later, the engine stalled.
The ENG STALL ECAM procedures
An engine stall will trigger a level 2 alert, which is notified to the flight crew by the ENG 1(2) STALL (engine stall) ECAM message and associated aural alert. The displayed ECAM procedure is:
THR LEVER 1(2)....................................................................................IDLE
ENG 1(2) PARAMETERS.......................................................................CHECK
● IF ABNORMAL:
ENG MASTER 1(2).................................................................................OFF
If the flight crew select the ENG MASTER to OFF, the engine shutdown procedure is identified as an associated procedure.
The FCOM guidance for the engine stall procedure provided additional detail on the indications of a stalled engine, and also stated that an engine restart is at the flight crew’s discretion.
The flight crew reacted to the stall condition in accordance with the procedure; however, about 30 seconds later the engine stalled again and then failed.
The ENG FAIL ECAM procedures
When the engine’s core speed is below idle while the engine MASTER switch is ON, the ECAM will trigger a level 2 alert, displaying the ENG 1(2) FAIL (engine fail) ECAM message with an associated aural alert. The displayed ECAM procedure is:
ENG START SEL.........................................................................................IGN
THR LEVER (AFFECTED ENGINE)...........................................................IDLE
● IF NO ENG RELIGHT AFTER 30 S:
ENG MASTER (AFFECTED ENGINE)........................................................OFF
● IF DAMAGE:
● IF NO DAMAGE:
ENG (AFFECTED) RELIGHT........................................................................CONSIDER
The procedure required the flight crew to determine whether the engine is damaged. The FCOM provided guidance on indications of damage to an engine. The FCOM procedure stated that:
Engine damage may be accompanied by:
● Significant increase in aircraft vibrations and/or buffeting
● Repeated, or uncontrollable engine stalls
● Associated abnormal indications, such as hydraulic fluid loss, no N2 or N3 indication.
Placing the ENG MASTER switch to OFF would result in the engine being shut down. This would trigger the ENG 1(2) SHUT DOWN ECAM, which was also identified as an associated procedure. Appended to the end of the ENG 2 FAIL procedure, as part of the notice that the associated procedure was the ENG SHUT DOWN procedure, was the following additional information:
Apply the ENG SHUT DOWN procedure … if damage, or if engine relight is unsuccessful.
The procedure included a restart (relight) procedure as its first procedural action: the selection of the ENG START SEL (the engine start selector switch) to IGN (the ignition position). The flight crew shut the engine down in accordance with the engine failure procedure.
The ENG SHUT DOWN ECAM procedures
When an engine was shut down, the amber LAND ASAP (land as soon as possible) and ENG 1(2) SHUT DOWN ECAM messages were displayed. The engine shut down procedure was designed to place the aircraft in a configuration that enabled single engine operation.
The shut down procedure does not include an option to, or guidance to consider, restarting the engine.
LAND ASAP message
The FCOM included definitions for the LAND ASAP ECAM message:
- LAND ASAP (red). Land as soon as possible at the nearest airport at which a safe landing can be made. Note: LAND ASAP (red) information is applicable to a time-critical situation.
- LAND ASAP (amber). Consider landing at the nearest suitable airport. The definition included a note, stating that the suitability criteria should be defined in accordance with the operator's policy.
Inflight engine relight
There was no ECAM displayed procedure for an in-flight engine relight. The FCOM and quick reference handbook (QRH) contained the ENG RELIGHT (IN FLIGHT) procedures.
Rolls-Royce advised that there is normally a delay of around 7 seconds between initiation of an inflight relight—through selection of the ENG MASTER switch to ON—and the first indication of rotation for N3.
The ENG START FAULT ECAM alert
The ENG 1(2) START FAULT level 2 alert was triggered when one of a number of conditions was detected during an engine start, including:
- exceeding the starter time
- an engine stall
- the engine’s exhaust gas temperature exceeds the limit
- no light up
- low N1 speed.
The ECAM procedure associated with the alert is not provided in this report, as it is not relevant for the event’s investigation.
The flight crew’s first attempted relight occurred about 13 minutes after the engine failed. The second relight occurred about 90 minutes later, just before the aircraft commenced its descent into Melbourne.
For the first relight, the airspeed and starter motor were unable to provide sufficient N3 rotation to trigger the FADEC to provide fuel to the engine. As a result, light up did not occur within the required time. This was the probable cause of the ENG START FAULT ECAM alert.
For the second relight, the airspeed and starter motor were able to provide sufficient N3 rotation to enable fuel flow and light off, but due to the deteriorated state of the engine the N3 did not increase sufficiently to enable disengagement of the starter motor until the flight crew stopped the relight through selecting the ENG MASTER switch to OFF.
Rescue and fire fighting services
The rescue and fire fighting services (RFFS) required for various aerodromes were established through the International Civil Aviation Organization (ICAO), 2016 Annex 14 to the Chicago Convention, titled Aerodrome Design and Operations. The principal objective of RFFS is to save lives in the event of an aircraft accident or incident occurring at, or in the immediate vicinity of, an aerodrome. An aerodrome’s required RFFS was based on a number of criteria, including the overall length and the fuselage width of the aircraft that predominantly use the aerodrome, and a threshold value in the number of movements of the highest category of aircraft. The various levels of RFFS category (CAT) establish a requirement for, among other things, the types and number of rescue and firefighting equipment required to be available at that aerodrome. Essentially, the larger the aircraft, the higher the required RFFS CAT. The highest category was CAT 10.
Annex 14 also included a standard requiring that all aerodromes provide RFFS. Australia had a registered difference against this standard. That difference stated that not all international alternate aerodromes had RFFS. That difference also listed eight aerodromes where this was the case. Alice Springs was not included within that list.
The AIP listed Alice Springs as a designated alternate airport to international airports. A designated alternate airport was defined as:
... an airport specified in the flight plan to which a flight may proceed when it becomes inadvisable to land at the airport of intended landing.
For the day of the occurrence, Alice Springs tower operated from 0830 to 1830 EST, with associated class C and D airspace being active. The occurrence happened outside of these hours, when the airspace at and below FL 180 became class G. RFFS was CAT 7 from 0745 to 1830. At the time of the occurrence, Alice Springs Tower was not manned and RFFS was not available. The NOTAMs contained in the OFP identified the correct RFFS availability.
Airservices Australia provided advice to the operator which stated that ‘outside the advertised operational hours RFF can be at Category 6 within 30 minutes notice in case of emergency’. This out of hours RFF support was provided by off-aerodrome municipal emergency services.
Neither pilot had operated into Alice Springs.
Adelaide operated a curfew from 2330 through 0630 EST. The aerodrome was available for nomination as a planned or unplanned alternate during the curfew. Specifically, curfew restrictions did not apply in a number of circumstances, including when the pilot of the aircraft had declared an in‑flight emergency, or when there was an urgent need for the aircraft to land to ensure the safety or security of the aircraft.
RFFS was CAT 9 from 0630 to 2330, and CAT 5 at other hours. However, on request and with one hour’s notice, RFFS could be raised to CAT 9. The NOTAMs contained in the OFP identified the correct RFFS availability.
The captain stated that he had operated into Adelaide twice and was familiar with that aerodrome, but was more familiar with Melbourne.
Melbourne did not operate a curfew. RFFS was CAT 10 with 24-hour availability.
The occurrence involved an engine failure while the aircraft was operating as an extended range operations (ETOPS) flight. An ETOPS segment commenced when the flight was more than the threshold time, 60 minutes at the engine out cruise speed, from an authorised ETOPS alternate aerodrome. The occurrence flight had a maximum authorised diversion time of 120 minutes, that is, the flight was approved to operate no further than 120 minutes from any authorised ETOPS alternate aerodrome. The flight’s operational flight plan (OFP) listed three alternate aerodromes. Alice Springs and Darwin were two of those alternate aerodromes. In-flight, the flight crew were able to use a number of approved ETOPS alternate aerodromes, including Adelaide and Melbourne, should they be required for use. At the time of the engine failure, the aircraft was approaching Alice Springs and, at the relevant ETOPS planning speed, was about:
- 30 minutes from Alice Springs
- 75 minutes from Adelaide
- 115 minutes from Melbourne.
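The relationship between these diversion times and the two ETOPS time limits can be checked with simple arithmetic. The sketch below is illustrative only. It uses the approximate times listed above, the 60-minute threshold time and the 120-minute maximum authorised diversion time, and it assumes that the three aerodromes listed are the only alternates under consideration. The function and variable names are hypothetical and are not taken from any operator or manufacturer tool.

```python
# Approximate diversion times (minutes) at the relevant ETOPS planning speed,
# as listed above for the time of the engine failure.
DIVERSION_TIME_MIN = {
    "Alice Springs": 30,
    "Adelaide": 75,
    "Melbourne": 115,
}

THRESHOLD_TIME_MIN = 60       # ETOPS segment begins beyond this time from every alternate
MAX_DIVERSION_TIME_MIN = 120  # maximum authorised diversion time for the occurrence flight


def nearest_alternate(times):
    """Return the (name, minutes) pair with the shortest diversion time."""
    return min(times.items(), key=lambda item: item[1])


def in_etops_segment(times, threshold=THRESHOLD_TIME_MIN):
    """True only if every alternate considered is further away than the threshold time."""
    return all(minutes > threshold for minutes in times.values())


def within_authorised_area(times, limit=MAX_DIVERSION_TIME_MIN):
    """True if at least one alternate is within the maximum diversion time."""
    return any(minutes <= limit for minutes in times.values())
```

With the figures above, `nearest_alternate(DIVERSION_TIME_MIN)` returns Alice Springs at 30 minutes, `in_etops_segment(DIVERSION_TIME_MIN)` is false because Alice Springs is within the 60-minute threshold, and all three aerodromes are inside the 120-minute limit.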
The following discussion on operational considerations around the engine failure and subsequent actions includes:
- an examination of Malaysian and Australian legislation, regulation and/or standards relevant to:
- preflight planning requirements for international operations, and in particular alternate aerodrome requirements, for both normal operations and ETOPS
- in-flight requirements following an engine failure
- an overview of the operator’s operations manual suite
- the operational flight planning requirements as contained within the operator’s operations manual
- the inflight requirements as contained within the operator’s operations manual.
Malaysian and Australian regulatory requirements
The operator was registered in Malaysia and required to meet Malaysian regulations for the conduct of operations and the occurrence flight. The flight and the occurrence, however, were over Australian territory and therefore Australian law applied. It is therefore necessary to consider the effect of both states’ legal and regulatory systems on the planning and conduct of the occurrence flight.
The Malaysian legislative and regulatory system that applied to civil aviation consisted of the Civil Aviation Act 1969 (Malaysia), the Civil Aviation Regulations 2016 (Malaysia) and subordinate rules, standards, requirements and procedures. The following points are relevant:
- An operator was required to submit an operations manual to the regulator for approval.
- The regulator was required to approve ETOPS operations. That approval included relevant procedures, the authorised ETOPS routes, and alternate aerodromes available to be used.
- The Malaysian regulatory system applied to Malaysian registered aircraft extra-territorially.
A Malaysian rule that came into effect shortly after the occurrence required an operator to ensure that an aerodrome of operation met the relevant RFFS category. The rule, however, enabled a Malaysian operator to use another state’s RFFS requirements when operating in that state’s flight information region, when those RFFS requirements differed from Malaysian requirements.
The Civil Aviation Act 1988 (Australia) required international aircraft operating into or from Australia to have permission from the Civil Aviation Safety Authority (CASA). At the time of the occurrence, AirAsia X was operating under a Foreign Aircraft Air Operator’s Certificate (FAAOC). Civil Aviation Order (CAO) 82.0 applied a number of conditions to Air Operator’s Certificates authorising, among other things, regular public transport operations. CASA advised that CAO 82.0 was not applicable to foreign registered aircraft. CASA also advised that, in accordance with the Chicago Convention, AirAsia X’s operations manuals, as authorised by the Department of Civil Aviation Malaysia (DCAM), were accepted by CASA for the purposes of issuing the FAAOC.
The Operations Manual suite
The operations manual (OM) suite, which was approved by the DCAM, contained the operator’s policies, instructions and procedures necessary for flight operations. There were four parts to the suite:
- Part A General/Basic (OMA) that contained non-type related operational policies, instructions and procedures.
- Part B Aeroplane Operating Manual (OMB) that contained all of the type related airplane operating manuals.
- Part C Route and Airport (OMC) that contained route and airport information.
- Part D Training (OMD) that contained training related information.
The OMA contained ‘non-type related operational policies, instructions and procedures that are needed for a safe operation’. OMA Chapter 8 contained matters directly dealing with aircraft operations. That chapter contained a number of subsections, three of which were relevant for the occurrence event—flight preparation, flight procedures and ETOPS.
The OMC also contained material that supplemented the requirements in the OMA.
The flight preparation section included criteria for determining the usability of aerodromes. The usability of an aerodrome was based on whether that aerodrome met:
- certain physical and supporting infrastructure requirements, that is, it met the criteria for being an adequate airport
- weather requirements at the projected time of use, that is, being an adequate airport, the aerodrome also met the additional weather criteria for being a suitable airport.
An adequate airport was one that:
- met the required runway characteristics, including sufficient length and performance requirements
- had sufficient supporting infrastructure, including lighting, communications and navigation aids
- had the rescue and firefighting services (RFFS) required for the type of operation that it was to be used for
- had a pavement strength compatible with the aircraft weight.
The adequate airport criteria also contained a limitation regarding operations over remote areas. That limitation required that a two-engine aircraft operating over remote areas not be flown more than 60 minutes:
- from an adequate airport
- at the one engine inoperative cruising speed
- where weather conditions are forecast at or above the applicable landing minima at the expected time of arrival.
The exception to this limitation was when operating in accordance with ETOPS criteria.
The adequate airport criteria also contained the operator’s required RFFS categories for various types of operations with their A330 aircraft. The A330 had an RFFS requirement for CAT 9. That RFFS requirement could be reduced, based on the type of operation that a particular adequate airport was to be used for. These reduced CAT requirements were:
- CAT 8 for a departure or destination airport
- CAT 7 for an en route alternate airport
- CAT 4 for an ETOPS suitable airport.
Finally, the adequate airport requirements also included a statement that:
- ‘RFF[S] category required for ETOPS and Adequate alternates is 4’. This statement contradicted the CAT 7 requirement for an en route alternate airport. The operator advised that this statement should have read ‘RFF[S] category required for ETOPS Adequate alternates is 4’.
- if during flight the aircraft captain becomes aware of an RFFS category downgrade, the captain may either divert or elect to accept an RFFS of no lower than CAT 4 and continue the flight.
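The RFFS requirements above amount to a lookup of the minimum category by intended airport use. The following is a minimal sketch of that comparison, not the operator's procedure. The dictionary keys, the function name and the `None` convention for unavailable RFFS are assumptions made for illustration. The notified values are those described earlier in this report for the time of the occurrence, and they ignore the on-request arrangements (CAT 6 within 30 minutes at Alice Springs, CAT 9 with one hour's notice at Adelaide).

```python
# Minimum RFFS category the operator required for the A330, by intended airport use.
# Values are as described above; the keys are illustrative labels only.
REQUIRED_RFFS_CAT = {
    "baseline": 9,
    "departure_or_destination": 8,
    "en_route_alternate": 7,
    "etops": 4,
}


def meets_rffs(available_cat, use):
    """Return True if the available RFFS category satisfies the requirement for `use`.

    `available_cat` is None when no RFFS is available (an assumed convention).
    """
    if available_cat is None:
        return False
    return available_cat >= REQUIRED_RFFS_CAT[use]


# RFFS notified for the time of the occurrence, without any on-request upgrade.
NOTIFIED_RFFS = {"Alice Springs": None, "Adelaide": 5, "Melbourne": 10}

for airport, cat in NOTIFIED_RFFS.items():
    print(airport, meets_rffs(cat, "en_route_alternate"), meets_rffs(cat, "etops"))
```

Under these assumptions, only Melbourne meets the CAT 7 requirement for an en route alternate, while Adelaide and Melbourne both meet the CAT 4 requirement that applied to ETOPS.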
A suitable airport was a departure, destination or alternate airport that met specific weather criteria at the time of the operation. The reported weather conditions at Alice Springs, Adelaide and Melbourne met the requirements for a suitable airport.
The OMA section on flight procedures included policy and procedure for handling an abnormal/emergency condition. Included within that material were the inflight requirements following an engine failure, which stated:
As a general rule if the reason for the engine failure cannot be clearly identified (ice, heavy rain, turbulence, etc.) then it shall not be restarted unless a greater emergency exists…
Two Engine Aircraft: The Commander/PIC shall: …
– Divert to the nearest suitable airport that is safe and operational.
– When two or more suitable airports are available then the nearest airport in flight time terms should be considered…
The following elements and others which may be relevant, should be considered, to determine whether an airport is suitable or not:
• Aircraft configuration and performance, current weight, systems status, required fuel versus usable fuel available, wind, weather, terrain, minimum altitudes, runway dimension, surface condition, braking devices available, navaids for approach and lighting available, RFF category at diversion airport and any injuries to any person or persons on board.
The Commander/PIC is expected to divert to the nearest suitable airport if operationally possible otherwise, he is expected to state his reason or reasons for the exceptional circumstances leading to him not to divert to the nearest suitable airport.
The two restart attempts by the flight crew appear to be contrary to the engine failure rule stated in the flight procedures section of the OMA.
Following the engine failure, the flight procedures policy required a diversion to the nearest suitable airport.
The flight preparation section of the OMA detailed the criteria for a suitable airport. Those criteria included a requirement for RFFS for an en route alternate to be CAT 7. The aircraft captain stated that the decision to divert to Melbourne was based, in part, on the RFFS available at Adelaide, notified as being CAT 5. This was below that required for an en route alternate.
The flight crew reported that two elements that supported the decision to divert to Melbourne were passenger wellbeing and easier recovery of the aircraft. Both elements have an apparent commercial nature, but neither commercial considerations nor these specific elements are included in the determination of the suitability of an airport for diversion.
The OMA separated out all material pertaining to ETOPS into its own section. This section was broken down into a number of sub-sections that covered topics including general information, ETOPS specific definitions, dispatch requirements, normal and non-normal procedures. Regarding the scope of ETOPS policy, the introduction to ETOPS included the following statement:
The policies contained in this section are to be applied over and above AAX’s Flight Operations Standard Operating Policy when operating any of the specified ETOPS routes.
The definitions sub-section contained definitions specifically applicable to ETOPS. For the purposes of the occurrence, the following were relevant:
- ETOPS operations. The definition stated the following:
ETOPS operations apply to all flights conducted in a twin-engine aircraft over a route that contains a point further than 60 minutes flying time from an ADEQUATE airport at the approved one-engine-out diversion cruise speed schedule in still air and ISA conditions.
- Adequate airport. Otherwise known as an ETOPS adequate airport, the definition identified a number of variations or additions to the adequate airport criteria, as stated in the flight preparations section. These included:
- the requirement for DCAM authorisation for an ETOPS alternate
- the setting aside of runway pavement requirements
- the minimum acceptable RFFS as being CAT 4
- that remote airports that have reduced or no RFFS capacity could be used, where the minimum capacity is met by municipal fire departments located off-airport. Alice Springs was specifically identified as being an example of this type of remote airport. The operator stated that Airservices Australia had advised that Alice Springs municipal fire services met the RFFS CAT 4 requirements.
- Suitable Airport. Otherwise known as an ETOPS suitable airport, there were variations in the weather requirements for a suitable airport, but these were not applicable for the occurrence.
- Maximum Diversion Time. This is the maximum flying time authorised from any point of the route to the nearest adequate airport. The definition included the following:
AAX is approved by DCAM for 120/180 minutes maximum diversion time. Refer to specific aircraft Operations Specifications (OPS SPECS).
It is only used for determining the area of operation, and therefore is not an operational time limitation for conducting a diversion, which has to cope with the prevailing weather conditions.
- Maximum Diversion Distance: This definition applied a standard one engine out diversion profile and reference weight, resulting in still air standard distances for various approved diversion times. The following were applicable:
- 60 minutes. Also referred to as the threshold time, the equivalent distance was 430 NM (see Figure 5 for the application of this threshold time for the occurrence flight).
- 120 minutes. The equivalent distance was 823 NM. The OFP stated that the occurrence flight was an ETOPS flight limited to 120 minutes.
Figure 5: The flight track of XAX221 is displayed, as are 430 NM range circles from the company preferred alternates of Melbourne, Adelaide and Alice Springs aerodromes. The positions of the ENG2 OIL LO PR warning and the two engine relight attempts are also shown.
Source: Google earth, modified by ATSB.
- Area of operation: Also known as the ETOPS area of operation, this was defined as follows:
The ETOPS Area of Operation is the area in which it is permitted to conduct a flight under the ETOPS regulations. It is defined by the declared maximum diversion distance from an adequate airport (or set of adequate airports), and is represented by the area enclosed within the circles centred on the selected adequate airports, the radius of which is the declared maximum diversion distance.
The OMA separately listed Australia as an ETOPS area of operations.
- ETOPS entry point (EEP). The point located on the aircraft’s outbound route, 60 minutes from the last adequate airport. It marks the beginning of the ETOPS segment.
- ETOPS exit point (EXP). The point located on the aircraft’s route where the aircraft, having been flying in the ETOPS segment, enters an area within the threshold time of an adequate aerodrome. It marks the end of the ETOPS segment.
- ETOPS segment. Defined as follows:
The ETOPS segment (ETOPS area of operations) starts at the EEP and finishes at the exit point (EXP) when the flight path is back and remains within the 60 minutes area from an ADEQUATE airport. An ETOPS route can contain several successive ETOPS segments well separated from each other.
For a graphical presentation of the EEP, EXP and the ETOPS segment, see Figure 6.
Source: AirAsia X.
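The EEP, EXP and ETOPS segment definitions can be read as a simple scan along the route. The sketch below assumes the route is sampled at discrete points, each with a flying time to the nearest adequate airport, so the entry and exit points are only located to the nearest sample. The function name and the sample values are hypothetical and are not data from the occurrence flight.

```python
def etops_segments(minutes_to_nearest_adequate, threshold=60):
    """Return (entry_index, exit_index) pairs for each ETOPS segment.

    `minutes_to_nearest_adequate` holds one flying time per route sample.
    A segment starts at the first sample beyond the threshold (approximating
    the EEP) and ends at the first sample back within it (approximating the EXP).
    """
    segments = []
    entry = None
    for i, minutes in enumerate(minutes_to_nearest_adequate):
        if entry is None and minutes > threshold:
            entry = i
        elif entry is not None and minutes <= threshold:
            segments.append((entry, i))
            entry = None
    if entry is not None:
        # The route ends while still beyond the threshold: no exit point found.
        segments.append((entry, None))
    return segments


# Hypothetical route samples in minutes, chosen to show two successive segments.
sample = [20, 45, 70, 95, 80, 55, 30, 65, 90, 50]
print(etops_segments(sample))  # [(2, 5), (7, 9)]
```

The two pairs returned for the sample route correspond to the statement that an ETOPS route can contain several successive ETOPS segments.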
The dispatch requirements sub-section included guidance on the content and structure of an ETOPS OFP. As part of ETOPS dispatch requirements, the operator was required to keep a list of approved ETOPS routes with the corresponding en route alternates. The OM suite did not identify any specific ETOPS routes, but instead listed ETOPS areas of operations, which included Australia. The en route alternates were listed separately in the OMC, in a section titled Company Preferred Alternates.
The normal procedures sub-section included procedures relating to preflight cockpit preparation, taxiing, specific inflight procedures such as after airborne communications with dispatch, renomination of en route alternates, fuel and weather update requirements, and various procedures relating to whether the aircraft can meet the specific requirements for ETOPS flight prior to entering an ETOPS segment.
The non-normal procedures sub-section contained the following relevant content:
- The Airbus recommendations and guidelines for inflight re-routing or diversion decision making contained in the flight crew operating manual (FCOM) were referenced.
- A section titled Failure Cases Requiring a Diversion to the Nearest Airport stated:
In cases leading to a LAND ASAP message on ECAM or QRH, the crew are to follow the ECAM procedures and land at the nearest suitable airport. LAND ASAP in RED requires greater urgency than LAND ASAP in AMBER.
- There were also procedures applicable to various systems failures that contained specific guidance prior to the ETOPS segment and during the ETOPS segment.
The operator advised that, as the aircraft had not entered an ETOPS segment, ETOPS policy and procedures did not apply to this particular phase of the aircraft’s flight.
There are a number of items from the ETOPS subsection of the OMA which indicate that ETOPS policy and procedures did apply at the time of the engine failure:
Of note, the LAND ASAP ECAM procedure applied to ETOPS operations, not all operations.
Company preferred alternates
The OMC contained criteria used to select company preferred alternates. The selection criteria for a company preferred alternate included 24 hour operations (unless otherwise stated) and an RFFS of CAT 7 or better (CAT 4 or better for ETOPS).
The OMC also contained a list of preferred alternate aerodromes. The intent of the list was to assist pilots in the selection of an aerodrome for a safe landing when circumstances necessitated, such as a change from a planned en route alternate. The list of company preferred alternates included:
- Alice Springs, with a stated RFFS of CAT 6, and also an indication that additional information regarding operating time/hours applied
- Adelaide, with a stated RFFS of CAT 9 during published normal hours and CAT 5 at other times.
That list also included Melbourne, Avalon and Sydney. The OMC also contained the following statement:
There is nothing to prevent pilots from considering airports that are not listed for diversion. However it is the responsibility of the Commander to ensure that the aircraft performance requirements are met and the deviation from the prescribed criteria is justified under the circumstances.
The following summarises the METARs covering the period 2330 to 0330:
- Alice Springs: visibility was greater than 10 km, the wind was light and variable, temperature was around 12 °C and dew point 0 °C, with QNH 1021.
- Adelaide: CAVOK with the wind light and variable, temperature around 13 °C and dew point 10 °C, QNH 1021, with a trend indicating no significant change.
- Melbourne: CAVOK with a strong northerly wind of 20 kt that decreased over the period, temperature around 13 °C and dew point 9 °C, QNH 1019. The METARs also included an observation of moderate to severe turbulence below 5,000 ft.
The ATIS for the aircraft’s arrival at Melbourne stated that the wind was 350 degrees at 16 kt, and the weather conditions were CAVOK.
The aircraft was fitted with a flight data recorder and cockpit voice recorder as required by regulation. The cockpit voice recorder was not obtained by the ATSB—the event would have been overwritten due to the length of flight. Additional data from the aircraft’s FADECs was obtained from the engine manufacturer.
Post flight examination of the engine
Examination by the engine manufacturer
Rolls-Royce conducted an investigation into the engine failure, which included an examination of the affected engine. That examination found the following:
- there were no obvious engine oil leaks
- oil tank contents were full
- the engine magnetic chip detectors and oil filters were clean
- the oil pressure pump drive shaft neck was fractured (Figure 7)
- the high pressure (HP) assembly was seized due to bearing stress
- damage to the HP assembly ball and roller bearings was consistent with engine operation without oil pressure.
On examination of the failed shaft, Rolls-Royce identified that the failure occurred at the shear neck, but that the failure was not typical of any previous shear neck failures (Figure 8). Rolls‑Royce found that the shaft fracture was the result of fatigue cracking that originated at multiple sites within the shaft bore. Those cracks grew over time, weakening the shaft, until the remaining material failed in overstress. The material and shaft dimensions reportedly complied with specifications. Rolls-Royce was not able to determine what generated the fatigue cracks in this pump shaft. Other than the fracture in the shaft, the pump was in a serviceable condition.
Note that the fracture surface is perpendicular to the shaft axis.
The Rolls-Royce investigation concluded that damage to the HP assembly bearings was consistent with engine operation without oil pressure. Rolls-Royce also concluded that the surges were consistent with the expected behaviour of an engine with the identified damage to the HP bearings.
An analysis of the relight attempts was also completed. The analysis stated that during the:
- first relight attempt, the engine did not reach the FUEL ON condition, and therefore fuel flow remained at zero throughout the relight attempt.
- second relight attempt:
- at 0132:25, N3 exceeded the FUEL ON condition and fuel was introduced into the engine
- 11 seconds later the engine’s EGT began to increase, indicating light up
- N3 did not increase above the 50 per cent threshold required for starter disengagement, the highest recorded value being 43 per cent
- oil pressure for the engine remained at zero throughout both relight attempts.
Oil pressure pump history
Rolls-Royce reported that the oil pressure pump had completed 2,250 cycles and 13,597 hours since new. It had recently been removed from the engine and sent to the pump manufacturer for inspection and rebuild as part of checks for another engine issue. The pump shaft had been subjected to, and passed, non-destructive testing as part of the rebuild. Since being refitted to the engine, it had completed a further 20 hours and 6 flights. Rolls-Royce did not identify any significant events during the engine history that would have affected the oil pressure pump’s integrity.
The oil pressure pump model is common to both the Trent 700 and the Trent 800 (fitted to Boeing 777 aircraft). Rolls-Royce reported that in an initial review of in-service experience, it identified two previously reported cases of pump failure. The first was on a Trent 800, with approximately 3,000 hours since overhaul, and the second was on a Trent 700, with 7,730 hours since new. In both cases, the failure was a result of pump bearing seizure, and the fracture was perpendicular to the axis of the shaft, consistent with that shown in Figure 8.
In an article in the Airbus safety magazine Safety First, Airbus (2007) discussed the issue of trained flight crew not following procedures. Airbus identified that procedural compliance required not only good procedural design, but also appropriate explanations to support the procedure. Appropriate explanations ensure that flight crew have sufficient confidence in their skills and judgment to manage the situation. Airbus argued that mismatches between procedures (an instruction) and their implementation (an action) can be a function of human performance, which is not stable. Aside from factors such as fatigue, workload and stress, the implementation of procedures requires:
- understanding the situation
- understanding the procedure and its meaning
- ensuring that all pre-conditions are checked
- anticipating the expected results
- ensuring that all actions required by the procedures are performed in the right order, with good judgment and with good synchronisation between crewmembers.
Airbus concluded that no set of procedures can substitute for human intelligence and flight experience. Safety is a function of a safe aircraft, procedures, and flight crew competence as an ability to manage the expected and unexpected.
Ambiguity in procedural construction
As pointed out by Airbus, good procedural design is a critical component of the safety equation. A factor influencing checklist performance is ambiguity in the terms listed (Degani and Wiener, 1993). Once the appropriate procedure has been found (de Britto, 1998):
- the crew have to understand the content
- then plan the necessary actions
- and finally execute actions planned.
One type of understanding required is a literal understanding of the items. A literal understanding can be affected by ambiguity in the English language, particularly when the crew are from non-English speaking countries (de Britto, 1998; Burian, 2006).
To improve the effectiveness and clarity of technical documentation, the AeroSpace and Defence Industries Association of Europe published a guide on simplified English for use in technical publications (ASD-STE100). That guide included a dictionary that identified words that should be replaced with another word that provided a clearer meaning. The guide identified the verb ‘persist’ as one such word. It recommended that persist be replaced by ‘continue’. Airbus stated that it has an internal lexicon which is used to construct procedures displayed to flight crew through the ECAM. The operator stated that it did not hold a copy of that lexicon.
The ECAM procedure for the ENG 1(2) OIL LO PR contained the precondition ‘IF WARNING PERSISTS’ before shutting the engine down. ‘Persist’, however, could refer to either a:
- temporal sense, such that if the warning exists for a length of time, then the ENG MASTER switch is to be selected to OFF; however, the length of time was not specified in the procedure
- change in the way that the condition presents itself, requiring a response to a change or absence of change—specifically the ECAM message goes away and the associated alerts cease.
The flight crew appear to have interpreted the precondition as a temporal change, as they stated that:
- all other engine indications were normal after the thrust lever was retarded
- they monitored the engine and then some minutes later advanced the thrust lever to troubleshoot the alert.
Rolls-Royce Trent 700 engine operating instructions
The Rolls-Royce engine operating instructions identified that the ENG X OIL LO PR message was a level 3 alert that triggered in response to the indicated oil pressure being below 25 psid. The recommended procedure in flight was:
— THR LEVER X REDUCE
Reduce the thrust on the affected engine until the 'OIL LO PR' ECAM warning clears.
Continued operation of the affected engine is acceptable providing this thrust level is not exceeded.
If warning has not cleared even when the IDLE setting has been achieved.
— ENG MASTER X OFF
Airbus have used the following procedure in the A330 FCOM as an equivalent procedure to the Rolls-Royce procedure:
THR LEVER (AFFECTED ENGINE)...........................................................IDLE
● IF WARNING PERSISTS:
ENG MASTER (AFFECTED ENGINE)........................................................OFF
This would indicate that Airbus have intended the use of ‘persists’ to mean ‘if warning has not cleared’. In common usage, however, ‘persists’ carries a temporal sense.
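The conditional reading of the precondition can be made concrete with a small sketch. The Python below is purely illustrative and is not Airbus or Rolls-Royce logic; the function name and the `warning_active_at_idle` flag are hypothetical. It shows that, under the conditional reading, the shutdown step depends only on whether the warning is still displayed once idle has been selected, with no elapsed-time input.

```python
def oil_lo_pr_actions(warning_active_at_idle: bool) -> list[str]:
    """Return the procedure steps under the conditional reading.

    Hypothetical illustration only: the decision depends on the state of the
    warning once idle thrust is set, not on how long it has been displayed.
    """
    # First step is unconditional: reduce the affected engine to idle.
    actions = ["THR LEVER (AFFECTED ENGINE) IDLE"]
    # 'IF WARNING PERSISTS' read as 'if warning has not cleared'.
    if warning_active_at_idle:
        actions.append("ENG MASTER (AFFECTED ENGINE) OFF")
    return actions
```

Under this reading there is no monitoring interval to choose: a warning that is still present at idle leads directly to the shutdown step.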
Airbus comments regarding the occurrence
Airbus provided the following comments regarding the flight crew’s response to the ENG OIL LO PR alert:
The ASD-STE100 standard proposes to use the ‘continue’ instead of ‘persist’. Having discussed with our native English speaking test pilots, such wording may also be interpreted by flight crews as a temporal requirement, as it is a synonym of ‘persists’.
In the controlled language used by Airbus, the word ‘persist’ is authorized and has the following meaning : "To continue to exist especially past a usual, expected, or normal time."
The ENG 1 (2) OIL LO PR alert is a red warning. As stated in the FCOM, such configuration/failure requires an immediate crew action, as this system failure may be altering the flight safety. Even if the flight crew misinterpreted the procedure line to be a temporal condition and not a conditional one, monitoring the engine for more than 3 minutes cannot be considered to be an appropriate crew response to an emergency procedure.
The AAIB UK analysis of previous events where spurious OIL LO PR alerts were generated shows that the majority of flight crews will consider such alert as genuine and therefore will shutdown the engine.
Human performance related information
Errors are the result of actions that fail to generate the intended outcomes. The cognitive processes involved in achieving the goal provide a means of categorising error, and may relate to either the planning or execution of the activity. When there is a prior intention to act and the actions proceeded as planned, but the desired result is not achieved, that error is identified as a mistake. Reason (1990) stated that:
Mistakes are deficiencies or failures in the judgemental and/or inferential processes involved in the selection of an object or in the specification of the means to achieve it, irrespective of whether or not the actions directed by this decision-scheme run according to plan.
One type of mistake is known as a rule-based mistake, which involves the inappropriate matching of environmental signs to the situation and applying troubleshooting principles. Wickens and Hollands (2000) add that this type of mistake occurs when operators know, or believe they know the situation, and they invoke a rule or plan of action to deal with it.
While managing emergency or stressful situations, decision errors can occur. One example is when flight crew develop an incorrect interpretation of the situation, which leads to an inappropriate decision (Orasanu, 2010). Within these situations, demands on the flight crew increase (Woods and Patterson 2001). This is because:
- more knowledge potentially needs to be recalled
- there is more information to monitor and consequently, there is a changing set of data to integrate into a coherent assessment
- theories need to be generated and evaluated
- assessment may need to be revised as new data comes in
- actions to protect the safety of systems need to be identified, carried out, and monitored for success
- existing plans need to be modified or new plans formulated to cope with this process.
When evaluating a situation, there are often available cues that can actually leave the crew without a clear idea of the underlying problem (Orasanu, 2010). Various noises, thumps, vibrations, rumblings, pressure changes, or control problems indicate that something has happened, but not necessarily what. The cues signal potentially dangerous conditions that trigger emergency responses, regardless of the source of the problem. In addition, when experiencing stress and high workload, crews are vulnerable to missing important cues related to their situation and are likely to experience difficulty pulling together disparate pieces of information and making sense of them. This is especially true when some of that information is incomplete, ambiguous, or contradictory (Burian, Barshi, and Dismukes, 2005).
In addition, there is the potential effect of confirmation bias. This is a type of bias that describes a tendency for people to seek information and cues that confirm the tentatively held hypothesis or belief, and not seek (or discount) those that support an opposite conclusion or belief (Wickens and Hollands 2000). Incorrectly interpreting cues due to confirmation bias can strengthen that bias, leading to further cues not being correctly interpreted.
The operator’s internal safety investigation process
Safety Management Systems
The International Civil Aviation Organization (ICAO) Annex 19 to the Chicago Convention, Safety Management, defines a safety management system (SMS) as ‘a systematic approach to managing safety, including the necessary organizational structures, accountabilities, policies and procedures’. The implementation framework for an SMS needs to include the conduct of safety investigations. In alignment with ICAO standards, r. 167 of the Civil Aviation Regulations Malaysia requires that service providers, including air operator certificate holders, shall establish an SMS.
One of the key SMS references documented in the operator’s SMS Manual was the Civil Aviation Authority of Singapore’s Advisory Circular AC-1-3(5), Safety Management Systems. This AC advised that organisations should ‘show that the investigation identifies contributing or causal factors, identifies and ensures the implementation of necessary corrective actions’ and that identified controls are implemented.
Purpose of a safety investigation with a systemic approach
The latest edition of ICAO Doc 9859 Safety Management Manual, released in 2018, states that:
The primary objective of the service provider safety investigation is to understand what happened, and how to prevent similar situations from occurring in the future by eliminating or mitigating safety deficiencies. This is achieved through careful and methodical examination of the event and applying the lessons learned to reduce the probability and/or consequence of future recurrences.
ICAO also advocates for a no-blame approach to investigations:
The safety investigation should focus on the identified hazards and safety risks and opportunities for improvement, not on blame or punishment.
ICAO further promotes that service providers’ investigation reports should include clearly defined findings and recommendations designed to eliminate or mitigate identified safety deficiencies.
The operator’s processes and approach to safety investigations
The operator's SMS policy stated that:
Investigations are undertaken to help identify areas of safety deficiencies. When reports are submitted, an investigation process takes place to discover the details of the occurrence. It is through this process that safety deficiencies can be identified and corrected.
The stated objectives of an investigation included not only establishing what happened, but also identifying the local conditions and organisational factors that contributed to the occurrence, reviewing the adequacy of existing system controls and barriers, and formulating recommendations and lessons learned. The objectives also outlined that the operator took a no-blame approach to investigations.
The documented means for AirAsia X investigators to conduct their analysis included the AirAsia Systemic Investigation Analysis (ASIA) method, which was outlined in a Quick Reference Guide. The ASIA process was described as follows:
ASIA is a process for conducting a systemic analysis of data collected during an incident or accident investigation, and for summarising and reporting this information using a structured framework and standard terminology.
ASIA aims to broaden the spotlight from the errors of individuals and to identify factors at all levels of the organisation or broader system that contributed to the safety event. Correct application of the ASIA method will identify systemic safety deficiencies and guide the generation of effective recommendations to prevent recurrence of similar events.
AirAsia X SMS guidance material stated that the ASIA method was based on the Reason accident causation model.
AirAsia X internal investigation report
The operator conducted an internal investigation into the occurrence.
A preliminary report from this investigation, released on 12 September 2016, contained a number of findings, including that the flight crew performed the ‘correct procedure’ by shutting down engine 2 after it had failed, and that the crew perceived that all engine indications and oil quantity were indicating as normal after engine shutdown. On the topic of the engine re-light attempts, the report stated that the flight crew referred to the FCOM, but were unsuccessful with each of the two attempts. With respect to the diversion to Melbourne, the report indicated that Melbourne was more favourable due to considerations around company RFFS requirements, Adelaide being under curfew, and passenger well-being (among others).
There were three safety recommendations, predominantly around needing to share the event with flight crew and the training department for enhancing current training programs. The operator published an update to the report in May 2018. The operator later advised that this updated report was their final report on the occurrence. This report restated the safety recommendations to share the event with flight crew and the training department for enhancing current training programs. It also contained two further recommendations, which concerned the issuing of a flight safety notice and a training bulletin designed to re-emphasise the inflight engine shutdown procedures and policies. The final report also included a finding that the decision by the flight crew to divert to Melbourne was not in accordance with the company policy that requires a landing at the nearest suitable airport when an amber LAND ASAP is displayed.
ATSB investigation AO-2007-035
On 15 August 2007, an Airbus A330-300 aircraft, registered PK-GPF, was about 926 km north-west of Sydney, NSW en route to Denpasar, Indonesia when the right engine low oil pressure warning activated. The flight crew shut down the engine and commenced a descent to FL 240 while they advised ATC of the event and requested a clearance to return to Sydney. ATC cleared the crew to descend and to track directly to Sydney.
The ATSB found that it was probable that the flexible oil pressure transmitter tube fractured as a result of fatigue from the tube not being adequately supported while being subjected to high levels of vibration. As a result of the crack, engine oil leaked to atmosphere, activating the right engine low oil pressure warning. There had been a history of this cracking and the engine manufacturer, Rolls-Royce, had issued a service bulletin to provide additional support for the tube. This had not been incorporated into the occurrence aircraft.
The ATSB investigation report also noted that another service bulletin had been released in 2002 by the engine manufacturer to alleviate false low oil pressure warnings and fluctuating oil pressure readings. These false readings were attributed to intermittent electrical signals resulting from wear of the electrical contact pins on the oil pressure transmitter connectors.
AAIB UK Bulletin 9/2013 Airbus A330-343, G-VKSS 19 January 2013
On 19 January 2013, an Airbus A330, registered G-VKSS, was in the initial climb and passing 530 ft above ground level when it was struck by birds. The birds impacted the fan blades of both the left and right engines. Both engines were damaged, resulting in significant vibration. The left engine was shut down by the crew following an ENG 1 OIL LO PR ECAM message and the engine oil pressure indicated zero. The aircraft returned to the departure aerodrome.
The aircraft was fitted with Trent 700 engines. No defects were identified with the engine oil system. The left engine oil pressure indication was the result of the high engine vibration causing transient negative oil pressure. The oil pressure transducers detected the transient negative oil pressure. When combined with the engine’s electronic controller logic, this caused the system to generate a low oil pressure warning that was locked on until the engine was shut down.
Rolls-Royce advised that there had been seven previous events of high vibration resulting in the generation of a low oil pressure message. Of these, five had resulted in a precautionary shutdown. However, no details were provided as to why the other two did not result in a shutdown.
In 2018 Rolls-Royce introduced a number of modifications to address the issue of ‘Low Oil Pressure’ warnings being spuriously triggered due to high engine vibration. A new oil pressure transmitter with an electronic filter to dampen high frequency/high amplitude measurements from its input was introduced in a January 2018 engine service bulletin. The engine manufacturer also determined that the oil pump failure detection logic in the electronic engine controller software was no longer required. This software function was disabled through an October 2018 engine service bulletin.
- The Aircraft Flight Manual is a certification document attached to the certification of a given aircraft model. It is not an operational document. The FCOM is designed for use by flight crews.
- Available at https://skybrary.aero/bookshelf/books/193.pdf, the FOBN series is designed to provide an overview of specific standards, flying techniques and best practices, with the intent of enhancing safety awareness.
- The Flight Crew Training Manual was an Airbus document fully adopted by the operator.
- And the FIRE push button has not been pressed.
- ICAO Annexes are reliant on member States promulgating their content in national laws, regulations, standards and procedures, and implementing that content.
- Class C airspace is the controlled airspace surrounding major airports. Class D airspace is the controlled airspace that surrounds general aviation and regional airports equipped with a control tower. Uncontrolled airspace is designated as class G.
- A notice to airmen (NOTAM) is a notice containing information concerning the establishment, condition or change in any aeronautical facility, service, procedure or hazard, the timely knowledge of which is essential to personnel concerned with flight operations.
- For background information on ETOPS see Appendix 2: Extended range operations for two-engine turbine aircraft.
- In 2018, DCA Malaysia transitioned to the Civil Aviation Authority of Malaysia.
- Particular aircraft types have minimum RFFS category requirements, which in turn determine the aerodromes they are able to use.
- The basis of the removal of runway pavement requirements was that an aircraft in an emergency is not required to consider runway pavement limitations. The definition, however, identified specific guidance that limited the application of this rule.
- METAR: a routine aerodrome weather report issued at routine times, hourly or half-hourly.
- Dewpoint: the temperature at which water vapour in the air starts to condense as the air cools. It is used, among other things, to monitor the risk of aircraft carburettor icing or the likelihood of fog.
- QNH: the altimeter barometric pressure subscale setting used to indicate the height above mean sea level.
- Ceiling and visibility okay (CAVOK): visibility, cloud and present weather are better than prescribed conditions. For an aerodrome weather report, those conditions are visibility 10 km or more, no significant cloud below 5,000 ft, no cumulonimbus cloud and no other significant weather [see AIP GEN 3.5 – Meteorological services, Section 4 Meteorological reports, paragraph 4.4.1, subparagraph g; and Section 12 Aerodrome weather and forecast decode, paragraph 12.13 CAVOK].
- Automated terminal information service, a continuous broadcast of recorded non-control information in selected high-activity terminal areas.
- A design function, where the diameter section of the shaft is reduced to provide a weak point on the shaft. That weak point enables a controlled site for failure should the pump shaft seize, thereby providing protection from further damage to the engine.
- A minimum N3 speed required before fuel will be introduced into the engine.
- Exhaust-gas temperature, measured immediately downstream of the turbine[s].
- One complete sequence of events making up a portion of the life of the engine. A cycle normally commences with the start-up and concludes with the shutdown of the engine.
- See Emergency and abnormal procedures.
- Pounds per square inch differential.
- See AR-2007-053, Chapter 3.3, at: www.atsb.gov.au/publications/2008/ar2007053/.
While en route from Sydney, NSW to Kuala Lumpur, Malaysia, the oil pressure pump drive shaft for the right engine (engine 2) of an AirAsia X Airbus A330 failed. That shaft failure resulted in the oil pressure in engine 2 dropping rapidly to 0 psi. The aircraft’s electronic centralised aircraft monitor (ECAM) detected the drop in oil pressure and notified the flight crew through the ENG 2 OIL LO PR alert and associated warning signals. In response to the alert, the flight crew reduced the engine’s thrust to idle in accordance with the displayed procedure, but then elected to monitor the engine instead of shutting it down as intended by the procedure. After almost four minutes, the flight crew returned the engine 2 thrust lever to the normal inflight position, resulting in the engine’s thrust increasing. Shortly thereafter, the engine surged a number of times and eventually failed. The flight crew completed the engine failure procedure, shutting the engine down, and initiated a diversion to Melbourne. During the diversion, the flight crew attempted to relight engine 2 twice, the first shortly after the engine failure, and the second just prior to descending into Melbourne. Rolls-Royce determined that the drive shaft failed due to fatigue cracking, but this was an unusual failure that had not been observed previously.
This analysis will examine the flight crew’s:
- response to the oil pressure alert and the subsequent engine failure
- attempted relights of the failed engine
- diversion decision.
The analysis will conclude with a discussion on the internal investigation conducted by the operator.
Oil pressure alert and subsequent engine failure
The oil pressure pump shaft failure resulted in a level 3 failure red alert, and the ECAM message ENG 2 OIL LO PR and associated procedure being displayed to the flight crew. A level 3 alert is associated with a configuration or failure that requires an immediate crew action, as this configuration or failure may alter the safety of flight. After moving the thrust lever to idle, the procedure included a precondition for initiating the next and final step. That precondition required the flight crew to determine whether the ‘warning persists’, and if it did, then to shut the engine down. The flight crew probably interpreted the precondition as a temporal and not a conditional requirement:
- temporal, in that any further action be delayed, an interpretation that could be supported by the Flight Crew Training Manual (FCTM) guidance that urged a bias towards deferring any action that will result in shutting an engine down, and to look beyond the abnormal parameter
- conditional, as is the intent of the Rolls-Royce guidance to Airbus on the procedural design.
The flight crew elected to leave the engine at idle, and instead undertook further analysis of the engine indications. The flight crew’s actions point to two human performance issues in the conduct of the checklist:
- potential ambiguity in the checklist’s construction
- error in the flight crew’s performance, as a result of their mistaken belief as to the source of the level 3 alert.
Ambiguity in the checklist language
The engine manufacturer’s engine operating instructions clearly identified that the procedural response to the OIL LO PR message required the engine to be shut down if the message had not cleared when the idle setting had been achieved. The use of the word ‘persists’ in the precondition for completing the procedure in the Airbus ECAM procedure introduced the possibility of misinterpretation of the required precondition by the flight crew.
Effective communication, which includes all transfer of information whether spoken or written, is essential for the safe operation of flight. The quality and effectiveness of communication is determined by its intelligibility, that is, the degree to which the receiver understands the intended message. Individual words can have multiple meanings, while sentence construction and context can further complicate understanding.
While English is the international language of the aerospace industry, English is often not the native language of flight crew. An example of an attempt by an industry organisation to promote clarity in the use of English in technical documents is the ASD-STE100. That document highlights the potential ambiguity of the verb ‘persists’. Airbus have also identified that an internal lexicon is used in the construction of ECAM language. That lexicon does not appear to be publicly available.
The flight crew’s action in monitoring the engine parameters after partially completing the required procedure is a strong indicator of a misunderstanding in the language used for the precondition description.
Error in the conduct of the ENG 2 OIL LO PR procedure
Airbus pointed out that procedural non-compliance can be a function of understanding the procedure and its meaning. While the supporting explanation for the ENG OIL LO PR alert in the flight crew operations manual (FCOM) was limited, the ECAM was a level 3 alert. A level 3 or red alert denotes a ‘system failure that alters flight safety and requires immediate action’, and the associated procedure is one ‘which may result in personal injury or loss of life if not carefully followed’.
The flight crew’s response, however, should also be assessed in light of the FCTM’s guidance on shutting an engine down as well as the apparent miscomprehension of the ‘persist’ component of the procedure. The FCTM guidance stated that a flight crew should keep the engine operating unless the procedure required an engine shutdown. Instead of shutting the engine down, the flight crew mistakenly continued monitoring the condition of engine 2.
While monitoring the engine, the flight crew developed a belief that the zero oil pressure readings were due to a gauge error, as all other engine parameters were interpreted as being normal. This led to a mistaken understanding that the alert was a false indication. The flight crew subsequently increased engine 2’s thrust. The increase in thrust resulted in the engine surging then stalling, which triggered the ENG 2 STALL alert. The flight crew correctly actioned this new ECAM procedure, retarding the thrust lever. About 30 seconds later, however, the engine surged again and failed. The engine failing triggered the ENG 2 FAIL alert.
The flight crew’s action of increasing the thrust on engine 2 following the ENG 2 OIL LO PR alert led to the engine stalling and then failing. While it is likely that continued operation of the engine at idle thrust with zero oil pressure would have eventually resulted in sufficient bearing damage that would lead to stalls and engine failure, the increase in thrust accelerated that result and probably increased the damage to the engine.
Finally, as identified by the United Kingdom’s Air Accidents Investigation Branch bulletin 9/2013, there had been a recent series of false low oil pressure ECAM alerts on Rolls-Royce-engined A330 aircraft. It is not known whether the flight crew were aware of these, or any subsequent Airbus notifications concerning this issue.
Attempted relights of the failed engine
The flight crew stated that, following the engine failure, they then actioned the ENG 2 FAIL procedure. That procedure included a restart (relight) attempt, and then required a decision about whether the engine was damaged. If the engine was deemed not to be damaged, the procedure prompted the flight crew to consider a further relight attempt. The conditions indicative of engine damage were contained within the FCOM, but not displayed on the ECAM. The flight crew later reported that they consulted the FCOM, following which they determined that the engine was not damaged. This may have been influenced by confirmation bias as there was evidence available to the crew that met the criteria for damage, including:
- repeated or uncontrollable engine stalls—the engine experienced two surges/stalls. The ENG 2 STALL message alerted the flight crew to the first stall, to which they responded in accordance with the required procedure, and from which the engine recovered. The second stall was uncontrolled and resulted in the engine failing. The flight crew stated that they did not recognise the two stall events as being separate; however, they stated that they briefly saw the engine stall message and that the engine then failed. By this account, the engine experienced an uncontrollable engine stall.
- abnormal engine indications—the guidance cites hydraulic fluid loss, no N2 or N3 indication as examples. The flight crew stated that they assessed the oil pressure indications as being abnormal after they had completed the initial actions for the ENG OIL LO PR alert.
Further, the FCOM version of the ENG 2 FAIL procedure stated, at the end of the procedure, that the engine should be shut down if a relight was unsuccessful.
About 10 minutes after shutting the engine down, the flight crew attempted to relight the engine. Prior to this attempt, the first following the engine shut down, there were a number of factors that should have alerted the flight crew that there was a problem with engine 2 and not to attempt a relight:
- The engine had an oil pressure issue, the source of which had not been clarified.
- The engine had experienced a stall, then failed.
- The engine indications were not normal.
- The operations manual stated that a failed engine should not be restarted if the reason for the failure cannot be clearly identified.
- The relight attempt that was part of the ENG FAIL procedure was not successful, and the subsequent ENG SHUT DOWN procedure did not include the option of a relight.
There was no safety risk to the aircraft that demanded a relight attempt, and there was significant contextual evidence related to that engine that should have created doubt about the engine. The flight crew’s first relight attempt was unsuccessful. They stated that the relight procedure was terminated as a result of the ENG 2 START FAULT message being displayed.
The flight crew commenced a second relight attempt of the failed engine just prior to descending into Melbourne. In addition to the above factors that should have alerted the flight crew that there was a problem with engine 2, there was the additional evidence of a failed restart attempt and the ENG 2 START FAULT message. For this restart, the engagement of the starter motor and the airspeed were sufficient to enable fuel to be introduced into the engine. While the engine relit shortly after, the Rolls-Royce report identified that either prior to, or as a result of, the second relight attempt, the damage to the engine was sufficient to ensure that the engine could not achieve rpm for sustained operation. Further, the flight crew stated that the relight attempt was terminated as a result of vibrations from the engine.
The multiple failures and abnormal indications associated with engine 2 should have raised questions for the flight crew on the cause of the failures and the viability of the engine. The operator’s procedures stated that, if the reason for an engine failure cannot be clearly identified, then it shall not be restarted unless a greater emergency exists. A greater emergency did not exist. Contrary to the operator’s procedures, the flight crew attempted two restarts on the failed engine.
A factual synopsis
Following the first engine surge the flight crew declared an emergency and notified ATC of the event, requested descent, and indicated an initial intention to divert to Melbourne. ATC cleared the aircraft to divert to Melbourne and descend. Following the second surge and subsequent engine failure, the flight crew confirmed the intent to divert to Melbourne. A few minutes later, the flight crew contacted maintenance support, who stated a preference for Melbourne as the diversion target due to better support, but stated that the decision was the aircraft captain’s. Shortly after, the flight crew requested advice on the runway in use in Adelaide, and whether the Adelaide curfew was in force. ATC responded with the runway, and that the curfew would be waived if the flight crew declared an emergency. The flight crew did not respond to this advice.
From the flight crew’s perspective, the following considerations influenced the diversion decision:
- The minimum RFFS requirement for a diversion airport was CAT 7.
- Adelaide was not preferred due to the curfew and RFFS status being below the required CAT 7 and the flight crew being more familiar with Melbourne.
- Melbourne provided the best option due to RFFS and maintenance support.
- Due to the close proximity of Adelaide to the intended diversion route, it offered a diversion option should a subsequent emergency require an immediate diversion.
- The final decision to divert to Melbourne could be delayed until an equal time point between Melbourne and Adelaide.
- The flight crew’s stated belief that the emergency was controlled, which formed the basis for not using Alice Springs.
From an inflight procedural perspective, the diversion following an engine failure required consideration of a number of criteria:
- The flight was ETOPS, but the operator advised that ETOPS policy and procedure did not apply, as the aircraft had not yet entered an ETOPS segment.
- The OMA required response to an engine failure during normal (non-ETOPS) operations was to divert to the nearest suitable aerodrome. The criteria for a suitable aerodrome included a minimum RFFS of CAT 7.
- The OMA required response to a LAND ASAP ECAM message, which was part of the ETOPS procedural section, was to land at the nearest ETOPS suitable aerodrome. The ETOPS suitable aerodrome criteria required a minimum RFFS of CAT 4.
- Alice Springs, which was nominated in the operational flight plan (OFP) as an en route alternate, had no RFFS capability during the required period of use, but the operator stated that it met the adequate aerodrome requirements through municipal fire services that were located off-aerodrome.
- Adelaide, which was not nominated as an en route alternate but was listed as a company preferred alternate, had RFFS CAT 5 with the capacity to be CAT 9 with one hour’s notice. Adelaide also had a curfew operating during the flight, but an in-flight emergency was sufficient to override those curfew restrictions.
- At the time of the engine failure, the aircraft was within the 60‑minute zone from a company approved adequate en route aerodrome (Alice Springs) and had not entered an ETOPS segment.
- The OMA-stated criteria for determining whether an aerodrome was suitable for diversion did not include commercial criteria such as the availability of maintenance.
- The Airbus FCOM definition for an amber LAND ASAP included that the flight crew consider landing at the nearest suitable aerodrome.
- At no point during the diversion did the aircraft exit the 60-minute zone from a company approved alternate.
‘ETOPS’/‘Flight’ policy and procedures
The OMA ETOPS policies and procedures section contained statements indicating that they applied to the diversion:
- The section commenced with a statement that the ETOPS section applied over and above standard operations policy when operating on specified ETOPS routes. The aircraft was operating on an ETOPS route that was authorised by the Malaysian regulator.
- The ETOPS definition section stated that ETOPS operations apply to all twin-engine aircraft over a route that contains a point further than 60 minutes flying time from an adequate airport. The aircraft was flying on a route that met this description.
Further, the structure of the ETOPS section—in that it contained policies and procedures that specifically required certain preflight checks, immediate actions post departure, and actions before entry into an ETOPS segment—indicated that these policies and procedures were applicable outside of the ETOPS segment. The ETOPS section contained a requirement to divert to the nearest suitable airport when a LAND ASAP message was displayed. The ETOPS section contained specific criteria for this type of airport. Those criteria included a minimum RFFS of CAT 4. Therefore, under these policies and procedures, the available diversion airports were Alice Springs, Adelaide and Melbourne.
The operator stated that the OMA ETOPS policies and procedures did not apply for the diversion, as the aircraft had not entered an ETOPS segment. Therefore, policies and procedures contained within the OMA ‘Flight procedures’ section applied. Those policies and procedures specifically required diversion to the nearest suitable aerodrome following an engine failure. An engine failure will result in a LAND ASAP message if the engine is not restarted. The criteria for a suitable aerodrome were determined by the ‘Flight planning’ section, which stipulated a minimum RFFS of CAT 7. The operator stated that this could be reduced to CAT 6 on authorisation of the Flight Operations Director. This was the case with Alice Springs. Therefore, under these policies and procedures, the available diversion airports were Alice Springs and Melbourne.
The flight crew stated that the ‘Flight procedures’ minimum RFFS requirement of CAT 7 was the basis of the decision to not use Adelaide. The effect was to limit the airports available for a diversion. That is, a variation in required RFFS CAT for an en route alternate airport available to be used for diversion was based on whether or not the aircraft had passed a specific point in the flight plan—a point that had only a marginal relationship to the actual flight conditions. If an emergency condition enables the flight crew to use an aerodrome with a lower RFFS CAT after passing this point, there is no reason the same RFFS CAT cannot be used before passing that point. The difference is not based in safety.
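The effect of the two minimum RFFS values on the set of available diversion airports can be sketched as a simple filter. This is an illustration only: the class and field names are hypothetical, Alice Springs is modelled as an aerodrome the operator accepted despite having no on-aerodrome RFFS during the period of use, and the Melbourne category is a placeholder (the text implies only that it met CAT 7).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aerodrome:
    name: str
    rffs_cat: int                    # RFFS category during the period of use
    operator_accepted: bool = False  # operator treated it as meeting the minimum


AERODROMES = [
    # No RFFS during the period of use, but accepted by the operator.
    Aerodrome("Alice Springs", 0, operator_accepted=True),
    # CAT 5 without notice, as stated in the report.
    Aerodrome("Adelaide", 5),
    # ASSUMPTION: placeholder category, not stated in the report.
    Aerodrome("Melbourne", 9),
]


def available(min_cat: int) -> list[str]:
    """Names of aerodromes that meet the minimum RFFS category
    or that the operator has accepted regardless."""
    return [a.name for a in AERODROMES
            if a.rffs_cat >= min_cat or a.operator_accepted]


etops_options = available(4)              # ETOPS section minimum
flight_procedures_options = available(7)  # 'Flight procedures' minimum
```

Under these assumptions the CAT 4 minimum leaves Alice Springs, Adelaide and Melbourne available, while the CAT 7 minimum removes Adelaide, which matches the two outcomes described above.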
The safety aspect of the diversion to Melbourne
Both the normal and the ETOPS procedures in the OMA required a diversion to the nearest suitable aerodrome following an engine failure. The captain stated that the decision to bypass Adelaide for Melbourne was based on Adelaide not having the required RFFS, but a commercial advantage was also cited. Alice Springs was not considered because the flight crew believed that the emergency was controlled.
Regarding the controlled emergency, shortly after the engine failed the flight crew attempted a restart. That restart failed. The aircraft was then limited to operating on a single engine. While it is argued that ETOPS did not apply in this instance, the development of ETOPS is instructive in considering the safety effect of the loss of an engine on twin turbine-engine aircraft. The increased and increasing reliability of turbine engines has resulted in the capacity of this aircraft type to fly further from an en route airport that is available for use in the event of an engine failure—that is the increasing ETOPS range of these aircraft. The basis for certifying this increasing ETOPS range is the likelihood of a catastrophic result from engine failure—that is the likelihood of a double engine failure. The statistical probability of a double engine failure is materially less than the probability assigned to a catastrophic accident. The point to note, however, is that following an engine failure, the risk to the aircraft is the statistical likelihood of the remaining engine failing. This is an elevated risk in comparison to normal operations. While certification identifies that single engine operations can be achieved out to the limit of the ETOPS range, safety would indicate that the earliest landing is the safest option. As stated in the operator’s OMA, in all cases involving an engine failure the requirement is to land at the nearest suitable airport. Other risk factors can affect this decision; however, commercial considerations are not included.
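The link between time spent on one engine and risk can be made concrete with a small sketch. It assumes a constant failure rate for the remaining engine, so that the probability of failure over an exposure time t is 1 − exp(−λt). The rate and the two flight times are hypothetical values chosen only for illustration. They are not certification figures and are not taken from this occurrence.

```python
import math


def failure_probability(rate_per_hour: float, hours: float) -> float:
    """Probability that the remaining engine fails within `hours`,
    assuming a constant failure rate (exponential model)."""
    # expm1 keeps precision when rate * hours is very small.
    return -math.expm1(-rate_per_hour * hours)


ASSUMED_RATE_PER_HOUR = 1e-5  # hypothetical rate, for illustration only

p_nearer = failure_probability(ASSUMED_RATE_PER_HOUR, 1.0)   # e.g. nearer aerodrome
p_further = failure_probability(ASSUMED_RATE_PER_HOUR, 2.0)  # e.g. further aerodrome
```

For small rates the probability is almost proportional to exposure time, so under this model doubling the single-engine flight time roughly doubles the risk.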
The nearest suitable airport at the time of the engine failure was Alice Springs. The operator’s internal investigation stated that the diversion did not meet the policy and procedures requirement to divert to the nearest suitable alternate. Communications records and flight crew statements indicate that the flight crew intended to divert to Melbourne from the initiation of the diversion. Alice Springs was available and was nominated in the OFP as an en route alternate airport. Adelaide was a company preferred alternate, and while it was subject to curfew restrictions and RFFS limitations, these were not an impediment to its use as a diversion airport. In conclusion, the diversion to Melbourne resulted in an increase in the time that the flight was exposed to the higher risk environment of single engine operations.
From the evidence available, the following findings are made with respect to the engine failure and diversion of an Airbus A330, registered 9M-XXD, that occurred 445 km south-east of Alice Springs, in South Australia, on 16 August 2016. These findings should not be read as apportioning blame or liability to any particular organisation or individual.
- In response to an engine oil low pressure (ENG OIL LO PR) ECAM, resulting from a fractured shaft within the oil pressure pump, the flight crew continued to monitor the engine parameters instead of shutting the engine down. Due to a mistaken understanding that the alert was a false indication, the flight crew subsequently increased thrust.
- The Airbus A330 engine oil low pressure (ENG OIL LO PR) abnormal procedure included the conditional instruction ‘if the condition persists’. This may be interpreted either as requiring the flight crew to wait a certain period of time to determine whether the condition continues, as apparently interpreted by the flight crew, or, as intended by Airbus, as meaning that the condition has not changed as a result of the previous procedural step.
- Contrary to operating procedures, the flight crew made two attempts to relight the failed engine.
- The crew diverted to Melbourne instead of the nearest suitable aerodrome. This increased the time that the flight was exposed to the higher risk environment of single engine operations.
Additional safety actions
Whether or not the ATSB identifies safety issues in the course of an investigation, relevant organisations may proactively initiate safety action in order to reduce their safety risk. The ATSB has been advised of the following proactive safety action in response to this occurrence.
Training and guidance
As part of a flight safety notice to flight crew, the operator, AirAsia X, emphasised the general rule regarding restarts following an engine failure. The rule, contained within the Operations Manual, stated that a failed engine should not be restarted when the cause of that failure is unknown or there is insufficient information to determine the cause. That flight safety notice also reiterated the requirement to land at the nearest suitable airport when a LAND ASAP notification is displayed.
The operator also developed a training package to be presented at the flight crews’ annual base check classroom review. That package, based on the occurrence, was developed to identify lessons learnt from the oil pressure loss and subsequent engine failure, the engine restarts and the diversion decision making.
Captain details
- Air Transport (Aeroplane) Licence, issued August 2007
- Pilot in command, A330
- Approximately 8,700 hours
- Last flight review:
First officer details
- Air Transport (Aeroplane) Licence, issued September 2014
- Approximately 3,265 hours
- Last flight review:
Sources and submissions
Sources of information
The sources of information during the investigation included:
- the flight crew
- AirAsia X
- Airbus Industrie
- Civil Aviation Safety Authority (CASA)
- Civil Aviation Authority Malaysia (CAAM)
- Airservices Australia
- Air Accident Investigation Bureau, Malaysia (AAIB MY)
- Bureau d'Enquêtes et d'Analyses pour la Sécurité de l'Aviation Civile (BEA)
- Air Accident Investigations Branch, United Kingdom (AAIB UK).
AeroSpace and Defence Industries Association of Europe (ASD) 2013, Simplified Technical English, Specification ASD-STE100, Issue 6.
Airbus 2007, ‘Compliance to Operational Procedures – Why do well trained and experienced pilots not always follow procedures?’, Safety First, Issue #05, December 2007, pp. 20-23.
Burian, B.K., Barshi, I. and Dismukes, K. 2005, The challenge of aviation emergency and abnormal situations, NASA report (213462), Moffett Field.
Civil Aviation Safety Authority (CASA) 2014, SMS for Aviation: A Practical Guide, SMS 4 Safety assurance, 2nd Ed, CASA.
International Civil Aviation Organization (ICAO) 2016, Annex 6 Part I – International Commercial Air Transport - Aeroplanes, 10th Ed, ICAO, Montreal.
International Civil Aviation Organization (ICAO) 2016, Annex 14 Aerodrome Design and Operations, 7th Ed, ICAO, Montreal.
International Civil Aviation Organization (ICAO) 2013, Annex 19 Safety Management, 1st Ed, ICAO, Montreal.
International Civil Aviation Organization (ICAO) 2013, Document 9859 Safety Management Manual, 3rd Ed, ICAO, Montreal.
International Civil Aviation Organization (ICAO) 2018, Document 9859 Safety Management Manual, 4th Ed (advanced unedited), ICAO, Montreal.
Li, Y. and Guldenmund, F.W. 2018, ‘Safety management systems: A broad overview of the literature’, Safety Science, 103, pp. 94-123.
Orasanu, J. 2010, ‘Flight crew decision-making’, in Kanki, B., Helmreich, R. and Anca, J. (eds), Crew resource management, 2nd Ed, Academic Press, San Diego.
Reason, J. 1990, Human error, Cambridge University Press, Cambridge.
Reason, J.T., Carthey, J. and De Leval, M.R. 2001, ‘Diagnosing “vulnerable system syndrome”: an essential prerequisite to effective risk management’, BMJ Quality & Safety, 10(suppl 2), pp. ii21-ii25.
Wickens, C.D. and Hollands, J.G. 2000, Engineering psychology and human performance, 3rd Ed, Prentice-Hall, New Jersey.
Woods, D.D. and Patterson, E.S. 2001, ‘How unexpected events produce an escalation of cognitive and coordinative demands’, in Hancock, P.A. and Desmond, P.A. (eds), 2000, Stress, Workload and Fatigue, Lawrence Erlbaum Associates, Hillsdale, NJ.
Under Part 4, Division 2 (Investigation Reports), Section 26 of the Transport Safety Investigation Act 2003 (the Act), the Australian Transport Safety Bureau (ATSB) may provide a draft report, on a confidential basis, to any person whom the ATSB considers appropriate. Section 26 (1) (a) of the Act allows a person receiving a draft report to make submissions to the ATSB about the draft report.
A draft of this report was provided to the flight crew, AirAsia X, Airbus, Rolls-Royce, CASA, CAAM, AAIB MY, AAIB UK, and BEA.
Submissions were received from AirAsia X, CAAM, AAIB MY, Rolls-Royce, Airbus, BEA and CASA. The submissions were reviewed and, where considered appropriate, the text of the report was amended accordingly.
Appendix 1: Malaysian legislation and regulation
The principal legislation covering civil aviation in Malaysia is the Civil Aviation Act 1969 (Malaysia). The act established the position of the Director General of Civil Aviation Malaysia (DGCA) and enabled the Minister to make regulations for the purposes of carrying out the objects and purposes of the act, and for carrying out the Chicago Convention and its Annexes. That regulation is the Civil Aviation Regulations 2016 (Malaysia) (CARM). The act, regulations and other subordinate laws applied extraterritorially to Malaysian aircraft and flight crew. The act and regulations also enabled the DGCA to issue notices, circulars, directions or information as required.
The CARM required:
- an operator to submit an operations manual to the Director General for approval
- EDTO operations to be approved by the Director General, and for the Director General to establish the threshold time for an en route alternate.
The Aeronautical Information Circular (AIC) 09/2000 provided detail on the regulation of ETOPS and also applied to EDTO operations. The AIC included:
- a method for approving ETOPS, which referenced ICAO Annex 6
- a requirement that nominated adequate aerodromes be accepted by DCA
- a requirement that en route alternate aerodromes required for ETOPS be included within the operational and air traffic services flight plans.
DCA published a number of Directives during 2016. These directives contained rules, standards, requirements and procedures pertaining to air operations. The following directives, which came into effect on 15 April 2016, were applicable:
- Rules of the Air
- specific approvals (SPA).
The rules of the air applied extraterritorially for Malaysian aircraft, ‘to the extent that they do not conflict with the rules published by the State having jurisdiction over the territory overflown’. The rules contained the Annex 6 definition for an alternate aerodrome.
The SPA contained a number of requirements that reflected the content of AIC 09/2000. En route alternate aerodrome requirements were also listed in the SPA, and stated the following:
- An EDTO en route alternate aerodrome shall be considered adequate, if, at the expected time of use, the aerodrome is available and equipped with necessary ancillary services such as air traffic services (“ATS”), sufficient lighting, communications, weather reporting, navigation aids and emergency services and has at least one instrument approach procedure available.
- Prior to conducting an EDTO flight, the operator shall ensure that an EDTO en route alternate aerodrome is available, within either the operator’s approved EDTO, or a diversion time based on the MEL generated serviceability status of the aeroplane, whichever is shorter.
- The operator shall specify any required EDTO en route alternate aerodrome(s) in the operational flight plan and ATS flight plan.
DCA published two other directives on 12 September 2016. While coming into effect shortly after the occurrence, they contain information relevant to current EDTO requirements. These directives were:
- organisation requirements for air operations (ORO)
- commercial air transport (CAT).
The ORO required an operator to obtain a route qualification from DCA, which included that the route have sufficient aerodromes that are properly equipped for the operation. That requirement was expanded by the following:
The operator must ensure that relevant RFFS category must be met for applicable aircraft at the airport of operation. However certain countries adopted its own requirement especially for EDTO operation. Therefore an operator may exercise its requirement whenever applicable if operating into their Flight Information Region (FIR).
Appendix 2: Extended range operations for two-engine turbine aircraft
Prior to the 1980s, regulation limited twin-engine aircraft to operations no more than 60 minutes range from an aerodrome. The introduction, in the 1980s, of twin turbine engine aircraft capable of long-range operations resulted in the need to develop a method of extending this range limitation, while at the same time ensuring that an equivalent level of safety as that established by aircraft with more than two engines was attained. The result was a system of engine/airframe certification requirements, established engine reliability, and maintenance and operational procedures that created the necessary level of safety. The system was known as extended range operations, or ETOPS. From an ETOPS certification perspective, manufacturers were required to establish that the risk associated with the loss of both engines from independent causes met the required level of safety. From an operational procedural perspective, the risk associated with single engine operations was minimised through, among other things, the identification of, and procedural requirements attached to, the use of ETOPS specific en route alternate aerodromes. As a result of this system, twin turbine engine aircraft have progressively extended the time that they can operate from an en route alternate aerodrome, the diversion time, to more than 4 hours.
International standards for ETOPS
Standards applicable to ETOPS were contained in ICAO Annex 6 to the Chicago Convention, titled Part I – International Commercial Air Transport - Aeroplanes. The following requirements under these standards were applicable:
- The state of the operator was required to issue an air operator’s certificate (AOC) to an operator. That AOC was to include operations specifications. An ETOPS authorisation, threshold time and maximum diversion time limits were to be stated in those operations specifications.
- The operator was to submit an operations manual to the state of the operator for acceptance, and where required, approval. That operations manual was to:
  - detail the required content of the OFP
  - specify procedures for ETOPS, including engine failure procedures
  - nominate and determine the utilisation of diversion aerodromes
  - state the level of rescue and fire fighting services (RFFS) deemed acceptable for operations.
- An operator was to prepare an OFP for all intended flights. In preparing that OFP, the operator was to:
  - ascertain that all required ground facilities were adequate for the operation, including that the aerodromes specified in the OFP had adequate RFFS
  - identify and state in the OFP the en route alternate aerodromes required for ETOPS.
Annex 6 also contained guidance for assessing what level of RFFS was acceptable for the various operational uses of aerodromes. An Airbus A330 aircraft required a base level RFFS of CAT 9. The recommended RFFS category for an en route alternate aerodrome was two levels below that base level; therefore the A330 required RFFS CAT 7 for an en route alternate aerodrome. Annex 6 also recommended that an ETOPS alternate aerodrome should have a minimum of CAT 4.
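The category arithmetic in the paragraph above can be sketched as follows. The function and constant names are illustrative only; the two-level reduction and the CAT 4 minimum are the Annex 6 guidance values as quoted in this appendix, and the Adelaide value of CAT 5 is the one reported in the analysis.

```python
EN_ROUTE_ALTERNATE_REDUCTION = 2  # levels below the aircraft's base category
ETOPS_ALTERNATE_MIN_CAT = 4       # recommended minimum for an ETOPS alternate


def en_route_alternate_cat(base_cat: int) -> int:
    """Recommended RFFS category for an en route alternate aerodrome."""
    return base_cat - EN_ROUTE_ALTERNATE_REDUCTION


A330_BASE_CAT = 9
a330_en_route_cat = en_route_alternate_cat(A330_BASE_CAT)

# Adelaide's reported category (CAT 5) checked against each minimum.
adelaide_cat = 5
meets_en_route_minimum = adelaide_cat >= a330_en_route_cat
meets_etops_minimum = adelaide_cat >= ETOPS_ALTERNATE_MIN_CAT
```

With these values the A330 en route alternate category works out to CAT 7, so a CAT 5 aerodrome satisfies the ETOPS alternate minimum but not the en route alternate recommendation.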
Flight Crew Operating Manual (FCOM) guidance on ETOPS
The special operations section of the FCOM contained general guidance on ETOPS, as well as specific ETOPS diversion procedures and profiles. It identified that the system design and the engine reliability met with applicable European Aviation Safety Agency (EASA) guidelines, as set forth in AMC 20-6 rev. 2 (EASA).
With respect to diversion decision making, AMC 20-6 contained the following guidance:
Factors to be considered when deciding upon the appropriate course of action and suitability of an aerodrome for diversion may include but are not limited to:
a. Aircraft configuration/weight/systems status;
b. Wind and weather conditions en route at the diversion altitude;
c. Minimum altitudes en route to the diversion aerodrome;
d. Fuel required for the diversion;
e. Aerodrome condition, terrain, weather and wind;
f. Runways available and runway surface condition;
g. Approach aids and lighting;
h. RFFS* capability at the diversion aerodrome;
i. Facilities for aircraft occupants - disembarkation & shelter;
j. Medical facilities;
k. Pilot’s familiarity with the aerodrome;
l. Information about the aerodrome available to the flight crew.
Contingency procedures should not be interpreted in any way that prejudices the final authority and responsibility of the pilot-in-command for the safe operation of the aeroplane.
Note: for an ETOPS en-route alternate aerodrome, a published RFFS category equivalent to ICAO category 4, available at 30 minutes notice, is acceptable.
|Date|16 August 2016|
|Time|2310 CST|
|Location|Alice Springs 445 km SE|
|State|South Australia|
|Release date|19 December 2019|
|Report status|Final|
|Investigation status|Completed|
|Investigation level|Systemic|
|Investigation phase|Final report: Dissemination|
|Occurrence type|Engine failure or malfunction|
|Occurrence category|Serious Incident|
|Highest injury level|None|
|Type of operation|Air Transport High Capacity|
|Damage to aircraft|Nil|
|Departure point|Sydney, NSW|
|Destination|Kuala Lumpur, Malaysia|