How Risk Assessment Fails

This entry is part 2 of 8 in the series Risk Assessment

[Image: Fukushima Dai Ichi Power Plant after the explosions]

The events unfolding at Japan’s Fukushima Dai Ichi Nuclear Power plant are a case study in the ways that the risk assessment process can fail or be abused. In his article ‘Disaster Caps Faked Reports’, Jason Clenfield itemizes decades of fraud and failures in engineering and administration that have led to the catastrophic failure of four of six reactors at the 40-year-old Fukushima plant. The article goes on to cover similar failures elsewhere in the Japanese nuclear sector.

Most people believe that the more serious the public danger, the more carefully the risks are considered in the design and execution of projects like the Fukushima plant. Clenfield’s article points to failures by a number of major international businesses involved in the design and manufacture of components for these reactors, failures that may have contributed to the catastrophe playing out in Japan. In some cases, taking the correct action could have bankrupted the companies involved, so rather than risk financial failure, the problems were covered up and the workers involved were rewarded for their efforts. As you will see, the degree of care that we have a right to expect is not always the level of care that is used.

How does this relate to the failure and abuse of the risk assessment process? Read on!

Risk Assessment Failures

[Image: Earthquake and tsunami damage, Fukushima Dai Ichi Power Plant]

The Fukushima Dai Ichi nuclear plant was constructed in the late 1960s and early 1970s, with Reactor #1 going on-line in 1971. The reactors at this facility use ‘active cooling’, which requires electrically powered cooling pumps to run continuously to keep the core temperatures in the normal operating range. As you will have seen in recent news reports, the plant is located on the shore and draws water directly from the Pacific Ocean.

Learn more about Boiling Water Reactors used at Fukushima.

Read IEEE Spectrum’s “24-Hours at Fukushima”, a blow-by-blow account of the first 24 hours of the disaster.

Japan is located along one of the most active fault lines in the world, with plate subduction rates exceeding 90 mm/year. Earthquakes are so commonplace in this area that the Japanese people consider Japan to be the ‘land of earthquakes’, and earthquake safety training starts in kindergarten.

Japan is the country that created the word ‘tsunami’, because the effects of sub-sea earthquakes often include large waves that swamp the shoreline. These waves affect all countries bordering the world’s oceans, but are especially prevalent where strong earthquakes are frequent.

In this environment it would be reasonable to expect that earthquake and tsunami effects would receive the highest priority when assessing the risks at the plant. Remembering that risk is a function of severity of consequence and probability, the risk assessed from earthquake and tsunami should have been critical. Loss of cooling can result in the catastrophic overheating of the reactor core, potentially leading to a core meltdown.
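To make the idea of risk as a function of severity and probability concrete, here is a minimal sketch of a simple risk-matrix calculation. The five-point scales, the category names, the multiplication rule and the thresholds are all assumptions chosen for illustration; they are not taken from any method used at the plant, and real risk assessment methods define their own scales.

```python
# Hypothetical ordinal scales (assumptions for illustration only).
SEVERITY = {"negligible": 1, "minor": 2, "serious": 3, "major": 4, "catastrophic": 5}
PROBABILITY = {"remote": 1, "unlikely": 2, "possible": 3, "likely": 4, "frequent": 5}


def risk_score(severity: str, probability: str) -> int:
    """Combine severity and probability into a single ordinal score."""
    return SEVERITY[severity] * PROBABILITY[probability]


def risk_level(score: int) -> str:
    """Map a score to a risk level using assumed thresholds."""
    if score >= 15:
        return "critical"
    if score >= 8:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


# A catastrophic consequence that is merely 'possible' already scores as critical.
print(risk_level(risk_score("catastrophic", "possible")))
# Even a 'remote' probability does not bring a catastrophic severity down to 'low'.
print(risk_level(risk_score("catastrophic", "remote")))
```

With these assumed scales, a hazard with catastrophic severity never rates as low risk, however small the probability appears.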

The Fukushima Dai Ichi plant was designed to withstand 5.7 m tsunami waves, even though a 6.4 m wave had hit the shore close by 10 years before the plant went on-line. The wave generated by the recent earthquake was 7 m. Although the plant was not washed away by the tsunami, the wave created another problem.

Now consider that the reactors require constant forced cooling using electrically powered pumps. The backup generators, installed to ensure that the cooling pumps remain operational even if the mains power to the plant is lost, were located in a basement subject to flooding. When the tsunami hit the seawall and spilled over the top, the floodwaters poured into the backup generator room, knocking out the diesel backup generators. The cooling system stopped. With no power to run the pumps, the reactor cores began to overheat. Although the reactors survived the earthquakes and the tsunami, without power to run the pumps the plant was in trouble.

Learn more about the accident.

Clearly there was a failure of reason when assessing the risks related to the loss of cooling capability in these reactors. With systems as mission critical as these, multiple levels of redundancy beyond a single backup system are often the minimum required.

In another plant in Japan, a section of piping carrying superheated steam from the reactor to the turbines ruptured, injuring a number of workers. The pipe was installed when the plant was new and had never been inspected since installation because it was left off the safety inspection checklist. This is an example of a failure that resulted from blindly following a checklist without looking at the larger picture. There can be no doubt that someone at the plant noticed that other pipe sections were inspected regularly but that this particular section was skipped, yet no changes in the process resulted.

Here again, the risk was not recognized even though it was clearly understood with respect to other sections of pipe in the same plant.

In another situation at a nuclear plant in Japan, drains inside the containment area of a reactor were not plugged at the end of the installation process. As a result, a small spill of radioactive water was released into the sea instead of being properly contained and cleaned up. The risk was well understood, but the control procedure for this risk was not implemented.

Finally, the Kashiwazaki Kariwa plant was constructed along a major fault line. The designers used figures for the maximum seismic acceleration that were one third of the accelerations that the fault could produce. Regulators permitted the plant to be built even though the relative weakness of the design was known.

Failure Modes

I believe that there are a number of reasons why these kinds of failures occur.

People have a difficult time appreciating the meaning of probability. Probability is a key factor in determining the degree of risk from any hazard, yet when figures like ‘1 in 1000’ or ‘1 × 10⁻⁵ occurrences per year’ are discussed, it’s hard for people to truly grasp what these numbers mean. Likewise, when more subjective scales are used, it can be difficult to really understand what ‘likely’ or ‘rarely’ actually mean.

Consequently, even in cases where the severity may be very high, the risk related to a particular hazard may be neglected: the probability seems low, so the risk is deemed to be low.

When probability is discussed in terms of time, a figure like ‘1 × 10⁻⁵ occurrences per year’ can make the chance of an occurrence seem distant, and therefore less of a concern.
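One way to make a per-year figure feel less distant is to convert it into the chance of at least one occurrence over the whole exposure period. The sketch below assumes a constant annual probability and independence from year to year, which are simplifications, and it treats ‘1 in 1000’ as an annual probability purely for illustration. The function name and the fleet size of 400 reactors are hypothetical.

```python
def prob_at_least_once(annual_probability: float, years: float) -> float:
    """Chance of one or more occurrences over the exposure period.

    Assumes a constant annual probability and independent years.
    """
    if not 0.0 <= annual_probability <= 1.0:
        raise ValueError("annual_probability must be between 0 and 1")
    return 1.0 - (1.0 - annual_probability) ** years


# A '1 in 1000 per year' event over a 40-year plant life: roughly 0.039.
print(round(prob_at_least_once(1e-3, 40), 3))

# A '1 x 10^-5 per year' event across a hypothetical fleet of 400 reactors,
# each running for 40 years (16,000 reactor-years): roughly 0.148.
print(round(prob_at_least_once(1e-5, 400 * 40), 3))
```

Seen this way, an event that sounds like a once-in-a-millennium problem has about a 4% chance of happening during the life of a single plant.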

Most risk assessment approaches deal with hazards singly. This simplifies the assessment process, but it can hide the effects of multiple failures or cascading failures. In a multiple failure condition, several protective measures fail simultaneously from a single cause (sometimes called Common Cause Failure). In this case, back-up measures may fail from the same cause, resulting in no protection from the hazard.

In a cascading failure, an initial failure is followed by a series of failures resulting in the partial or complete loss of the protective measures, resulting in partial or complete exposure to the hazard. Reasonably foreseeable combinations of failure modes in mission critical systems must be considered and the probability of each estimated.
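The effect of Common Cause Failure on a redundant pair of protective measures can be sketched with a simplified beta-factor model, where an assumed fraction `beta` of each channel's failures comes from a cause shared with the other channel. The function names and the example values of `p` and `beta` are assumptions for illustration, not figures from any real plant.

```python
def both_fail_independent(p: float) -> float:
    """Probability that two identical channels both fail, if fully independent."""
    return p * p


def both_fail_beta_factor(p: float, beta: float) -> float:
    """Simplified beta-factor estimate for a redundant pair.

    beta is the assumed fraction of each channel's failures that are
    common cause; those failures take out both channels at once.
    """
    independent_part = ((1.0 - beta) * p) ** 2
    common_part = beta * p
    return independent_part + common_part


p = 0.01     # assumed failure probability of one channel on demand
beta = 0.1   # assumed common cause fraction

print(both_fail_independent(p))
print(both_fail_beta_factor(p, beta))
```

With these assumed values, the common cause term dominates the result, and the redundant pair is roughly ten times more likely to fail than the independent calculation suggests. A backup that shares a vulnerability with the primary system provides far less protection than its redundancy implies.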

Combinations of hazards can result in synergy between the hazards, producing a higher level of severity from the combination than is present from any one of the hazards taken singly. Reasonably foreseeable combinations of hazards and their potential synergies must be identified and the risk estimated.

Oversimplification of the hazard identification and analysis processes can result in overlooking hazards or underestimating the risk.

Thinking about the Fukushima Dai Ichi plant again, the combination of the effects of the earthquake on the plant, with the added impact of the tsunami wave, resulted in the loss of primary power to the plant, followed by the loss of backup power from the backup generators and the subsequent partial meltdowns and explosions at the plant. This combination of earthquake and tsunami was well known, not some ‘unimaginable’ or ‘unforeseeable’ situation. When conducting risk assessments, all reasonably foreseeable combinations of hazards must be considered.

Abuse and Neglect

The risk assessment process is subject to abuse and neglect. Risk assessment has been used by some as a means to justify exposing workers and the public to risks that should not have been permitted. Skewing the results of the risk assessment, either by underestimating the risk initially or by overestimating the effectiveness and reliability of control measures, can lead to this situation. Decisions relating to the ‘tolerability’ or the ‘acceptability’ of risks when the severity of the potential consequences is high should be approached with great caution. In my opinion, unless you are personally willing to take the risk you are proposing to accept, it cannot be considered either tolerable or acceptable, regardless of the legal limits that may exist.

In the case of the Japanese nuclear plants, the operators have publicly admitted to falsifying inspection and repair records, and some of these falsifications have resulted in accidents and fatalities.

In 1990, the US Nuclear Regulatory Commission wrote a report on the Fukushima Dai Ichi plant that predicted the exact scenario that resulted in the current crisis. These findings were shared with the Japanese authorities and the operators, but no one in a position of authority took the findings seriously enough to do anything. Relatively simple and low-cost protective measures, like increasing the height of the protective sea wall along the coastline and moving the backup generators to high ground, could have prevented a national catastrophe and the complete loss of the plant.

A Useful Tool

Despite these human failings, I believe that risk assessment is an important tool. Increasingly sophisticated technology has rendered ‘common sense’ useless in many cases, because people do not have the expertise to have any common sense about the hazards related to these technologies.

Where hazards are well understood, they should be controlled with the simplest, most direct and effective measures available. In many cases this can be done by the people who first identify the hazard.

Where hazards are not well understood, bringing in experts with the knowledge to assess the risk and implement appropriate protective measures is the right approach.

The common aspect in all of this is the identification of hazards and the application of some sort of control measures. Risk assessment should not be neglected simply because it is sometimes difficult, because it can be done poorly, or because the results are sometimes neglected or ignored. We need to improve what we do with the results of these efforts, rather than neglect to do them at all.

In the meantime, the Japanese, and the world, have some cleanup to do.