Hazard Calculators


Common code for the hazard calculators.

class openquake.engine.calculators.hazard.general.BaseHazardCalculator(job)[source]

Bases: openquake.engine.calculators.base.Calculator

Abstract base class for hazard calculators. Provides functionality shared by all hazard calculators, such as initialization procedures.

check_limits(input_weight, output_weight)[source]

Compute the total weight of the source model and the expected output size and compare them with the parameters max_input_weight and max_output_weight in openquake.cfg; if the parameters are set


Create records for the hzrdr.lt_realization.

This function works either in random sampling mode (when lt_realization models get the random seed value) or in enumeration mode (when weight values are populated). In both cases we record the logic tree paths for both trees in the lt_realization record, as well as the ordinal number of the realization (zero-based).


Initialize realizations


Optionally generates aggregate curves, hazard maps and uniform_hazard_spectra.


Initialize risk models, site model and sources

save_curves_for_rlz_imt(rlz, imt, imls, points, curves)[source]

Save the curves corresponding to a given realization and IMT.

  • rlz – a LtRealization instance
  • imt – an IMT string
  • imls – the intensity measure levels for the given IMT
  • points – the points associated to the curves
  • curves – the curves
tilepath = ()
exception openquake.engine.calculators.hazard.general.InputWeightLimit[source]

Bases: exceptions.Exception

exception openquake.engine.calculators.hazard.general.OutputWeightLimit[source]

Bases: exceptions.Exception

openquake.engine.calculators.hazard.general.all_equal(obj, value)[source]
  • obj – a numpy array or something else
  • value – a numeric value

a boolean

Classical PSHA Calculator

The Classical Probabilistic Seismic Hazard Analysis (cPSHA) approach allows calculation of hazard curves and hazard maps following the classical integration procedure (Cornell [1968], McGuire [1976]) as formulated by Field et al. [2003].


  • Cornell, C. A. (1968).
    Engineering seismic risk analysis.
    Bulletin of the Seismological Society of America, 58:1583–1606.
  • Field, E. H., Jordan, T. H., and Cornell, C. A. (2003).
    OpenSHA - A developing Community-Modeling
    Environment for Seismic Hazard Analysis. Seism. Res. Lett., 74:406–419.
  • McGuire, R. K. (1976).
    Fortran computer program for seismic risk analysis. Open-File report 76-67,
    United States Department of the Interior, Geological Survey. 102 pages.


Hazard Curves

Hazard curves are discrete functions which describe the probability of ground motion exceedance in a given time frame. Hazard curves are composed of several key elements:

  • Intensity Measure Levels (IMLs) - IMLs define the x-axis values (or “abscissae”) of the curve. IMLs are expressed in the unit of the Intensity Measure Type (IMT) and form a strictly monotonically increasing sequence.
  • Probabilities of Exceedance (PoEs) - PoEs define the y-axis values (or “ordinates”) of the curve. For each node in the curve, the PoE denotes the probability that ground motion will exceed a given level in a given time span.
  • Intensity Measure Type (IMT) - The unit of measurement for the defined IMLs.
  • Investigation time - The period of time (in years) covered by an earthquake hazard study. It is important to consider the investigation time when analyzing hazard curve results: the longer the time span, the greater the probability of ground motion exceeding the given values.
  • Spectral Acceleration (SA) Period - Optional; used only if the IMT is SA.
  • Spectral Acceleration (SA) Damping - Optional; used only if the IMT is SA.
  • Source Model Logic Tree Path (SMLT Path) - The path taken through the calculation’s source model logic tree. Does not apply to statistical curves, since these aggregates are computed over multiple logic tree realizations.
  • GSIM (Ground Shaking Intensity Model) Logic Tree Path (GSIMLT Path) - The path taken through the calculation’s GSIM logic tree. As with the SMLT Path, this does not apply to statistical curves.

For a given calculation, hazard curves are computed for each logic tree realization, each IMT/IML definition, and each geographical point of interest. (In other words: If a calculation specifies 4 logic tree samples, a geometry with 10 points of interest, and 3 IMT/IML definitions, 120 curves will be computed.)

Another way to put it is:

T = R * P * I


  • T is the total number of curves
  • R is the total number of logic tree realizations
  • P is the number of geographical points of interest
  • I is the number of IMT/IML definitions
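As a quick check of the formula above, the count can be written as a one-line function. This is only an illustrative sketch; the function name `count_hazard_curves` is hypothetical and is not part of the engine.

```python
def count_hazard_curves(num_realizations, num_points, num_imts):
    """Return T = R * P * I, the number of per-realization hazard curves."""
    return num_realizations * num_points * num_imts


# The example from the text: 4 logic tree samples, 10 points, 3 IMT/IML definitions.
total = count_hazard_curves(num_realizations=4, num_points=10, num_imts=3)
print(total)  # 120
```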

Hazard curves are grouped by IMT and realization (1 group per IMT per realization). Each group includes 1 curve for each point of interest.

Additionally, for each realization a hazard curve container (with output_type equal to hazard_curve_multi) is created. This container is useful where a whole group of hazard curves sharing the same realization must be identified, for example when running a risk calculation that supports structure-dependent intensity measure types.

Statistical Curves

The classical hazard calculator is also capable of producing mean and quantile curves. These aggregates are computed from the curves for a given point and IMT over all logic tree realizations.

Similar to hazard curves for individual realizations, statistical hazard curves are grouped by IMT and statistic type. (For quantiles, groups are separated by quantile level.) Each group includes 1 curve for each point of interest.

Mean Curves

Mean hazard curves can be computed by specifying mean_hazard_curves = true in the job configuration.

When computing a mean hazard curve for a given point/IMT, there are two possible approaches:

  1. mean, unweighted
  2. mean, weighted

Technically, both approaches are “weighted”. In the first approach, however, the weights are implicit and are taken into account in the process of logic tree sampling. This approach is used in the case of random Monte-Carlo logic tree sampling. The total number of logic tree samples is defined by the user with the number_of_logic_tree_samples configuration parameter.

In the second approach, the weights are explicit in the calculation of the mean. This approach is used in the case of end-branch logic tree enumeration, whereby each possible logic tree path is traversed. (Each logic tree path in this case defines a weight.) The total number of logic tree samples in this case is determined by the total number of possible tree paths. (To perform end-branch enumeration, the user must specify number_of_logic_tree_samples = 0 in the job configuration.)
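The difference between the two approaches can be sketched with numpy. This is a minimal illustration, not the engine's code: the function name `mean_curve` and the sample PoE values are hypothetical, and each row of the input is assumed to hold the PoEs of one realization for the same point and IMT.

```python
import numpy as np


def mean_curve(poes_by_rlz, weights=None):
    """Average hazard curves over realizations.

    poes_by_rlz: one row of PoEs per realization, all on the same IMLs.
    weights: None for sampled logic trees (implicit weights), or one
        explicit weight per realization for end-branch enumeration.
    """
    poes = np.asarray(poes_by_rlz, dtype=float)
    if weights is None:
        # Monte-Carlo sampling: every sample counts equally.
        return poes.mean(axis=0)
    # Enumeration: each logic tree path carries its own weight.
    return np.average(poes, axis=0, weights=np.asarray(weights, dtype=float))


curves = [[0.9, 0.5, 0.1],
          [0.7, 0.3, 0.0]]
unweighted = mean_curve(curves)
weighted = mean_curve(curves, weights=[0.75, 0.25])
```

With equal explicit weights the two approaches give the same curve, which is one way to read the statement that both are technically “weighted”.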

The total number of mean curves calculated is

T = P * I


  • T is the total number of curves
  • P is the number of geographical points of interest
  • I is the number of IMT/IML definitions

A hazard curve set grouping all the mean curves (of type hazard_curve_multi) is also produced.

Quantile Curves

Quantile hazard curves can be computed by specifying one or more quantile_hazard_curves values (for example, quantile_hazard_curves = 0.15, 0.85) in the job configuration.

As with mean curves, there are two possible approaches to computing a quantile curve for a given point/IMT/quantile level (in the range [0.0, 1.0]):

  1. quantile, unweighted
  2. quantile, weighted

As with mean curves, unweighted quantiles are calculated when Monte-Carlo logic tree sampling is used and weighted quantiles are calculated when logic tree end-branch enumeration is used.
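One possible way to compute both kinds of quantile is sketched below. The function names are hypothetical, and the interpolation scheme (numpy's default linear quantile for the unweighted case, and interpolation over cumulative weights for the weighted case) is an assumption made for illustration; the engine may use a different estimator.

```python
import numpy as np


def unweighted_quantile_curve(poes_by_rlz, quantile):
    """Quantile over realizations when all samples count equally."""
    return np.quantile(np.asarray(poes_by_rlz, dtype=float), quantile, axis=0)


def weighted_quantile_curve(poes_by_rlz, weights, quantile):
    """Quantile over realizations using explicit logic tree weights."""
    poes = np.asarray(poes_by_rlz, dtype=float)
    weights = np.asarray(weights, dtype=float)
    result = []
    for column in poes.T:  # one column per IML
        order = np.argsort(column)
        cumulative = np.cumsum(weights[order])
        # Read off the PoE at which the cumulative weight reaches the quantile.
        result.append(np.interp(quantile, cumulative, column[order]))
    return np.array(result)


curves = [[0.9, 0.5],
          [0.7, 0.3],
          [0.8, 0.4]]
median_unweighted = unweighted_quantile_curve(curves, 0.5)
median_weighted = weighted_quantile_curve(curves, [0.2, 0.5, 0.3], 0.5)
```

In this example the realization with the lowest PoEs carries half of the total weight, so the weighted median falls on that curve while the unweighted median falls on the middle one.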

The total number of quantile curves calculated is

T = Q * P * I


  • T is the total number of curves
  • Q is the number of quantile levels
  • P is the number of geographical points of interest
  • I is the number of IMT/IML definitions

As above, curves sharing the same quantile are also grouped into a virtual output container of type hazard_curve_multi.

Hazard Maps

Hazard maps are geographical meshes of intensity values. Intensity values are extracted from hazard curve functions by interpolating at a given probability of exceedance. To put it another way, hazard maps seek to answer the following question: “At the given probability level, what intensity level is likely to be exceeded at a given geographical point in the next X years?”
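The interpolation step can be sketched for a single site as follows. This is a simplified illustration that assumes linear interpolation and strictly decreasing PoEs; the function name `iml_at_poe` is hypothetical, and the engine may interpolate differently (for example in logarithmic space).

```python
import numpy as np


def iml_at_poe(imls, poes, target_poe):
    """Interpolate the IML at which a hazard curve reaches target_poe.

    imls must be increasing and poes strictly decreasing. np.interp needs
    increasing x values, so both arrays are reversed first. Targets outside
    the range of the curve are clamped to the first or last IML.
    """
    imls = np.asarray(imls, dtype=float)
    poes = np.asarray(poes, dtype=float)
    return float(np.interp(target_poe, poes[::-1], imls[::-1]))


# One hazard curve; the map value at PoE = 0.25 lies between the
# nodes (0.2, 0.4) and (0.4, 0.1).
map_value = iml_at_poe(imls=[0.1, 0.2, 0.4], poes=[0.8, 0.4, 0.1], target_poe=0.25)
```

Repeating this for every site of the calculation geometry yields one hazard map for the chosen probability of exceedance.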

The resulting geographical mesh is often depicted graphically, with a color key defining which color to plot at the given location for a given value or range of values.

Hazard maps bear the same metadata as hazard curves, with the addition of the probability at which the hazard maps were computed.

For a given calculation, hazard maps are computed for each hazard curve. Maps can be computed for one or more probabilities of exceedance, so the total number of hazard maps is

T = C * E


  • T is the total number of maps
  • C is the total number of hazard curves (see the method for calculating the number of hazard curves)
  • E is the total number of probabilities of exceedance

Note: This includes mean and quantile maps.

Statistical Maps

Hazard maps can be produced from any set of hazard curves, including mean and quantile aggregates. There are no special methods required for computing these maps; the process is the same for all hazard map computation.

Uniform Hazard Spectra

Uniform Hazard Spectra (UHS) are discrete functions which are essentially derived from hazard maps. Thus, hazard map computation is a prerequisite step in producing UHS. UHS derivation isn’t so much a computation, but rather a special arrangement or “view” of hazard map data.

UHS “curves” are composed of a few key elements:

  • Spectral Acceleration Periods - These values make up the x-axis values (“abscissae”) of the curve.
  • Intensity Measure Levels - These values make up the y-axis values (“ordinates”) of the curve.
  • Probability of Exceedance - The hazard map probability value from which the UHS is derived. The “Uniform” in UHS indicates a uniform PoE over all periods.
  • Location - A 2D geographical point, consisting of longitude and latitude.

To construct UHS from a set of hazard maps, one can conceptualize this process as simply extracting from multiple hazard maps all of the intensity measure levels for a given location and arranging values in order of SA period, beginning with the lowest period value. This is done for all locations.

Note: All maps with IMT = SA are considered, in addition to PGA. PGA is equivalent to SA(0.0). Hazard maps with other IMTs (such as PGV or PGD) are ignored.

The example below illustrates extracting the IML values for a given location (indicated by x) from three hazard maps:

Hazard maps PoE: 0.1

   /              /<-- PGA [equivalent to SA(0.0)]
  /     x        /-/
 /--------------/ /<-- SA(0.025)
  /     x        /-/
 /--------------/ /<-- SA(0.1)
  /     x        /-/
 /--------------/ /<-- SA(0.2)
  /     x        /

Assuming that the IMLs from the PGA, SA(0.025), SA(0.1), and SA(0.2) maps are 0.3, 0.5, 0.2, and 0.1, respectively, the resulting UHS curve would look like this:

0.5           *
0.3   *
0.2                   *
0.1                           *
 +----|-------|-------|-------|----> [SA Period]
     0.0    0.025    0.1     0.2
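The extraction shown above can be sketched in a few lines of Python. The dictionary keys (`imt`, `sa_period`, `lons`, `lats`, `imls`) and the function name `make_uhs_sketch` are hypothetical stand-ins for hazard map records; only the PGA/SA filtering and the ordering by period follow the description in this section.

```python
def make_uhs_sketch(maps):
    """Arrange hazard map values into one spectrum per location."""
    usable = []
    for hazard_map in maps:
        if hazard_map["imt"] == "PGA":
            usable.append((0.0, hazard_map))  # PGA is treated as SA(0.0)
        elif hazard_map["imt"] == "SA":
            usable.append((hazard_map["sa_period"], hazard_map))
        # Maps with other IMTs (PGV, PGD, ...) are ignored.
    if not usable:
        return {"periods": [], "uh_spectra": []}
    usable.sort(key=lambda pair: pair[0])
    periods = [period for period, _ in usable]
    first = usable[0][1]  # all maps are assumed to share the same locations
    spectra = []
    for i, (lon, lat) in enumerate(zip(first["lons"], first["lats"])):
        imls = tuple(hazard_map["imls"][i] for _, hazard_map in usable)
        spectra.append((lon, lat, imls))
    return {"periods": periods, "uh_spectra": spectra}


site = {"lons": [10.0], "lats": [45.0]}
maps = [
    dict(site, imt="SA", sa_period=0.1, imls=[0.2]),
    dict(site, imt="PGA", sa_period=None, imls=[0.3]),
    dict(site, imt="PGV", sa_period=None, imls=[9.9]),
    dict(site, imt="SA", sa_period=0.2, imls=[0.1]),
    dict(site, imt="SA", sa_period=0.025, imls=[0.5]),
]
uhs = make_uhs_sketch(maps)
print(uhs["periods"])     # [0.0, 0.025, 0.1, 0.2]
print(uhs["uh_spectra"])  # [(10.0, 45.0, (0.3, 0.5, 0.2, 0.1))]
```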

Uniform Hazard Spectra are grouped into result sets where each result set corresponds to a probability of exceedance and either a logic tree realization or statistical aggregate of realizations. Each result set contains a curve for each geographical point of interest in the calculation.

The number of UHS results (each containing curves for all sites) is

Tr = E * (Q + M + R)


  • Tr is the total number of result sets (and also the number of files, if the results are exported)
  • E is the total number of probabilities of exceedance
  • Q is the number of quantile levels
  • M is 1 if the calculation computes mean results, else 0
  • R is the total number of logic tree realizations

The total number of UHS curves is

T = Tr * P


  • Tr is the total number of result sets (see above)
  • P is the number of geographical points of interest
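The two UHS formulas can be combined in a small helper, shown here as an illustrative sketch. The function name `count_uhs` and the example numbers are hypothetical.

```python
def count_uhs(num_poes, num_quantiles, mean, num_realizations, num_points):
    """Return (Tr, T): UHS result sets and total UHS curves."""
    m = 1 if mean else 0
    result_sets = num_poes * (num_quantiles + m + num_realizations)  # Tr = E * (Q + M + R)
    return result_sets, result_sets * num_points                     # T = Tr * P


# 2 PoEs, 2 quantile levels, mean enabled, 4 realizations, 10 points.
result_sets, curves = count_uhs(2, 2, True, 4, 10)
print(result_sets, curves)  # 14 140
```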

Classical PSHA Core

Core functionality for the classical PSHA hazard calculator.

The difficult part of this calculator is the management of the logic tree realizations. The following real-life case illustrates how it works.

We want to perform a SHARE calculation with a single source model with two tectonic region types (Stable Shallow Crust and Active Shallow Crust) and a complex GMPE logic tree with 7 branching points with the following structure:

  • Active_Shallow: 4 GMPEs, weights 0.35, 0.35, 0.20, 0.10
  • Stable_Shallow: 5 GMPEs, weights 0.2, 0.2, 0.2, 0.2, 0.2
  • Shield: 2 GMPEs, weights 0.5, 0.5
  • Subduction_Interface: 4 GMPEs, weights 0.2, 0.2, 0.2, 0.4
  • Subduction_InSlab: 4 GMPEs, weights 0.2, 0.2, 0.2, 0.4
  • Volcanic: 1 GMPE, weight 1
  • Deep: 2 GMPEs, weights 0.6, 0.4

The number of realizations generated by this logic tree is 4 * 5 * 2 * 4 * 4 * 1 * 2 = 1280, and at the end we will generate 1280 hazard curve containers. However, there are far fewer independent hazard curve containers: only 9 of them, 4 coming from the Active Shallow Crust tectonic region type and 5 from the Stable Shallow Crust tectonic region type. The dependent hazard curves (i.e. hazard curves by realization) can be obtained from the independent hazard curves by a composition rule implemented in openquake.engine.models.HazardCurve.build_data.

NB: notice that 1280 / 9 = 142.22, therefore storing all the hazard curves takes 140+ times more disk space and resources than actually needed. The plan for the future is to store only the independent curves and to compose them on-the-fly when they are looked up for a given realization, since the composition is fast (just numpy multiplications), faster than reading all the redundant data from the database. The impact on hazard maps has yet to be determined.
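The counts in this example can be reproduced with a short script. The composition function at the end is an assumption made for illustration: it combines per-tectonic-region curves as independent probabilities, 1 - prod(1 - PoE), which is consistent with the “numpy multiplications” mentioned above but is not taken from the build_data implementation. The name `compose` and the sample PoEs are hypothetical. `math.prod` requires Python 3.8 or later.

```python
import math

import numpy as np

gmpe_weights = {
    "Active_Shallow": [0.35, 0.35, 0.20, 0.10],
    "Stable_Shallow": [0.2, 0.2, 0.2, 0.2, 0.2],
    "Shield": [0.5, 0.5],
    "Subduction_Interface": [0.2, 0.2, 0.2, 0.4],
    "Subduction_InSlab": [0.2, 0.2, 0.2, 0.4],
    "Volcanic": [1.0],
    "Deep": [0.6, 0.4],
}

# One realization per combination of GMPE choices across all branching points.
num_realizations = math.prod(len(w) for w in gmpe_weights.values())

# Only the tectonic region types present in the source model produce curves.
source_model_trts = ["Active_Shallow", "Stable_Shallow"]
num_independent = sum(len(gmpe_weights[trt]) for trt in source_model_trts)


def compose(poes_by_trt):
    """Combine one curve per tectonic region type (assumed independent)."""
    no_exceedance = np.prod(1.0 - np.asarray(poes_by_trt, dtype=float), axis=0)
    return 1.0 - no_exceedance


print(num_realizations, num_independent)  # 1280 9
rlz_curve = compose([[0.5, 0.1],   # one Active Shallow Crust curve
                     [0.2, 0.0]])  # one Stable Shallow Crust curve
```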

class openquake.engine.calculators.hazard.classical.core.BoundingBox(lt_model_id, site_id)[source]

Bases: object

A class to store the bounding box in distances, longitudes and magnitudes, given a source model and a site. This is used for disaggregation calculations. The goal is to determine the minimum and maximum distances from the site of the ruptures generated by the model; the minimum and maximum longitudes and magnitudes are also stored, taking into account the international date line.

bins_edges(dist_bin_width, coord_bin_width)[source]

Define bin edges for disaggregation histograms, from the bin data collected from the ruptures.

  • dists – array of distances from the ruptures
  • lons – array of longitudes from the ruptures
  • lats – array of latitudes from the ruptures
  • dist_bin_width – distance_bin_width from job.ini
  • coord_bin_width – coordinate_bin_width from job.ini
update(dists, lons, lats)[source]

Compare the current bounding box with the value in the arrays dists, lons, lats and enlarge it if needed.

  • dists – a sequence of distances
  • lons – a sequence of longitudes
  • lats – a sequence of latitudes

Compare the current bounding box with the given bounding box and enlarge it if needed.

Parameters:bb – an instance of :class: openquake.engine.calculators.hazard.classical.core.BoundingBox
class openquake.engine.calculators.hazard.classical.core.ClassicalHazardCalculator(job)[source]

Bases: openquake.engine.calculators.hazard.general.BaseHazardCalculator

Classical PSHA hazard calculator. Computes hazard curves for a given set of points.

For each realization of the calculation, we randomly sample source models and GMPEs (Ground Motion Prediction Equations) from logic trees.

core_calc_task = <@task: openquake.engine.calculators.hazard.classical.core.compute_hazard_curves>

Do pre-execution work. At the moment, this work entails: parsing and initializing sources, parsing and initializing the site model (if there is one), parsing vulnerability and exposure files, and generating logic tree realizations. (The latter defines the work to be done in the execute phase.)

Hazard Curves Post-Processing

Post processing functionality for the classical PSHA hazard calculator. E.g. mean and quantile curves.


Compute and save (to the DB) Uniform Hazard Spectra for all hazard maps for the given job.

Parameters:job – Instance of openquake.engine.db.models.OqJob.

Make Uniform Hazard Spectra curves for each location.

It is assumed that the lons and lats for each of the maps are uniform.

Parameters:maps – A sequence of openquake.engine.db.models.HazardMap objects, or equivalent objects with the same attributes.
A dict with two values:
  • periods: a list of the SA periods from all of the maps, sorted in ascending order
  • uh_spectra: a list of triples (lon, lat, imls), where imls is a tuple of the IMLs from all maps for each of the periods

Event-Based PSHA Calculator

The Event-Based Probabilistic Seismic Hazard Analysis (ePSHA) approach allows calculation of ground-motion fields from stochastic event sets. Eventually, Classical PSHA results - such as hazard curves - can be obtained by post-processing the set of computed ground-motion fields.



Stochastic Event Sets are collections of ruptures, where each rupture is composed of:

  • magnitude value
  • tectonic region type
  • source type (indicating whether the rupture originated from a point/area source or from a fault source)
  • details of the rupture geometry (including lat/lon/depth coordinates, strike, dip, and rake)

SES results are structured into a 3-level hierarchy, consisting of SES “Collections”, SESs, and Ruptures. For each end-branch of a logic tree, 1 SES Collection is produced. For each SES Collection, Stochastic Event Sets are computed in a quantity equal to the calculation parameter ses_per_logic_tree_path. Finally, each SES contains a number of ruptures, but the quantity is more or less random. There are a few factors which determine the production of ruptures:

  • the Magnitude-Frequency Distribution (MFD) of each seismic source considered
  • investigation time
  • random seed

From a scientific standpoint, the MFD for each seismic source defines the rupture “occurrence rate”. Combined with the investigation time (specified by the investigation_time parameter), these two factors determine the probability of rupture occurrence, and thus determine how many ruptures will occur in a given calculation scenario.

From a software implementation standpoint, the random seed also affects rupture generation. A “base seed” is specified by the user in the calculation configuration file (using the parameter random_seed). When a calculation runs, the total work is divided into small, independent asynchronous tasks. Given the base seed, additional “task seeds” are generated and passed to each task. Each task then uses this seed to control the random sampling which occurs during the SES calculation. Structuring the calculation in this way guarantees consistent, reproducible results regardless of the operating system, task execution order, or architecture (32-bit or 64-bit).
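The idea of deriving per-task seeds from a single base seed can be sketched with the standard library. How the engine actually derives its task seeds is not described here, so the scheme below (drawing integers from a generator seeded with the base seed) and the function name `task_seeds` are purely hypothetical.

```python
import random


def task_seeds(base_seed, num_tasks):
    """Derive one reproducible seed per task from the base seed."""
    generator = random.Random(base_seed)
    return [generator.randint(0, 2 ** 31 - 1) for _ in range(num_tasks)]


seeds = task_seeds(base_seed=42, num_tasks=4)

# Each task samples with its own generator, so the results do not depend
# on the order in which the tasks happen to run.
samples = {seed: random.Random(seed).random() for seed in reversed(seeds)}
```

Because every task owns its generator, rerunning the calculation with the same base seed reproduces the same per-task random streams regardless of scheduling.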


Event-Based hazard calculations always produce SESs and ruptures. The user can choose to perform additional computation and produce Ground Motion Fields from each computed rupture. (In typical use cases, a user of the event-based hazard calculator will want to compute GMFs.)

GMF results are structured into a hierarchy very similar to SESs, consisting of GMF “Collections”, GMF “Sets”, and GMFs. Each GMF Collection is directly associated with an SES Collection, and thus with a logic tree realization. Each GMF Set is associated with an SES. For each rupture in an SES, 1 GMF is calculated for each IMT (Intensity Measure Type). (IMTs are defined by the config parameter intensity_measure_types.) Finally, each GMF consists of multiple “nodes”, where each node is composed of longitude, latitude, and ground motion values (GMV). The sites (lon/lat) of each GMF are defined by the calculation geometry, which is specified by the region or sites configuration parameters. (In other words, if the calculation geometry consists of 10 points/sites, each computed GMF will include 10 nodes, 1 for each location.)

Hazard Curves

It is possible to produce hazard curves from GMFs (as an alternative to the hazard curve calculation method employed in the Classical hazard calculator). The user can activate this calculation option by specifying hazard_curves_from_gmfs = true in the configuration parameters. All hazard curve post-processing options are available as well: mean_hazard_curves, quantile_hazard_curves, and poes (for producing hazard maps).

Hazard curves are computed from GMFs as follows:

  • For each logic tree realization, IMT, and location, there exist a number of hzrdr.gmf records exactly equal to the ses_per_logic_tree_path parameter. Each record contains an array with a number of ground motion values; this number is determined by the number of ruptures in a given stochastic event set (which is random; see the section “SESs” above). All of these lists of GMVs are flattened into a single list of GMVs (the size of which is unknown, due to the random element mentioned above).
  • With this list of GMVs, a list of IMLs (Intensity Measure Levels) for the given IMT (defined in the configuration file as intensity_measure_types_and_levels), investigation_time, and “duration” (computed as investigation_time * ses_per_logic_tree_path), we compute the PoEs (Probabilities of Exceedance). See openquake.engine.calculators.hazard.event_based.post_processing.gmvs_to_haz_curve() for implementation details.
  • The PoEs make up the “ordinates” (y-axis values) of the produced hazard curve. The IMLs define the “abscissae” (x-axis values).
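The steps above can be sketched as follows. This is an illustration under stated assumptions, not a copy of gmvs_to_haz_curve(): it assumes a Poissonian occurrence model, counts a GMV as an exceedance when it is greater than or equal to the IML, and uses the hypothetical name `gmvs_to_poes`.

```python
import numpy as np


def gmvs_to_poes(gmvs, imls, investigation_time, duration):
    """Turn a flat list of GMVs into PoEs, one per IML.

    duration is investigation_time * ses_per_logic_tree_path.
    """
    gmvs = np.asarray(gmvs, dtype=float)
    # Number of simulated ground motion values reaching each level.
    num_exceeding = np.array([(gmvs >= iml).sum() for iml in imls])
    # Annual rate of exceedance over the whole simulated duration.
    rates = num_exceeding / duration
    # Poisson probability of at least one exceedance in the investigation time.
    return 1.0 - np.exp(-rates * investigation_time)


investigation_time = 50.0
ses_per_logic_tree_path = 10
poes = gmvs_to_poes(
    gmvs=[0.05, 0.15, 0.25, 0.35],
    imls=[0.1, 0.2, 0.3],
    investigation_time=investigation_time,
    duration=investigation_time * ses_per_logic_tree_path,
)
```

In this example 3, 2 and 1 of the GMVs reach the three levels, so the resulting PoEs decrease as the IML increases.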

As with the Classical calculator, it is possible to produce mean and quantile statistical aggregates of curve results.

Hazard Maps

The Event-Based Hazard calculator is capable of producing hazard maps for each logic tree realization, as well as mean and quantile aggregates. This method of extracting maps from hazard curves is identical to the Classical calculator.

See hazard maps for more information.

Event-Based Core

Core calculator functionality for computing stochastic event sets and ground motion fields using the ‘event-based’ method.

Stochastic events sets (which can be thought of as collections of ruptures) are computed given a set of seismic sources and investigation time span (in years).

For more information on computing stochastic event sets, see openquake.hazardlib.calc.stochastic.

One can optionally compute a ground motion field (GMF) given a rupture, a site collection (which is a collection of geographical points with associated soil parameters), and a ground shaking intensity model (GSIM).

For more information on computing ground motion fields, see openquake.hazardlib.calc.gmf.

class openquake.engine.calculators.hazard.event_based.core.EBCalculator(job)[source]

Bases: openquake.calculators.event_based.EventBasedCalculator

Event Based hazard calculator.


Scenario Calculator

The Scenario Seismic Hazard Analysis (SSHA) approach allows calculation of ground motion fields from a single earthquake rupture scenario taking into account ground-motion aleatory variability.

Disaggregation Calculator

The Disaggregation approach allows calculation of the relative contributions to a seismic hazard level. Contributions are defined in terms of latitude, longitude, magnitude, distance, epsilon, and tectonic region type.


  • Disaggregation of Seismic Hazard
    by Paolo Bazzurro and C. Allin Cornell
    Bulletin of the Seismological Society of America, 89, 2, pp. 501-520, April 1999


Hazard Curves

Hazard curve calculation is the first phase in a Disaggregation calculation. This phase computes the hazard for a given location by aggregating contributions from all relevant seismic sources in a given model. (The method for computing these curves is exactly the same as the Classical approach.)

Mean and quantile post-processing options for hazard curves are not enabled for the Disaggregation calculator.

Disaggregation Matrices

Once hazard curves are computed for all sites and logic tree realizations, the second phase (disaggregation) begins. While the hazard curve calculation phase is concerned with aggregating the hazard contributions from all sources, the disaggregation phase seeks to quantify the contributions from the various ruptures generated by the source model to the hazard level at a given probability of exceedance (for a given geographical point) in terms of:

  • Longitude
  • Latitude
  • Magnitude (in Mw, or “Moment Magnitude”)
  • Distance (in km)
  • Epsilon (the difference in terms of standard deviations between IML to be disaggregated and the mean value predicted by the GMPE)
  • Tectonic Region Type

This analysis, which operates on a single geographical point and all seismic sources for a given logic tree realization, results in a matrix of 6 dimensions. Each axis is divided into multiple bins, the size and quantity of which are determined by the calculation inputs. Longitude and latitude bins are determined by the coordinate_bin_width calculation parameter, in units of decimal degrees. Magnitude bins are determined by the mag_bin_width parameter. Distance bins are determined by the distance_bin_width parameter, in units of kilometers. num_epsilon_bins defines the quantity of epsilon bins. truncation_level is taken into account when computing the width of each epsilon bin, and so this is a required parameter. The number of tectonic region type bins is simply determined by the variety of tectonic regions specified in a given seismic source model. (For instance, if a source model defines sources for “Active Shallow Crust” and “Volcanic”, this will result in two bins.)
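A possible way to derive bin edges from these parameters is sketched below. The function names and the rounding of edges to multiples of the bin width are assumptions made for illustration, as is the choice of spreading the epsilon bins evenly between -truncation_level and +truncation_level; the engine's own binning may differ in detail.

```python
import numpy as np


def width_bin_edges(min_value, max_value, bin_width):
    """Edges at multiples of bin_width covering [min_value, max_value]."""
    first = int(np.floor(min_value / bin_width))
    last = int(np.ceil(max_value / bin_width))
    return bin_width * np.arange(first, last + 1)


def epsilon_bin_edges(truncation_level, num_epsilon_bins):
    """Equal-width epsilon bins between -truncation_level and +truncation_level."""
    return np.linspace(-truncation_level, truncation_level, num_epsilon_bins + 1)


# Ruptures with magnitudes between 5.2 and 6.9 and mag_bin_width = 0.5.
mag_edges = width_bin_edges(5.2, 6.9, 0.5)
# truncation_level = 3 and num_epsilon_bins = 6 give bins one sigma wide.
eps_edges = epsilon_bin_edges(3.0, 6)
```

The same width-based helper would apply to distance (distance_bin_width) and to longitude and latitude (coordinate_bin_width), with the date-line handling left out of this sketch.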

The final results of a disaggregation calculation are various sub-matrices extracted from the 6-dimensional matrix. These sub-matrices include common combinations of terms, which are as follows:

  • Magnitude
  • Distance
  • Tectonic Region Type
  • Magnitude, Distance, and Epsilon
  • Longitude and Latitude
  • Magnitude, Longitude, and Latitude
  • Longitude, Latitude, and Tectonic Region Type

Each disaggregation result produced by the calculator includes all of these.

The total number of disaggregation results produced by the calculator is

T = E * R * I * P


  • T is the total number of disaggregation results
  • E is the total number of probabilities of exceedance (defined by poes_disagg)
  • R is the total number of logic tree realizations
  • I is the number of IMT/IML definitions
  • P is the number of points with non-zero hazard (see note below)

Note: To avoid wasting computation time and storage, if the hazard curve used to compute a disaggregation for a given point and IMT contains only zero probabilities, we do not compute a disaggregation matrix for that point.