Release notes v3.13#
The release 3.13 is the result of 4 months of work involving over 350 pull requests and touching all aspects of the engine. In particular, this is the minimal release needed to run the new 2020 Euro-Mediterranean Seismic Hazard Model (ESHM20) and the new GEM China model.
The complete list of changes is listed in the changelog:
https://github.com/gem/oq-engine/blob/engine-3.13/debian/changelog
New features#
We have a brand new conditional spectrum calculator documented here:
https://docs.openquake.org/oq-engine/advanced/conditional-spectrum.html
For the moment, it should still be considered experimental; we welcome help from people wanting to test it. This is a functionality implemented as part of the activities GEM is performing within the METIS project (https://metis-h2020.eu).
The event_based_damage calculator introduced in version 3.12 has been revised, unified with the scenario_damage calculator and documented here:
https://docs.openquake.org/oq-engine/advanced/event-based-damage.html
In particular it is possible to work with discrete damage distributions
by setting the parameter discrete_damage_distribution=true
(the
default is false, which is much more performant). Moreover, a few bugs
were fixed.
We now support four different strategies for computing risk profiles, documented here:
https://github.com/gem/oq-engine/blob/engine-3.13/doc/adv-manual/risk-profiles.rst
Global site model parameters (i.e. the site model parameters defined in the job.ini file) are now used if a column is missing in the site model file, instead of raising an error. Only the site model parameters required by the GMPEs are imported in the datastore, additional columns in the site model file are ignored (previously, they were imported even unnecessarily).
Finally, the engine now supports cross correlation (i.e. inter-period correlation) when generating the ground motion fields. Two cross correlation models are supported at the moment (FullCrossCorrelation and GodaAtkinson2009), more will likely be added in the future. The feature is documented here:
https://docs.openquake.org/oq-engine/advanced/correlation.html
Improvements to the classical calculator#
In order to support the ESHM20 model, which is too large to run on
most hardware, we had to implement a tiling functionality, so that
a classical calculation can be split into chunks of sites (tiles) that
are then run sequentially. This is slower than the usual strategy of
running everything in a single tile and should be used only as last
resort, in out-of-memory situations. Also, you should use a small
number of tiles to reduce the performance penalty. For instance,
for a calculation with 100,000 sites, setting max_sites_per_tile=50000
in the job.ini file will produce two tiles and reduce by half the memory
consumption.
In order to avoid slow tasks in tiling calculations that would completely
kill the performance, enormous efforts was dedicated to improving the task
distribution which is now sensibly better than before, in particular
in the presence of CollapsedPointSources and MultiFaultSources. Slow tasks
are still possible in corner case situations but they can be mitigated
by tweaking the parameters time_per_task
and outs_per_task
.
The internals of the datastore have changed significantly, in particular
for the source_info
dataset; there is now an additional source_data
dataset. They are used internally to understand the content of the tasks
in terms of sources and the calculation time per source.
The memory occupation has been significantly reduced, even without tiling, and calculations that were previously running out of of memory are now progressing smoothly. The reduction depends on the model: for instance the OpenQuake implementation of the USGS model for the United States now requires half the memory than before, while the NRCan model for Canada requires the same memory as before. Some small models could possibly require more memory than before. The memory reduction is particularly sensible in models with few sites and in disaggregation calculations.
The disk space occupation has been significantly reduced by changing the ways the PoEs are stored (not storing the zero values). This was necessary for the ESHM20 model where the required disk space was reduced from 256 GB to 110 GB per calculation. Changing the storage had also a positive effect on performance, since the calculation is not stuck writing the data anymore. Moreover, thanks to the new storage format, it is much simpler to perform manual post-processing of the PoEs by using pandas.
We changed the approach used when computing the hazard curves and the
statistics: it is now several times faster and using a lot less memory.
The improvement has come both from the change in storage (there is less data
to read) and the usage of numba in get_slices
. The effect is dramatic
in the ESHM20 model, much less so in smaller models.
The documentation of the point source gridding approximation has been improved, with a section about how to make the Canada model 26x faster:
https://docs.openquake.org/oq-engine/advanced/point-source-gridding.html
We improved the precision of the pointsource_distance
approximation:
that means the now you can use smaller values for the pointsource_distance
parameter and have better performance without loosing precision.
Finally, the preclassical phase of the classical calculator has been
moved to its own independent calculator. That means than one can
run a preclassical analysis only once and then run different classical
calculations starting from the same sources by using the --hc
option. This is also useful when debugging, in case the preclassical
phase is expensive, i.e. for large models.
Changes to the disaggregation calculator#
There was a big change in the CSV exporters, with the goal of reducing the proliferation of outputs. For instance a calculation with N=2 sites, M=10 intensity measure types, P=5 poes_disagg and R=100 realizations and 7 kinds of disaggregation, previously would generate N * M * P * R * 7 = 70,000 files. Now it only generates N * 7 = 14 files. The trick was to add a column for the IMT, a column for the PoE and a column for the realization index. People parsing the CSV outputs will have to update their scripts.
In the single site case, there is now a warning when a realization does not contribute to the disaggregation. This helps identifying pathological situations.
We fixed a bug in the Mag_Lon_Lat
exporter: the order of
the columns was wrong and the fields mag, lon, lat were actually
containing the values of lon, lat, mag.
We added a command oq info disagg
to print all 7 kinds of supported
disaggregation outputs.
We added a FAQ about how to compute mean disaggregation outputs, since many uses asked, see
https://github.com/gem/oq-engine/blob/engine-3.13/doc/faq-hazard.md#how-can-i-compute-mean-disaggregation-outputs
Finally we removed the long deprecated XML exporters.
Improvements to the event based calculator#
We significantly reduced the slow tasks affecting event based calculations, both for hazard and risk.
We added a check for missing GSIMs in the job_risk.ini file, to avoid confusing error messages in the middle of the computation.
We extended the mag-dependent filtering to event based calculations; before it was honored only in classical calculations.
The custom_site_id
is now exported also by the GMF exporters.
When using the --hc
option the engine was using the site collection
of the parent calculation and ignoring the site collection of the
child calculation: this is now fixed.
There is a new parameter gmf_max_gb
, with default value 0.1,
which is used to decide when to store the avg_gmf
dataset.
We improved the documentation on the rupture sampling mechanism:
https://docs.openquake.org/oq-engine/advanced/rupture-sampling.html
Logic trees#
We improved the support for source specific logic trees (i.e. logic
trees with an applyToSources
for each source) and documented it
here:
https://docs.openquake.org/oq-engine/advanced/sslt.html
The branch IDs in the gsim logic tree file are now ignored and the user can skip them altogether. The change was implemented to fix a subtle bug causing incorrect branch paths to be listed in the output “Realizations” in case of duplicated branch IDs.
We changed the string representation of logic tree paths and
added the commands oq show branches
and oq show rlz:<no>
to allow the user to switch easily from the branch path to the
corresponding source model, source parameters and GMPE. They
are documented here:
https://docs.openquake.org/oq-engine/advanced/logic-trees.html
Due to the changes above, the numbering of the realizations can be different between engine 3.13 and previous versions, and calculations using sampling can produce slightly different averages since different realizations may be chosen internally. This is akin to a change of random seed, i.e. it is not a physically significant change.
There is now a limit on the number of branches per branchset in the logic tree (94 branches).
We changed the experimental feature collapse_gsim_logic_tree
to use
the class AvgPoeGMPE
instead of AvgGMPE
: in this way it is possible to
compute exactly the average mean curves in the case of full enumeration,
even when the number of realizations is too large to use the traditional
approach.
We fixed the extendModel
feature that was not working in presence
of multiple files per source model logic tree branch.
Running a calculation with full enumeration and more than 15,000
realizations now raises an error unless the user raises the
parameters max_potential_paths
in the job.ini file. This is useful
to avoid out-of-memory issue at the end of the calculation, in
the postclassical phase.
hazardlib#
As usual, many new GMPEs were contributed:
Miguel Leonardo-Suárez contributed the GMPEs Arroyo et al. (2010) and Jaimes et al. (2020).
Chiara Mascandala contributed a new GMPE LanzanoEtAl2019_RJB_OMO and fixed a bug in the SgobbaEtAl2020 GMPE. She also contributed new variations of Lanzano & Luzi (2019), Skarlatoudis et al. (2013), and BCHydro GMPEs.
Giuseppina Tusa contributed the Tusa-Langer-Azzaro (2019) GMPE.
Thanks to EDF we added support for the EAS, FAS and DRVT intensity measure types, used in the new GMPEs Bora et al. (2019) and Bayless & Abrahamson (2018).
We added the GMPEs of Bahrampouri et al. (2021) for the Arias intensity measure type.
The
backarc
site parameter used to have only two possible values, 0 for “fore arc” and 1 for “back arc”; now the value 2 for “along arc” is accepted too. The change was made to support the GMPE Manea (2021), contributed by Elena Manea and Laurentiu Danciu.
A few bugs were also fixed:
We fixed a bug in
recompute_mmax
(returning square meters instead of square kilometers) affecting logic trees changing the slipRate.We fixed a compatibility bug with the SMTK causing the error
'RuptureContext' object has no attribute 'occurrence_rate'
when running the SMTK tests.We fixed an array<->scalar bug in the GMPE Abrahamson-Gulerce (2020)
Several essential bug fixes and improvements were made on the new MultiFaultSource and KiteSurface classes, needed for the GEM China model and others.
Riccardo Zaccarelli found a typo in the intensity measure type RSD that was fixed.
There were a few other changes and new features:
We changed
hazardlib.valid.gsim
to return a correctly instantiated GSIM or to fail. Before - for GMPETable subclasses - it was returning a partially initialized GSIM to be post-initialized later on. Thanks to Bruce Worden for pointing this out.We changed the API of
get_mean_stds
: there is no need to specify the standard deviation anymore, since it always returns all three standard deviations ($\sig, \tau, \phi$) on top of the mean ($\mu$). We improved its documentation in the advanced manual.We added a warning when the mesh size of MultiFaultSources and NonParametricSources is over one million points: the goal is to warn the user against building sources that are then too big to compute.
We added a new method
add_between_within_stds
to the modifiable GMPE to compute spatially correlated ground-motion fields even when the initial GMPE only provides the total standard deviation. To be used with care.We added a check to the
geo.Line class
to ensure that every Line object must have at least two points.We improved the error message in ShakeMap calculations failing due to the correlation matrix being not positive defined; now we point out to the manual: https://docs.openquake.org/oq-engine/advanced/shakemaps.html#correlation
Yen-Shin Chen contributed a new scaling relationship Thingbaijam et al. (2017) for strike-slip.
Manuela Villani contributed a new method
horiz_comp_to_geom_mean
to the ModifiableGMPE class to convert ground-motion between different representations of the horizontal component.
Finally there was some refactoring and 15 classes in the module
hazardlib.gsim.boore_2014
have been replaced with a single parametric GMPE,
so they cannot be called directly anymore, but rather via the
valid.gsim
factory function, which is the recommended way for all GMPEs.
Risk fixes and improvements#
We renamed the field “conversion” into “risk_id” in the header of the taxonomy mapping file. The old name is still valid.
We fixed a bug in presence of continuous fragility functions with
minIML
equal to noDamageLimit
, causing damages where there should
have been no damage.
We fixed a bug in event based risk calculations with non-trivial taxonomy mapping producing NaNs in the event loss table.
We added a “reaggregate_by” feature in event based risk calculations, documented here:
https://github.com/gem/oq-engine/blob/engine-3.13/doc/adv-manual/risk-profiles.rst
We improved the error checking when reading the taxonomy mapping file.
We added a check for invalid consequence files, to get an early error and not an error in the middle of the risk calculation. Also, specifying consequence files without fragility files now raises an early . error.
We added a check for missing investigation_time in classical risk calculations.
We fixed a serious performance bug when using ignore_master_seed=true
that caused a 60x slowdown in the event based risk calculation for China.
We implemented secondary perils in the risk side. This is still an experimental feature.
When using the --hc
option there is now a check making sure that the
intensity measure levels are consistent between child calculation
and parent calculation.
We fixed a bug when using aggregate_by = site_id
in presence of
a parent calculation.
The aggregation of losses in event based risk calculations has been optimized even more, and unified with the aggregation in event based calculations.
The damage outputs have been unified with the risk outputs and now we
have only two possibilites, both for risk and damage calculations: an
aggrisk
output and an aggcurves
output. Both are pandas-friendly
to help post-processing of the results. As a consequence, also the
exported CSV files are more similar between risk and damage outputs.
Consequence models in XML format have been deprecated: you should use solely the CSV format.
Other fixed and changes#
We renamed the parameter individual_curves
into individual_rlzs
since it now applies not only to hazard curves but to all kinds of outputs.
The old name is still valid as an alias.
The –log-file option in the command oq engine
was not honored.
We fixed running a calculation starting from a parent calculation owned by a different user.
Mixed list-scalar maximum distances (for instance maximum_distance={ ‘TRT_A’: 100, ‘TRT_B’: [4, 100], [8, 200]}) were previously invalid: this has been fixed.
The custom_site_id
site model parameter is now honored in all
calculators, not only in classical calculators. Moreover we
changed it from being a 32-bit integer to a 6-characters ASCII string
and we documented it:
https://docs.openquake.org/oq-engine/advanced/special-features.html
DataStore.read_df
now returns strings and not bytes for fields stored
as bytes.
readinput.get_composite_source_model(oqparam, branchID)
now accepts a branchID
parameter; this is useful for Hamlet.
datastore.read
now accepts a flag read_parent
; by default it is
True, but it can be set to False to avoid reading the parent of a calculation
(if any). This is useful in postprocessing scripts.
get_oqparam
now does not instantiate a SourceModelLogicTree object
and thus avoids parsing the entire source model every time. For the
Australia model that reduces the instantiation time from 47 seconds to
0.7 seconds. That makes it possible to assess the size of a source model
very quickly.
oq commands#
The logic behind the command oq check_input job_haz.ini job_risk.ini
has changed: now the risk files are checked first, so errors are
discovered immediately and not after a slow check of the hazard files.
The command oq prepare_site_model
has been extended to accept vs30 files
in .csv.gz or .hdf5 format. We also added an utility vs30tohdf5
to
convert the global vs30 .geotiff map provided by the USGS
(downloadable from https://earthquake.usgs.gov/data/vs30/)
into a compressed HF5 file suitable for use with oq prepare_site_model
.
The command oq prepare_site_model
has also been extended to work for exposures
containing assets distant from all site parameters: in that case a warning
is printed. Previously, the site model could not have been generated.
We added a command oq compare uhs
to compare uniform hazard spectra between
two or more calculations, similar to the old commands
oq compare hcurves
and oq compare hmaps
.
We fixed a small bug in oq zip job_haz.ini -r job_risk.ini
: now it works
even for empty oqdata directories.
We fixed oq plot uhs
in presence of non-SA intensity measure types.
We added a command oq plot source_data
.
We added a command oq info cfg
to show the location of the
engine configuration file.
We extended the command oq plot avg_gmf
so it can plot the differences
between two average GMFs.
IT and WebUI changes#
We upgraded a few libraries: numpy to version 1.20, scipy to version 1.7, h5py to version 3.1 and GDAL to version 3.3.3. This allowed us to support Python 3.9. Python 3.7 and 3.8 are also supported, while Python 3.6 is deprecated since it has reached its end of life and it is not supported by python.org. The engine will probably stop working with Python 3.6 in the next release. Python 3.10 is not officially supported yet.
We removed the dependency from python-prctl which was not needed anymore.
We removed all upper bounds on the dependencies to make it possible for people to install the engine with newer libraries. Thanks to that the engine now unofficially works on macOS with an M1 processor by using Python 3.9 and the latest libraries. However, that requires manual tweaking and it is not officially supported by GEM. We cannot support the M1 processor until GitHub provides support for automatic testing on this platform.
Some infrastructure work for dynamically deploying the engine on a cluster of kubernetes has been started, and the WebUI changed to support such an environment. In particular now invalid input files raise an error before spawning a new machine. Also, small calculations are recognized and executed on the master machine, without spawning anything.
The WebUI has been changed to store the input files in the
temporary directory determined by the custom_tmp
parameter in the
file openquake.cfg
. Moreover we added a parameter mosaic_dir
in openquake.cfg
that allows the engine to read (big) files from
a predefined global directory, so that we can avoid uploading huge files
to the WebUI.
We also extended the WebUI to run sensitivity analysis calculations.
An annoying warning about not having the latest engine release has been removed.
We extended the systemd services to work on multiple linux distributions and not only on Ubuntu.
We extended and improved the installation script install.py
in
various ways. For instance now it is mandatory to pass the kind of
installation to perform. Optionally, it is also possible to pass a
port for the DBServer.
When using install.py --version <branch>
the latest commit of a branch is downloaded and stored so that
oq engine --version
prints the git hash.
We removed the multi_user
flag from the file openquake.cfg
:
now an installation is automatically considered to be of kind multi
user if the engine was installed with root permissions. Multi users
installations are meant for Linux servers only.
We removed Celery from requirements files, since it has been deprecated and not used for years.
We added experimental support for ipyparallel and for Ray.