Release notes v3.6
==================

This release introduces several major new features (including a completely
revised disaggregation calculator, automatic optimization of duplicated sources,
a fast exposure importer and taxonomy mapping) and lots of improvements,
new checks and bug fixes. Nearly 200 pull requests were merged.
For the complete list of changes, see the changelog:
https://github.com/gem/oq-engine/blob/engine-3.6/debian/changelog

Disaggregation
--------------

The most relevant development on the hazard side was the work on the
disaggregation calculator, whose business logic changed substantially.
While in previous versions of the engine we disaggregated
for all possible realizations in order to compute disaggregation
statistics, in this version we gave up on statistics and instead
disaggregate only for a specific realization. The realization can be
specified by the user with the `rlz_index` parameter in the job.ini
file, or it can be determined automatically by the engine as the
realization closest to the mean curve for the given disaggregation
site.
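
For instance, a disaggregation job.ini may pin the calculation to a specific
realization with a line like the following (the index shown is purely
illustrative, and the other disaggregation parameters are omitted):
```
  # disaggregate only for the realization with index 0 (illustrative value);
  # if rlz_index is omitted, the engine picks the realization closest to
  # the mean hazard curve for the disaggregation site
  rlz_index = 0
```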

Moreover, the disaggregation calculation now works like a post-calculator
(i.e. with the ``--hc`` option) and is able to reuse information
computed in its parent calculation: the net effect is that it is
always faster than the corresponding classical calculation, while in
the past it was several times slower. We also fixed a couple of
performance bugs: there was a slow ``truncnorm.cdf`` operation
in an inner loop, and ruptures outside the integration distance
were not being discarded.
Finally, we changed the file names of the disaggregation outputs.

For models with thousands of realizations, the disaggregation can easily be
thousands of times faster than before.

Classical PSHA calculator
-------------------------

The engine is now smart enough to recognize duplicated sources
appearing in different branches of the composite source model and to
avoid redundant computations. Because this optimization is always on,
the flag `optimize_same_id_sources`, now redundant, has been removed.
There are several models in the
hazard mosaic with duplicated sources and the new optimization has a
significant impact on those. Moreover, the demo `LogicTreeCase2ClassicalPSHA`
has become an order of magnitude faster than before thanks to the
deduplication of sources.

There was a big improvement in the computation of the statistical hazard
curves, which is now not only faster, but also uses a lot less memory than
before. The trick was to consider one site at a time, instead of
a block of sites. As a consequence it is now possible to consider
tens of thousands of realizations for hundreds of thousands of sites
without requiring terabytes of RAM. Moreover, the data transfer has been
reduced by storing some auxiliary information in the datastore and reading
it from the workers instead of transferring it via celery/rabbitmq.

There was a substantial change in the way tasks are distributed in a
classical calculation. The engine has acquired the ability to estimate the
runtime of each task and, if the estimated time exceeds a `task_duration`
parameter, to split the task into subtasks that run
in less than `task_duration` seconds. The user can set the `task_duration`
manually in the `job.ini`, or leave it unset, in which case the
engine will figure out a reasonable value for it.
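
As a minimal illustration (the value below is arbitrary), the parameter can
be set explicitly in the job.ini:
```
  # split tasks whose estimated runtime exceeds 600 seconds (arbitrary value);
  # if the line is omitted, the engine chooses a reasonable duration itself
  task_duration = 600
```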

The approach is not perfect, since there are non-splittable sources: there
is a minimum size for a given subtask, and sometimes subtasks taking
much longer than the `task_duration` parameter may still appear. However,
the new approach is a drastic improvement over all previous versions of
the engine.

We added a check on sources with a suspiciously large spatial extent
(more than 5,000 km) so that a warning is printed. Usually this means
that there was a bug in the generation of the source model.

We added a check on sources with suspicious hypo-depths and nodal plane
distributions (i.e. distributions with duplicated values) since they
make the calculation slower.

In extra-large models saving some debugging information (e.g. the number of sites
affected by each source) was exceedingly slow, so now that information is
stored only if there are fewer than 100,000 relevant sources.

Logic trees
-----------

There was a tricky bug with the `minimum_distance` feature
in the presence of multiple GSIMs in a logic tree branchset. Now
each GSIM keeps its own minimum distance; before, they were all
getting the same minimum distance, causing wrong results to be computed.
Fortunately the `minimum_distance` feature is rarely used (and only for
internal purposes) so the bug is minor. The feature is documented here:
https://github.com/gem/oq-engine/blob/engine-3.6/doc/adv-manual/special-features.rst#gmpe-logic-trees-with-minimum_distance

We implemented zero weights for intensity measure types that should be
discarded in the GSIM logic tree. You can see the relevant documentation here:
https://github.com/gem/oq-engine/blob/engine-3.6/doc/adv-manual/special-features.rst#gmpe-logic-trees-with-weighted-imts

We implemented risk logic trees, a.k.a. the *taxonomy mapping* feature.
The idea is that users can map the taxonomy strings in their exposure to one or
more vulnerability/fragility functions, with corresponding weights for each
function assignment, to take into account the epistemic uncertainty in the 
exposure ⟷ vulnerability domain. The feature is
documented here:
https://github.com/gem/oq-engine/blob/engine-3.6/doc/adv-manual/risk-features.rst
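
As a hypothetical sketch, a risk job.ini could reference the mapping file
through a dedicated parameter; the parameter name `taxonomy_mapping_csv` used
below is an assumption, so check the linked documentation for the exact name
and the CSV layout:
```
  # hypothetical example: point the risk calculation to a taxonomy
  # mapping CSV file (parameter name assumed, see the documentation)
  taxonomy_mapping_csv = taxonomy_mapping.csv
```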

A big conceptual change (but with no impact on the user) was
the simplification of the source model logic tree XML file. Before, it
was necessary to specify a `logicTreeBranchingLevel` node that was
not used internally; now that node is optional. Old
files will keep working, as long as the `logicTreeBranchingLevel`
contains only a single subnode. The case of multiple subnodes is now
correctly flagged as an error. Thanks to this change,
source model logic trees, GSIM logic trees and risk logic trees
are now stored in the same way internally.

Lastly, we fixed a bug in source model logic trees with the options
`applyToSources` and `applyToBranches` on: in some cases a spurious error
was raised, claiming that a source was not in the source model even when
it actually was.

Event based hazard
------------------

We introduced a parameter `max_sites_per_gmf` in the job.ini
(only for `event_based` calculations that are trying to store
ground motion fields), with a default of 65,536 sites.
A user trying to run an `event_based` calculation with
`ground_motion_fields = true` on more sites than permitted
by `max_sites_per_gmf` will now get an early
validation error instead of running out of memory after
several hours of computation. The `max_sites_per_gmf` limit can
be raised beyond the default of 65,536 sites, at the user's own
risk.
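
Users who really need to store ground motion fields for a larger site
collection can raise the limit explicitly in the job.ini, for instance:
```
  ground_motion_fields = true
  # raise the default limit of 65,536 sites at your own risk:
  # the calculation may run out of memory (the value below is an example)
  max_sites_per_gmf = 100000
```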

We also added a limit of ``2**32`` events in event based calculations: this is
a hard limit that cannot be raised. If your calculation produces more than
4 billion events, it will need to be split into smaller calculations.
Such calculations, involving billions of ruptures, would likely never work
anyway, because they would eventually run out of memory.

We added a check for missing ``intensity_measure_types``: this avoids
cryptic errors in the middle of the computation of the ground motion fields.

We optimized the rupture sampling procedure for point sources (which
includes multipoint sources and area sources). The improvement can be
quite significant: for instance, the generation of ruptures for a large
multipoint source for Colombia became 30x faster while using 12x less memory.

We changed the way ruptures are stored internally: the
`code` field in the `ruptures` dataset is now a unique checksum
depending on the kind of rupture. Before, it was an incremental
number depending on the order of the Python module imports, which
made debugging difficult.

The rupture CSV exporter has been enhanced, and now it exports the rupture
surface boundaries as 3D multipolygons instead of 2D multipolygons.

We fixed a small bug in the rupture XML exporter, which was failing
if the user did not specify the hazard sites.

We added the ability to generate hazard curves without storing the GMFs,
simply by setting the flags
```
  hazard_curves_from_gmfs = true
  ground_motion_fields = false
```
This is useful when one is interested in the hazard curves generated
by an `event_based` calculation but not in the ground motion fields
themselves. Not storing the GMFs reduces the data transfer and the
memory occupation.

In engine 3.5 we changed the `gmf_data` CSV exporter to export a file
``sitemodel.csv`` instead of the file ``sitemesh.csv``. That change has
been reverted because it was generating confusion. The right way
to export the site model information for the most recently completed
calculation - which works for all calculators,
not only for event based - is to use the command
``oq show sitecol > sitecol.csv``.

Importing GMFs from CSV has been enhanced and no longer requires
the field `rlzi`: previously, this was a required field,
but it was assumed to always contain the value `0`. Conversely,
the GMF exporter to CSV no longer exports the field `rlzi`, because
it is redundant: the association between events and realizations can
be found in the events table and it is exported in the
file `sigma_epsilon.csv`.

In the `sigma_epsilon.csv` file, we renamed the field `eid` to
`event_id` in order to avoid confusion with the naming used in the
`gmf_data.csv` file (`event_id` is the 64 bit event ID in the `events`
table in the datastore, `eid` is the 32 bit index to the event ID
record).

Event based risk
----------------

There was a huge refactoring of all risk calculators. As a consequence
the `event_based_risk` calculator has become simpler and faster than before
(twice as fast in some cases).

In the `ebrisk` calculator it is now possible to aggregate by `asset_id`
and therefore to produce individual loss curves and maps for each asset.
Needless to say, this is only viable for exposures of manageable size.

There was some work to make the ``losses_by_event`` exporter for the
``ebrisk`` calculator more similar to the ones for ``event_based_risk``
and for ``scenario_risk``.

We fixed a bug in the `agg_curves-rlzs` and `agg_curves-stats` outputs
in ``ebrisk``: they were missing the ``units`` compared to the same outputs
coming from the ``event_based_risk`` calculator. This was breaking the QGIS
plugin.

We changed the ``agglosses`` exporter in
``scenario_risk`` calculations, by adding a column with the realization index.

The `agg_curves` exporter for event based risk was broken if the exposure
was imported in the parent calculation and not in the child calculation;
this has been fixed.

We fixed a bug in the exporter of the aggregate loss curves coming
from an ``ebrisk`` calculation: now the loss ratios are computed
correctly even in the presence of occupants. Before, the exporter was
writing incorrect loss ratios to the output file.

Hazardlib
---------

Graeme Weatherill (@g-weatherill) contributed a finite rupture option to the
Germany-adjusted Cauzzi and Derras GMPEs. Moreover, he contributed
the Tromans et al. (2019) adjustable GMPE, used for a nuclear
power plant in the UK.

Chris van Houtte (@cvanhoutte) contributed the Van Houtte et al. (2018)
significant duration model for New Zealand.

Robin Gee (@rcgee) fixed a bug in the Sharma (2009) GMPE: there was
a key error if the intensity measure types specified in the job.ini included
periods that required interpolation.

Marco Pagani (@mmpagani) discovered a bug in `calc_hazard_curves`
which was failing with a cryptic 
`AttributeError: 'NoneType' object has no attribute 'within_bbox'`
when used in parallel mode. It has been fixed.


Risk
----

The CSV importer for the exposure has been optimized. Before, for
legacy reasons, the importer was converting the CSV records into node
objects similar to the ones coming from the XML importer and then it
was reusing the XML logic. Now we are doing the opposite: the XML
importer produces records and reuses the logic of the CSV
importer. Thanks to this change, for large CSV exposures the new
importer is 4-5 times faster and uses over 10 times less memory than
before.

The engine has long had the ability to reduce the hazard
site collection (which can be large, think of a fine grid) to only the
locations where there are assets. This feature has been optimized in
this release, to a spectacular extent in some cases: we measured a
speedup from 2 hours to 0.1 seconds for Canada.

We changed how zipped exposures are managed by the engine. In version
3.5 a zipped exposure was expected to contain an XML file with the
same name as the archive, apart from the extension. Because of that,
the `job.ini` file had to contain a line ``exposure_file =
<exposure_path>.xml``, while now it requires a line ``exposure_file =
<exposure_path>.zip``, which is clearer. The change was requested by
the risk team in the context of the CRAVE project because it
simplifies the unzipping of the exposure. Unzipping will overwrite
files of the same name already present, but a warning will be printed
and the original files will not be lost: they will be renamed with a ``.bak``
extension.
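
In practice, a job.ini referring to a zipped exposure now looks like this
(the file name is just an example):
```
  # point directly to the zip archive containing the exposure
  exposure_file = exposure.zip
```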

We added a consistency check between statistics for calculations
leveraging the ``--hc`` option, because some users were making mistakes
like trying to compute means in the child calculation without having them
in the parent calculation. Now one gets a clear error message.

We fixed a bug in ``classical_damage`` from CSV with discrete
fragility functions: for hazard intensity measure levels different
from the fragility levels, the engine was giving incorrect results.

Vulnerability functions with the beta distribution must satisfy some
consistency requirements if the coefficients of variation are nonzero.
Unfortunately these consistency checks were missing, so it was possible
to accept invalid functions, leading to an error in the middle of the
computation. Now the error is raised much earlier, at the time
the vulnerability functions are instantiated. See #4841 for more
details.

Hyeuk Ryu (@dynaryu) discovered a bug in the `agg_loss-curves` outputs 
for the `event_based_risk` calculators, which has been fixed.

Finally there were some improvements to the ``multi_risk`` calculator in the
context of the CRAVE project. In particular now the engine is able
to manage the geometries of volcanic perils like lava, lahar and pyroclastic
flow and it is also able to manage other binary perils without requiring
the introduction of new intensity measure types.

General changes
---------------

The CSV exporters have been enhanced: now there is an additional line
before the header, starting with a ``#`` character, containing some
metadata, like the date when the file was generated, the version of
the engine that generated it, and some relevant parameters, like the
investigation time in the case of the hazard outputs. In the future
we may add even more metadata and extend the approach to other outputs.

Before release 3.3, the engine had the ability to associate site
model parameters to hazard sites on a grid. This feature was sometimes
buggy and was removed, with users recommended to use the command `oq
prepare_site_model` instead. `oq prepare_site_model` is able to
produce a `site_model.csv` file with sites on the grid and it performs the
associations explicitly, once and for all.

In this release, we restored the ability to perform the association
directly in the engine. This is less efficient than using `oq
prepare_site_model`, since the same associations will be recomputed during
each run, but it is still useful for people wanting to experiment with the
grid spacing: they can run several calculations and, when they are
happy with the result, run `oq prepare_site_model` to fix the
site model once and for all with the preferred grid spacing.

We fixed a performance regression in the `ucerf_classical` calculator,
due to a change of logic in engine 3.5, which was trying to filter
thousands of sources in the controller node instead of in the
workers, thus becoming extremely slow.

We decided to change the `realizations.csv` output for scenario calculations,
by replacing the `branch_path` field with the GSIM representation. This is
more informative for the users and more convenient for the QGIS plugin too.

IT
--

The job queue first introduced in engine-3.5 is now enabled by default.
This means that only one job can run at a given time for a given engine
instance.

The progress report has been improved: previously, in large classical
calculations progress started to be printed too late, sometimes even days
after the start of the calculation.

We improved the `oq abort` command to remove submitted jobs too.

Deleting a calculation in the engine has always been tricky in the case
of multiple users. In this release we fixed several issues and now a
user can delete all of her calculations with the command `oq reset`.
The engine will look inside the database and correctly remove the
calculations of the user, including all the relevant .hdf5 files.

We improved the `oq plot` command by adding several new kinds of plot.
They are still for internal use only (i.e. introspection and debugging).

We extended the command `oq db` to run generic queries for the
`openquake` user. Other users can only run ``SELECT`` queries.

There was a bug in `oq webui start` not supplying the `--noreload`
argument that has been fixed (the reload functionality of the Django
development server interferes with SIGCHLD and causes zombies).

We fixed another bug with the `--hc` functionality in a multi-user
situation, due to the fact that the engine was searching for the datastore
of the parent calculation in the wrong directory.

There is now a better error message if the shared directory is not mounted.

Source models can now be serialized in TOML format, which is useful for
debugging purposes.

Libraries and Python 3.7
------------------------

In this release we updated some of our libraries (numpy from version
1.14 to 1.16 and scipy from version 1.0.1 to 1.3.0) to make it
possible to use the engine with Python 3.7. We actually have a cluster
using Python 3.7 in production.

In the future we may distribute installers for Windows and macOS based
on Python 3.7, but for the moment Python 3.6 is still the only
officially supported version and we do not plan to abandon Python 3.6 any
time soon.

We raised the minimum version for h5py from 2.8.0 to 2.9.0, fixed
some compatibility issues with Django 2.1 and 2.2 and fixed several
Python 3.7 deprecation warnings. Finally, we removed the external
dependency on the mock module, since it has been included in the standard
library since Python 3.3.

Deprecations/removals
---------------------

For years the engine has been able to import ground motion fields and
hazard curves in CSV format and NRML format, with the NRML format
deprecated. Now finally the NRML importers have been removed.

There was an old deprecated GMF exporter in NRML format for scenario
calculations. It has finally been removed. You should use the CSV
exporter that has been available for years instead.

We deprecated the XML disaggregation exporters in favor
of the CSV exporters.

We removed the long-deprecated `agg_loss_table` exporter, since
all the needed information is now available via the `losses_by_event` exporter.

We switched officially the testing framework from nosetests to pytest.