Release notes v3.17
===================

Version 3.17 is the culmination of 5 months of work involving around 470 pull requests. It is aimed at users wanting the latest features and maximum performance. Users valuing stability may want to stay with the LTS release instead (currently at version 3.16.4).

The complete set of changes is listed in the changelog:

https://github.com/gem/oq-engine/blob/engine-3.17/debian/changelog

A summary is given below.

## Disaggregation calculator

The disaggregation calculator has been deeply revised, by changing the underlying algorithms and by making it possible to compute the average disaggregation directly. For the first time it is possible to compute the mean disaggregation for large models like Europe (with 10,000 realizations) without running out of memory and in a surprisingly small amount of time (minutes on a cluster).

The game changer was computing the mean disaggregation in terms of rates and not probabilities, since rates are additive and therefore the mean calculation can be parallelized cleanly without the need to keep all realizations in memory (see the sketch at the end of this section). This is why the only statistic now supported in disaggregation is the mean and you cannot compute the quantiles. Computing the standard deviation is possible with a memory-efficient algorithm, so it could be added in the future.

The changes have touched all kinds of disaggregation, including epsilon_star disaggregation, disaggregation over multiple sites and disaggregation by source. The disaggregation outputs are stored in a different way and the exporters have been updated and optimized. The CSV outputs have an additional column `iml`. We renamed the "by TRT" disaggregation outputs and added a new output `TRT_Mag`.

Models with mutually exclusive sources (e.g. Japan) can now be disaggregated, as well as models using the NegativeBinomialTOM temporal occurrence model (e.g. the New Zealand model). The disaggregation using the `poes_disagg` parameter has been unified with the disaggregation based on the `iml_disagg` parameter. We fixed a few bugs (for instance the disaggregation sometimes failed when using the NGAEast GMPEs) and changed the binning algorithm for LonLat bins.

Since the algorithms have changed you will get (slightly) different results. Also, the default behavior is now to compute the mean disaggregation, while before it was to compute the disaggregation only for the realization closest to the mean hazard curve.

Moreover, we substantially changed the `disagg_by_src` feature. The pandas DataFrame stored under the name `disagg_by_src` has been replaced by an `ArrayWrapper` called `mean_rates_by_src` and the documentation has been changed accordingly. The new structure contains less information than before (only the means) and in a different form (rates instead of PoEs); however, it is enough for the purpose of finding the most relevant sources and it can actually be stored for all hazard models in the mosaic, since it requires far less storage (about 10,000 times less for the EUR model).

The AELO project required the ability to store disaggregation results by source. This is why we added a new output `mean_disagg_by_src` with its own CSV exporter. The AELO project also required a deep refactoring of the logic tree processor, to implement the ability to reduce a full logic tree to a specific source. That was hard and time-consuming and may require further work in the future.

Finally, we changed the task distribution to reduce data transfer in disaggregation and ensure optimal performance in all cases.
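To illustrate why working in terms of rates simplifies the mean computation, here is a minimal sketch in NumPy (the numbers and variable names are made up for illustration and this is not the engine's implementation): rates are additive, so weighted partial sums can be accumulated realization by realization and converted back to probabilities only at the end.

```python
import numpy as np

investigation_time = 1.0  # years (assumption for the example)

def to_rates(poes):
    # PoE = 1 - exp(-rate * time)  =>  rate = -ln(1 - PoE) / time
    return -np.log(1.0 - poes) / investigation_time

def to_poes(rates):
    return 1.0 - np.exp(-rates * investigation_time)

# hypothetical disaggregation PoEs for 3 realizations and 2 bins
poes_by_rlz = np.array([[0.10, 0.02],
                        [0.12, 0.03],
                        [0.08, 0.01]])
weights = np.array([0.2, 0.5, 0.3])  # realization weights

# rates are additive: the weighted mean can be accumulated one
# realization at a time, without keeping all realizations in memory
mean_rates = sum(w * to_rates(p) for w, p in zip(weights, poes_by_rlz))
mean_poes = to_poes(mean_rates)
print(mean_poes)
```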
## Classical PSHA

The major new feature in Classical PSHA is the ability to define custom post-processors, as documented in

https://docs.openquake.org/oq-engine/advanced/3.17/classical_PSHA.html#the-post-processing-framework-and-vector-valued-psha-calculations

We took advantage of this feature to reimplement the conditional spectrum calculator as a post-processor. Vector PSHA is also implemented as a post-processor, and so are the AELO disaggregations by source. Users can easily implement custom post-processors by following the examples in

https://github.com/gem/oq-engine/tree/engine-3.17/openquake/calculators/postproc

Apart from post-processors, we kept working on the calculator to make sure that even the largest models in the GEM mosaic can run with a limited amount of RAM. We also changed the internal `_poes` storage to reduce the disk space occupation.

It is now possible to set both the `sites` parameter and the `site_model_file` parameter at the same time; previously this was an error, while now the closest vs30 parameters are used for the specified sites and the depth site parameters (z1pt0 and z2pt5) are recomputed. The change was required in the context of the AELO project.

Finally, we needed a new feature to support the 2018 USA model. In that model some multipoint sources use the equivalent distance approximation, while others do not. To manage such situations we added a new parameter in the job.ini, `reqv_ignore_sources`, which is a list of IDs for the sources that should not use the equivalent distance approximation.

## Event Based Hazard

The mechanism to generate ruptures in event based calculations has been changed and now different ruptures will be sampled with respect to the past. There is not only a change in the random seed generation algorithm but also a substantial change of approach: starting from engine 3.17 ruptures are *first sampled and then filtered*, which is the *opposite* order with respect to previous versions. The previous order was more efficient but it made it impossible to define a meaningful rupture ID. The problem is documented in detail in

https://docs.openquake.org/oq-engine/advanced/3.17/event_based.html#rupture-sampling-how-to-get-it-wrong

Here we will just note that previous versions of the engine had a sequential rupture ID, exported in the file `ruptures.csv`, but it was not usable, because slightly changing the minimum magnitude or the maximum distance would make the IDs refer to totally different ruptures. Moreover, there was no easy way to export a rupture given the rupture ID.

All this has changed now, and the rupture ID no longer depends on the details of the filtering; as a consequence it is finally possible to disaggregate the GMFs (and the risk) by rupture, a much desired feature. The command to extract a single rupture is as simple as `oq extract ruptures?rup_id=XXX` and there is an associated HTTP API.

The new rupture ID is a 60-bit integer, with the first 30 bits referring to the index of the source generating the rupture and the second 30 bits referring to the index of the rupture inside the source. Sources containing more than 2^30 = 1,073,741,824 ruptures are now rejected. Notice that this is not a limitation, since you can always split a large source. Also, in the entire GEM mosaic there is not a single source with more than 2^30 ruptures, so we did not break any model.
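As an illustration of the ID layout described above, here is a minimal sketch (with hypothetical helper names; the engine's own code may differ in its details):

```python
# 60-bit rupture ID: the upper 30 bits index the source, the lower 30 bits
# index the rupture inside the source (illustration only).
MAX_RUPS_PER_SOURCE = 2 ** 30  # sources with more ruptures are rejected

def make_rup_id(src_idx: int, rup_idx: int) -> int:
    assert 0 <= rup_idx < MAX_RUPS_PER_SOURCE
    return src_idx * MAX_RUPS_PER_SOURCE + rup_idx

def split_rup_id(rup_id: int) -> tuple:
    return divmod(rup_id, MAX_RUPS_PER_SOURCE)

rid = make_rup_id(48, 501)
assert split_rup_id(rid) == (48, 501)
```

Since the ID does not depend on the filtering, the same rupture keeps the same ID even if the minimum magnitude or the maximum distance change.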
We fixed the sampling of the ruptures both in the case of mutex ruptures (relevant for the USA model around the New Madrid cluster) and in the case of mutex sources (relevant for the Japan model). The sampling for mutex sources was extended to include the `grp_probability` parameter, thus making it possible for the first time to sample the Japan model correctly.

There was some work to improve the performance of multiFault sampling by splitting the sources and thus parallelizing the sampling: the improvement was spectacular in the USA model (over 100x on a machine with 120 cores). Moreover, we strongly optimized the generation of events from the ruptures, which can now be dozens of times faster than before.

We improved the view `extreme_gmvs` to display the largest ground motion values, and we added a new parameter `extreme_gmv` (a dictionary IMT->threshold) to discard the largest GMVs on demand. This is a feature to use with care and it is not recommended: in the presence of extreme GMVs one should fix the GMPE instead.

Finally, we added an output "Annual Frequency of Events", which can be exported as a file `event_based_mfd.csv` with fields (mag, freq) and is meant to be compared with the measured magnitude-frequency distribution.

## Additions to hazardlib

The most important event in hazardlib was the porting of the GMPEs required to run the Canada SHM6 model. The porting from the GMPEs used in engine 3.11 was long and difficult, since there are many complex GMPEs. In the process we discovered and fixed several subtle bugs. The support for Canada SHM6 is still considered experimental and you should report any suspicious discrepancy with respect to a calculation performed with engine 3.11.

Moreover, there was work to support the release of the GEM 2023 Global Hazard Mosaic. All the models were computed by converting the GMPEs with nontrivial horizontal components to use the geometric average. That involved specifying `horiz_comp_to_geom_mean = true` in the job.ini file and required fixing a few GMPEs by adding the missing `DEFINED_FOR_INTENSITY_MEASURE_COMPONENT` class attribute.

Apart from that, we added a parameter `infer_occur_rates` to speed up rupture sampling in the presence of multiFault sources. If you know that the `probs_occur` in the multiFault sources are actually Poissonian (which is the common case) then you can set `infer_occur_rates=true` and get an order of magnitude speedup in the sampling procedure (a sketch of the idea is given at the end of this section). This feature should be considered experimental at the moment and is disabled by default.

Marco Pagani added the ability to pass the `kappa0` parameter to the Lanzano 2019 GMPE.

As usual, some features were contributed by our users. Chris DiCaprio from New Zealand added to the sourcewriter the ability to add a prefix to MultiFaultSource IDs. Julián Santiago Montejo contributed the GMPE Arteta et al. (2023) and a bug fix for the GMPE Arteta et al. (2021). Graeme Weatherill contributed a few bug fixes to various GMPEs, already backported to engine 3.16 (https://github.com/gem/oq-engine/pull/8778).
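Here is a minimal sketch of the idea behind `infer_occur_rates`, under the assumption that the discrete `probs_occur` distribution really is Poissonian (the numbers and names below are made up and this is not the engine's implementation):

```python
import numpy as np

# For a Poisson process P(0 occurrences) = exp(-rate * time), so if the
# probs_occur of a rupture are Poissonian its occurrence rate can be
# recovered directly instead of sampling the discrete distribution.
probs_occur = np.array([0.90, 0.095, 0.005])  # hypothetical P(0), P(1), P(2)
investigation_time = 1.0  # years (assumption)
rate = -np.log(probs_occur[0]) / investigation_time
print(rate)  # ~0.105 occurrences per year
```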
## WebUI

In the context of the AELO project, we extended the WebUI to make it possible to run single-site classical calculations including disaggregation by `Mag_Dist_Eps` and disaggregation by relevant sources for each site in the world. The calculator automatically determines the model to use from the coordinates of the site and recomputes the site parameters starting from the user-provided vs30. The performance has been tuned so much that a machine with only 16 GB of RAM and 4 cores is enough to run such calculations.

The WebUI was extended to log information on login/logout so that tools like [fail2ban](https://github.com/fail2ban/fail2ban) can be used to detect and stop denial-of-service attacks. We also worked on the password-reset facility.

We also fixed the method `/v1/ini_defaults`, which returns a JSON with the defaults used in the job.ini file. The issue was that it was returning NaNs for some site parameters, i.e. not a valid JSON. The solution was to not return anything for such parameters.

We introduced a READ_ONLY mode for the WebUI, for situations where it is advisable to remove the ability to post calculations (for instance when the machine where the WebUI runs does not have enough resources to spawn calculations).

## Risk

We have a few new features in the risk calculators too. The first one is the introduction of three new loss types and vulnerability functions: `area`, `number` and `residents`. It is enough to specify the corresponding vulnerability functions in the job.ini file and the outputs relative to the new loss types will appear as new columns in the loss output CSV files. The `area` and `number` loss types are associated with the corresponding fields in the exposure, while the `residents` loss type is associated with the field `avg_occupants` in the exposure; if missing, it is automatically computed as the mean of the night, day and transit occupants.

The second important new feature is disaggregation by rupture. The command to give is simply `$ oq show risk_by_rup:structural`, which will return a DataFrame keyed by the rupture ID, with fields including the total loss generated by the rupture and the parameters of the rupture (magnitude, number of occurrences and hypocenter).

```
                        loss  mag  n_occ         lon        lat        dep
rup_id
52076478464016  3.091906e+12  6.60      9  128.401993  27.496000  68.199997
52037823758352  2.223281e+12  6.60      9  128.501999  27.596001  70.000000
```

We changed the (re)insurance calculator to read the deductible field from the exposure: this gives a speedup of multiple orders of magnitude in cases where you have thousands of assets all with the same limit and different deductibles. This was the common case for our sponsor SURA, so we switched from a slow and memory-consuming calculation aggregating 6 million different policies to an ultra-fast calculation aggregating only 6 policies.

At user request, we extended the `total_losses` feature to include `business_interruption` and we improved the error handling for reinsurance calculations.

There was also a major performance improvement when reading the exposure. The issue was that we could not read the USA exposure (22 million assets) without running out of memory; now we can, and the exposure processor is three times faster. We also improved the event based risk calculator to use very little memory on the workers even in the presence of a huge exposure.

There was a huge amount of work on the conditioned GMF calculator, for which we included a [comprehensive verification test suite](https://github.com/gem/oq-engine/pull/8542) based on the tests for the [ShakeMap code by USGS](https://usgs.github.io/shakemap/manual4_0/tg_verification.html). We also changed the calculator to compute the GMFs only on the hazard sites and not on the seismic stations.

At user request, we changed the behavior of `scenario_risk` calculations to just print a warning in case of hazard so small that it causes no losses, instead of raising an error. This is consistent with the behavior of `scenario_damage` calculations.

Finally, we added an environment variable OQ_SAMPLE_ASSETS which is useful when debugging calculations; for instance you can set OQ_SAMPLE_ASSETS=.001 to reduce the size of the calculation by 1,000 times, making it much faster and easier to debug.

## Bug fixes and new checks

We reimplemented the `minimum_magnitude` filter in terms of the mag-dependent-distance filter, which can also be used to implement a `maximum_magnitude` filter (see the sketch below). The change eliminated a bunch of bugs and corner cases that kept creeping in.
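As a rough illustration of the point above (a sketch only, with made-up names, not the engine's filtering code), a minimum-magnitude filter can be expressed as a magnitude-dependent maximum distance that is zero below the threshold, so small ruptures are never within the integration distance of any site; a maximum-magnitude filter works the same way at the other end of the magnitude scale.

```python
# Sketch: a magnitude-dependent maximum distance (in km) that filters out
# ruptures below min_mag by assigning them a zero integration distance.
def make_max_distance(min_mag, dist_km):
    def max_distance(mag):
        return 0.0 if mag < min_mag else dist_km
    return max_distance

max_distance = make_max_distance(min_mag=5.0, dist_km=300.0)
assert max_distance(4.5) == 0.0    # always filtered out
assert max_distance(6.0) == 300.0  # kept if within 300 km of a site
```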
The engine was rounding longitude and latitude to 5 digits everywhere, except when generating a grid from a region; this has been fixed, and we have full consistency now.

The `truncation_level` parameter is now mandatory in classical and event based calculations. Before, if the user forgot to set it, a default value of 99 was used, most likely not what the user wanted.

If the same parameter is present in different sections of the job.ini file, the engine now raises an error.

If the user by mistake disables the statistics and does not specify `individual_rlzs=true`, they now get an early error; before, the engine was computing the individual realizations (which was an error).

If the exposure contains empty asset IDs, the engine now raises a clear error.

If the site model file does not contain a required site parameter, and the parameter is not set in the job.ini either, a clear error is now raised instead of incorrectly using a NaN value.

The New Zealand model uses the `CScalingMSR` magnitude-scaling relationship. That caused a surprising error when using the `ps_grid_spacing` approximation, due to the way the names of the CollapsedPointSources were generated. This is now fixed, since the name no longer contains the name of the MSR.

It is now possible to use a percent character in the job.ini file without interfering with the ConfigParser interpolation feature.

The GMF importer was failing to import GMFs in CSV format with a `custom_site_id` field: this has been fixed.

In case of errors, the engine was printing the same traceback two or even three times: this has been fixed.

## oq commands

We finally removed the command `oq celery`, since celery has been deprecated for over 5 years. We also removed the obsolete references to celery and rabbitmq in the engine documentation.

If a calculation ID is already taken in the database, `oq importcalc` now raises a clear error before importing the HDF5 file.

We enhanced the command `oq sample`, which is now able to sample MultiPointSources too.

We extended the command `oq shakemap2gmfs`: on demand it can amplify the generated GMFs depending on the ShakeMap vs30 or on the site model vs30. See the `--help` message for more.

We refined the plotting command `oq plot event_based_mfd?` to plot the magnitude-frequency distribution of the events in an event based calculation.

We updated the command `oq plot uhs_cluster?k=XXX` to cluster together uniform hazard spectra with similar or identical hazard.

We added a new command `oq compare med_gmv` which is able to compare the median GMVs between two calculations: this is useful when debugging different versions of the engine.

We added a new command `oq reduce_smlt` which is able to reduce a source model logic tree file and the associated source model files to a single source; this is useful when debugging AELO calculations.

We introduced a new family of commands `oq mosaic` with the following two subcommands:
1. `oq mosaic run_site` to run a classical calculation on the given longitude,latitude site;
2. `oq mosaic sample_gmfs` to sample the ruptures of a given mosaic model.

## Other

We extended the `sensitivity_analysis` feature to work also on file parameters: for instance, you can provide multiple different logic tree files to study the sensitivity to the logic tree.

There was a lot of work on documentation, both on the manuals (which now fully document `aggregate_by`, `ideductible` and `maximum_distance`) and on the FAQs (how to configure multiple engine installations). We also improved the installation instructions.

The full report output has been revised and now a compact representation of the involved GMPEs is used when needed, to avoid unreadable extra-long lines.