gwgen.main module

Classes

GWGENOrganizer([config]) A class for organizing a model

Functions

exponential_function(x, a, b) Exponential function used by GWGENOrganizer.wind_bias_correction()
logistic_function(x, L, k, x0) Logistic function used in GWGENOrganizer.wind_bias_correction()
main([args]) Call the main() method of the GWGENOrganizer class
class gwgen.main.GWGENOrganizer(config=None)[source]

Bases: model_organization.ModelOrganizer

A class for organizing a model

This class is intended to hold the basic functions for organizing a model. You can override the setup and init methods in a subclass to fit your model. When using the model from the command line, you can also use the setup_parser() method to create the argument parsers.

Parameters: config (model_organization.config.Config) – The configuration of the organizer

Methods

bias_correction([keep, quantiles, ...]) Perform a bias correction for the data
cloud_preproc([max_files, return_manager]) Extract the inventory of EECRA stations
compile_model([projectname]) Compile the model
configure([update_nml, max_stations, ...]) Configure the projects and experiments
create_test_sample(test_dir, stations[, ...]) Create a test sample for the given GHCN stations
evaluate([stations, other_exp, setup_from, ...]) Evaluate the experiment
param([complete, stations, other_exp, ...]) Parameterize the experiment
poly_bias_correction(vname, what, info[, ...]) Perform a bias correction based on percentile and a polynomial fit
preproc(**kwargs) Preprocess the data
run([ifile, ofile, odir, work_dir, remove]) Run the experiment
select([grid, grid_output, stations_output, ...]) Select stations based upon a regular grid
sensitivity_analysis(**kwargs) Perform a sensitivity analysis on the given parameters
setup(root_dir[, projectname, link, ...]) Perform the initial setup for the model
tmin_bias_correction(*args, **kwargs) Perform a bias correction for the minimum temperature data
wind_bias_correction(*args, **kwargs) Perform a bias correction for the wind speed
wind_bias_correction_logistic(info[, ...]) Perform a bias correction for the data

Attributes

bias_correction_methods
commands list() -> new empty list
name str(object='') -> string
parser_commands Mapping from the method name to the name of the parser command
paths list of str. The keys describing paths for the model
preproc_funcs A mapping from preproc commands to the corresponding function
bias_correction(keep=False, quantiles=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99], no_evaluation=False, new_project=False, **kwargs)[source]

Perform a bias correction for the data

Parameters:
  • keep (bool) – If not True, the experiment configuration files are not modified. Otherwise the quants section is kept for the given quantiles
  • quantiles (list of float) – The quantiles to use for the bias correction. Does not have an effect if no_evaluation is set to True
  • no_evaluation (bool) – If True, the existing evaluation in the configuration is used for the bias correction
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
Returns:

The results of the underlying bias correction methods

Return type:

dict
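The long default for quantiles in the signature above is simply the integer percentiles 1 through 99. As a minimal sketch in plain Python (not gwgen's own code), the same list can be built with range:

```python
# The default `quantiles` argument spelled out in the signature above
# is the sequence of integer percentiles from 1 to 99.
default_quantiles = list(range(1, 100))

print(len(default_quantiles))                       # number of percentiles
print(default_quantiles[0], default_quantiles[-1])  # first and last entry
```

A shorter list, for example [25, 50, 75], would presumably restrict the evaluation to those percentiles.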

bias_correction_methods
cloud_preproc(max_files=None, return_manager=False, **kwargs)[source]

Extract the inventory of EECRA stations

Parameters:
  • max_files (int) – The maximum number of files to handle in one process. If None, it is determined by the global 'max_stations' key
  • **kwargs – Any task in the gwgen.preproc.CloudPreproc framework
commands = ['setup', 'compile_model', 'init', 'set_value', 'get_value', 'del_value', 'info', 'unarchive', 'configure', 'preproc', 'param', 'run', 'evaluate', 'bias_correction', 'sensitivity_analysis', 'archive', 'remove']
compile_model(projectname=None, **kwargs)[source]

Compile the model

Parameters:
  • projectname (str) – The name of the project. If None, use the last one or the one specified by the current experiment
  • **kwargs – Keyword arguments passed to the app_main() method
configure(update_nml=None, max_stations=None, datadir=None, database=None, user=None, host=None, port=None, chunksize=None, compiler=None, **kwargs)[source]

Configure the projects and experiments

Parameters:
  • global_config (bool) – If True/set, the configuration is applied globally (already existing and configured experiments are not impacted)
  • project_config (bool) – Apply the configuration on the entire project instance instead of only the single experiment (already existing and configured experiments are not impacted)
  • ifile (str) – The input file for the project. Must be a netCDF file with population data
  • forcing (str) – The input file for the project containing variables with population evolution information. Possible variables in the netCDF file are movement containing the number of people to move and change containing the population change (positive or negative)
  • serial (bool) – Always run the parameterization serially (i.e. not in parallel on multiple processors). Does automatically impact global settings
  • nprocs (int or 'all') – Maximum number of processes to use when running the parameterization in parallel. Does automatically impact global settings and disables serial
  • update_from (str) – Path to a yaml configuration file to update the specified configuration with it
  • **kwargs – Other keywords for the app_main() method
  • update_nml (str or dict) – A python dict or path to a namelist to use for updating the namelist of the experiment
  • max_stations (int) – The maximum number of stations to process in one parameterization process. Does automatically impact global settings
  • datadir (str) – Path to the data directory to use (impacts the project configuration)
  • database (str) – The name of a postgres data base to write the data to
  • user (str) – The username to use when logging into the database
  • host (str) – The host which runs the database server
  • port (int) – The port to use to log into the database
  • chunksize (int) – The chunksize to use for the parameterization and evaluation
  • compiler (str) – The path to the fortran compiler to use
create_test_sample(test_dir, stations, no_cloud=False, reduce_eecra=0, keep_all=False)[source]

Create a test sample for the given GHCN stations

Parameters:
  • test_dir (str) – The path to the directory containing the test files from Github
  • stations (str or list of str) – either a list of GHCN stations to use or a filename containing a 1-row table with GHCN stations
  • no_cloud (bool) – If True, no cloud stations are extracted
  • reduce_eecra (float) – The percentage by which to reduce the EECRA data
  • keep_all (bool) – If True all years of the EECRA data are used. Otherwise, only the years with complete temperature and cloud are kept. Note that this has only an effect if reduce_eecra is not 0
evaluate(stations=None, other_exp=None, setup_from=None, to_db=None, to_csv=None, database=None, norun=False, to_return=None, complete=False, **kwargs)[source]

Evaluate the experiment

Parameters:
  • stations (str or list of str) – either a list of stations to use or a filename containing a 1-row table with stations
  • other_exp (str) – Use the configuration from another experiment
  • setup_from (str) – Determine where to get the data from. If scratch, the data will be calculated from the raw data. If file, the data will be loaded from a file, if db, the data will be loaded from a postgres database (Note that the database argument must be provided!).
  • to_db (bool) – Save the data into a postgresql database (Note that the database argument must be provided!)
  • to_csv (bool) – Save the data into a csv file
  • database (str) – The name of a postgres data base to write the data to
  • norun (bool, list of str or 'all') – If True, only the data is set up and the configuration of the experiment is not affected. It can be either a list of tasks or True or 'all'
  • to_return (list of str or 'all') – The names of the tasks to return. If None, only the ones with a gwgen.utils.TaskBase.has_run are returned.
  • complete (bool) – If True, setup and run all possible tasks
name = 'gwgen'
param(complete=False, stations=None, other_exp=None, setup_from=None, to_db=None, to_csv=None, database=None, norun=False, to_return=None, **kwargs)[source]

Parameterize the experiment

Parameters:
  • stations (str or list of str) – either a list of stations to use or a filename containing a 1-row table with stations
  • other_exp (str) – Use the configuration from another experiment
  • setup_from (str) – Determine where to get the data from. If scratch, the data will be calculated from the raw data. If file, the data will be loaded from a file, if db, the data will be loaded from a postgres database (Note that the database argument must be provided!).
  • to_db (bool) – Save the data into a postgresql database (Note that the database argument must be provided!)
  • to_csv (bool) – Save the data into a csv file
  • database (str) – The name of a postgres data base to write the data to
  • norun (bool, list of str or 'all') – If True, only the data is set up and the configuration of the experiment is not affected. It can be either a list of tasks or True or 'all'
  • to_return (list of str or 'all') – The names of the tasks to return. If None, only the ones with a gwgen.utils.TaskBase.has_run are returned.
  • complete (bool) – If True, setup and run all possible tasks
parser_commands = {'bias_correction': 'bias', 'compile_model': 'compile', 'sensitivity_analysis': 'sens'}

Mapping from the method name to the name of the parser command
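In the dictionary shown above, the keys are method names and the values are the shorter names used on the command line. The following sketch derives the command-line names for all entries of commands. It assumes that a method without an entry in parser_commands is exposed under its own name, which this page does not state explicitly.

```python
# Values copied from the `commands` and `parser_commands` attributes above.
commands = ['setup', 'compile_model', 'init', 'set_value', 'get_value',
            'del_value', 'info', 'unarchive', 'configure', 'preproc',
            'param', 'run', 'evaluate', 'bias_correction',
            'sensitivity_analysis', 'archive', 'remove']
parser_commands = {'bias_correction': 'bias',
                   'compile_model': 'compile',
                   'sensitivity_analysis': 'sens'}

# Assumption: methods that are not listed in `parser_commands`
# keep their own name as the command-line name.
cli_names = [parser_commands.get(cmd, cmd) for cmd in commands]
print(cli_names)
```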

paths = ['expdir', 'src', 'data', 'param_stations', 'eval_stations', 'indir', 'input', 'outdir', 'outdata', 'nc_file', 'project_file', 'plot_file', 'reference', 'evaldir', 'paramdir', 'workdir', 'param_grid', 'grid', 'eval_grid']

list of str. The keys describing paths for the model

poly_bias_correction(vname, what, info, new_project=False, plot_output=None, deg=3, close=True, ds=None)[source]

Perform a bias correction based on percentile and a polynomial fit

Parameters:
  • vname (str) – The variable name to use
  • what (str { 'slope' | 'intercept' }) – Either slope or intercept. The parameter that should be used for the bias correction
  • info (dict) – The configuration of the quantile evaluation
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • plot_output (str) – The name of the output file. If not specified, it defaults to <exp_dir>/postproc/<vname>_bias_correction.pdf
  • deg (int) – The degree of the fitted polynomial
  • close (bool) – If True, close the project at the end
  • ds (xr.Dataset) – The xarray dataset to use. Otherwise it will be created from info
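To illustrate the kind of fit that the deg parameter controls, here is a minimal sketch using numpy.polyfit on synthetic data. The helper name fit_polynomial and the generated values are hypothetical, and this is not gwgen's implementation. It only shows a degree-3 polynomial fitted to a quantity given per percentile.

```python
import numpy as np


def fit_polynomial(percentiles, values, deg=3):
    """Fit a polynomial of degree `deg` (hypothetical helper, not gwgen code)."""
    coeffs = np.polyfit(percentiles, values, deg)
    return np.poly1d(coeffs)


# Percentiles 1..99, matching the default `quantiles` of bias_correction()
percentiles = np.arange(1, 100, dtype=float)

# Synthetic stand-in for a per-percentile quantity such as a slope
values = 2e-6 * percentiles**3 - 1e-4 * percentiles**2 + 0.01 * percentiles + 1.0

poly = fit_polynomial(percentiles, values, deg=3)

# Largest deviation between the fitted curve and the synthetic data
print(np.abs(poly(percentiles) - values).max())
```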
preproc(**kwargs)[source]

Preprocess the data

Parameters: **kwargs – Any keyword from the preproc attribute with kws for the corresponding function, or any keyword for the main() method
preproc_funcs

A mapping from preproc commands to the corresponding function

run(ifile=None, ofile=None, odir=None, work_dir=None, remove=False, **kwargs)[source]

Run the experiment

Parameters:
  • ifile (str) – The path to the input file. If None, it is assumed that it is stored in the 'input' key in the experiment configuration
  • ofile (str) – The path to the output file. If None, it is assumed that it is stored in the 'input' key in the experiment configuration or it will be stored in 'odir/exp_id.csv'. The output directory 'odir' is determined by the odir parameter
  • odir (str) – The path to the output directory. If None and not already saved in the configuration, it will default to 'experiment_dir/outdata'
  • work_dir (str) – The path to the work directory where the binaries are copied to.
  • remove (bool) – If True, the work_dir will be removed if it already exists
Other Parameters:
  • **kwargs – Passed to the main() method
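The documented defaults for odir and ofile can be sketched as follows. The helper default_output_file and the example paths are hypothetical. The sketch also ignores any values already stored in the experiment configuration, which the real method consults first according to the parameter descriptions above.

```python
import posixpath


def default_output_file(exp_id, experiment_dir, odir=None, ofile=None):
    """Resolve the output file from the documented defaults (hypothetical helper)."""
    if ofile is not None:
        # An explicitly given output file always wins
        return ofile
    if odir is None:
        # Documented fallback: 'experiment_dir/outdata'
        odir = posixpath.join(experiment_dir, "outdata")
    # Documented fallback: 'odir/exp_id.csv'
    return posixpath.join(odir, exp_id + ".csv")


print(default_output_file("exp1", "/tmp/project/experiments/exp1"))
```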

select(grid=None, grid_output=None, stations_output=None, igrid_key=None, grid_key=None, grid_db=None, stations_db=None, no_prcp_check=False, setup_from=None, download=None, **kwargs)[source]

Select stations based upon a regular grid

Parameters:
  • grid (str) – The path to a csv-file containing a lat and a lon column with the information on the centers of the grid. If None, igrid_key must not be None and point to a key in the configuration (either the one of the experiment, or the project, or the global configuration) specifying the path
  • grid_output (str) – The path to the csv-file where to store the mapping from grid lat-lon to station id.
  • stations_output (str) – The path to the csv-file where to store the mapping from station to grid center point
  • igrid_key (str) – The key in the configuration where to store the path of the grid input file
  • grid_key (str) – The key in the configuration where to store the name of the grid_output file.
  • grid_db (str) – The name of a data table to store the data of grid_output in
  • stations_db (str) – The name of a data table to store the data for stations_output in
  • no_prcp_check (bool) – If True, we will not check for the values between 0.1 and 1.0 for precipitation and save the result in the 'best' column
  • setup_from ({ 'scratch' | 'file' | 'db' }) – The setup method for the daily data for the prcp check
  • download ({ 'single' | 'all' }) – Handles how to manage missing files for the prcp check. If None (default), a warning is printed and the file is ignored, if 'single', the missing file is downloaded, if 'all', the entire tarball is downloaded (strongly not recommended for this function)
Other Parameters:
  • **kwargs – Passed to the main() method

Notes

For igrid_key and grid_key, we recommend one of {'grid', 'param_grid', 'eval_grid'} because these keys imply correct path management.

sensitivity_analysis(**kwargs)[source]

Perform a sensitivity analysis on the given parameters

This function performs a sensitivity analysis on the current experiment. It creates a new project and uses the evaluation and parameterization of the current experiment to get information on the others

setup(root_dir, projectname=None, link=False, src_project=None, compiler=None, **kwargs)[source]

Perform the initial setup for the model

Parameters:
  • root_dir (str) – The path to the root directory where the experiments, etc. will be stored
  • projectname (str) – The name of the project that shall be initialized at root_dir. A new directory will be created namely root_dir + '/' + projectname
  • link (bool) – If set, the source files are linked to the original ones instead of copied
  • src_project (str) – Another model name to use the source model files from
  • compiler (str) – The path to the compiler to use. If None, the global compiler option is used
tmin_bias_correction(*args, **kwargs)[source]

Perform a bias correction for the minimum temperature data

Parameters:
  • info (dict) – The configuration of the quantile evaluation
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • plot_output (str) – The name of the output file. If not specified, it defaults to <exp_dir>/postproc/<vname>_bias_correction.pdf
  • deg (int) – The degree of the fitted polynomial
  • close (bool) – If True, close the project at the end
  • ds (xr.Dataset) – The xarray dataset to use. Otherwise it will be created from info
wind_bias_correction(*args, **kwargs)[source]

Perform a bias correction for the wind speed

Parameters:
  • info (dict) – The configuration of the quantile evaluation
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • plot_output (str) – The name of the output file. If not specified, it defaults to <exp_dir>/postproc/<vname>_bias_correction.pdf
  • close (bool) – If True, close the project at the end
wind_bias_correction_logistic(info, new_project=False, plot_output=None, close=True)[source]

Perform a bias correction for the data

Parameters:
  • info (dict) – The configuration of the quantile evaluation
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • plot_output (str) – The name of the output file. If not specified, it defaults to <exp_dir>/postproc/<vname>_bias_correction.pdf
  • close (bool) – If True, close the project at the end
gwgen.main.exponential_function(x, a, b)[source]

Exponential function used by GWGENOrganizer.wind_bias_correction()

This function is defined as

\[f(x) = e^{ax + b}\]
Parameters:
  • x (numpy.ndarray) – The x-data
  • a (float) – The a parameter in the above equation
  • b (float) – The b parameter in the above equation
Returns:

The calculated \(f(x)\)

Return type:

np.ndarray
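A minimal NumPy sketch of the equation above follows. It is re-implemented here from the stated formula for illustration and is not copied from the gwgen source. The input values are arbitrary.

```python
import numpy as np


def exponential_function(x, a, b):
    """Evaluate f(x) = exp(a * x + b) element-wise."""
    return np.exp(a * np.asarray(x, dtype=float) + b)


x = np.array([0.0, 1.0, 2.0])
y = exponential_function(x, a=0.5, b=0.0)
print(y)
```

With a=0.5 and b=0, the value at x=0 is 1 and the value at x=2 is e.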

gwgen.main.logistic_function(x, L, k, x0)[source]

Logistic function used in GWGENOrganizer.wind_bias_correction()

The function is defined as

\[f(x) = \frac{L}{1 + \mathrm e^{-k(x-x_0)}}\]
Parameters:
  • x (numpy.ndarray) – The x-data
  • L (float) – The curve's maximum value
  • k (float) – The steepness of the curve
  • x0 (float) – The x-value of the sigmoid's midpoint
Returns:

The calculated \(f(x)\)

Return type:

np.ndarray
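The logistic curve can be sketched in NumPy as shown below. As with the exponential example, this is a re-implementation from the stated formula, not the gwgen source, and the input values are arbitrary.

```python
import numpy as np


def logistic_function(x, L, k, x0):
    """Evaluate f(x) = L / (1 + exp(-k * (x - x0))) element-wise."""
    x = np.asarray(x, dtype=float)
    return L / (1.0 + np.exp(-k * (x - x0)))


x = np.array([-50.0, 2.0, 50.0])
y = logistic_function(x, L=10.0, k=1.0, x0=2.0)
print(y)
```

At the midpoint x = x0 the function returns L / 2. Far below the midpoint it approaches 0, and far above it approaches L.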

gwgen.main.main(args=None)[source]

Call the main() method of the GWGENOrganizer class