gwgen.evaluation module

Evaluation module of the gwgen module


EvaluationPreparation(*args, **kwargs) Evaluation task to prepare the evaluation
Evaluator(stations, config, project_config, ...) Abstract base class for evaluation tasks
KSEvaluation(*args, **kwargs) Evaluation using a Kolmogorov-Smirnov test
OutputTask(stations, config, project_config, ...) Task to provide all the data for input and output
QuantileEvaluation(*args, **kwargs) Evaluator to evaluate specific quantiles
SimulationQuality(stations, config, ...[, ...]) Evaluator to provide one value characterizing the quality of the experiment


default_ks_config([no_rounding, names, ...]) The default configuration for KSEvaluation instances.
default_preparation_config([setup_raw, ...]) The default configuration for EvaluationPreparation instances.
default_quality_config([quantiles]) The default configuration for SimulationQuality instances.
default_quantile_config([quantiles]) The default configuration for QuantileEvaluation instances.
class gwgen.evaluation.EvaluationPreparation(*args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Evaluation task to prepare the evaluation


__reduce__() Reimplemented to also provide the manager
init_from_scratch() Initialize the setup via the parameterization classes
setup_from_db(*args, **kwargs)
setup_from_file(*args, **kwargs)
write2db(*args, **kwargs) Reimplemented to sort the data according to the index
write2file(*args, **kwargs) Reimplemented to sort the data according to the index


datafile The paths to reference and input file
dbname list
has_run bool
http_inventory str
input_data The input DataFrame
name str
reference_data The reference DataFrame
summary str

__reduce__()

Reimplemented to also provide the manager

datafile

The paths to the reference and input files

dbname = ['reference', 'input']
has_run = False
http_inventory = ''

init_from_scratch()

Initialize the setup via the parameterization classes

input_data

The input DataFrame

name = 'prepare'

reference_data

The reference DataFrame

setup_from_db(*args, **kwargs)[source]
setup_from_file(*args, **kwargs)[source]
summary = 'Prepare the for experiment for evaluation'
write2db(*args, **kwargs)[source]

Reimplemented to sort the data according to the index

write2file(*args, **kwargs)[source]

Reimplemented to sort the data according to the index

class gwgen.evaluation.Evaluator(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)[source]

Bases: gwgen.utils.TaskBase

Abstract base class for evaluation tasks

Evaluation tasks should incorporate a run method that is called by the gwgen.main.GWGENOrganizer.evaluate() method.

Parameters:

  • stations (list) – The list of stations to process
  • config (dict) – The configuration of the experiment
  • project_config (dict) – The configuration of the underlying project
  • global_config (dict) – The global configuration
  • data (pandas.DataFrame) – The data to use. If None, use the setup() method
  • requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters:

*args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments.
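The role of the requirements parameter can be illustrated with a small, self-contained sketch. It does not use gwgen itself: the REQUIREMENTS mapping and the run_order function are hypothetical stand-ins. The entries for 'ks' and 'quants' mirror the requires and setup_requires attributes listed in this module, while the entry for 'quality' is only an assumption made for illustration.

```python
# Hypothetical stand-in for the task classes: task name -> required task names.
# 'ks' and 'quants' follow the attributes documented in this module;
# the requirements of 'quality' are an assumption for this sketch.
REQUIREMENTS = {
    "prepare": [],
    "output": [],
    "quants": ["prepare", "output"],
    "ks": ["prepare", "output"],
    "quality": ["quants", "ks"],
}


def run_order(requested, requirements):
    """Return task names so that every task comes after its requirements."""
    ordered = []
    visiting = set()

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError("circular requirement at %r" % name)
        visiting.add(name)
        for required in requirements[name]:
            visit(required)
        visiting.discard(name)
        ordered.append(name)

    for name in requested:
        visit(name)
    return ordered


order = run_order(["quality"], REQUIREMENTS)
print(order)
```

Requesting only 'quality' pulls in every task it depends on, directly or indirectly, and places them ahead of it in the returned list.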


task_data_dir The directory where to store data

task_data_dir

The directory where to store data

class gwgen.evaluation.KSEvaluation(*args, **kwargs)[source]

Bases: gwgen.evaluation.QuantileEvaluation

Evaluation using a Kolmogorov-Smirnov test


run(info) Run the evaluation
significance_fractions(series) The percentage of stations with no significant difference


dbname str
name str
output output parameterization instance
prepare prepare parameterization instance
requires list
summary str
static calc(group)[source]
dbname = 'kolmogorov_evaluation'
name = 'ks'

output

output parameterization instance

prepare

prepare parameterization instance

requires = ['prepare', 'output']

run(info)

Run the evaluation

Parameters: info (dict) – The configuration dictionary

significance_fractions(series)

The percentage of stations with no significant difference
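As a rough illustration of what such a percentage means, the sketch below compares reference and simulated values per station with a two-sample Kolmogorov-Smirnov statistic and counts the stations where the difference is not significant. This is not the gwgen implementation: the function names, the 5 % significance level, and the use of the large-sample critical value are all assumptions, and the station data is synthetic.

```python
import math

import numpy as np


def ks_statistic(ref, sim):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = np.sort(np.asarray(ref, dtype=float))
    sim = np.sort(np.asarray(sim, dtype=float))
    values = np.concatenate([ref, sim])
    cdf_ref = np.searchsorted(ref, values, side="right") / ref.size
    cdf_sim = np.searchsorted(sim, values, side="right") / sim.size
    return float(np.max(np.abs(cdf_ref - cdf_sim)))


def no_significant_difference(ref, sim, alpha=0.05):
    """True if the test does not reject equal distributions at level alpha.

    Uses the asymptotic critical value (an assumption of this sketch);
    a statistics library would provide an exact p-value instead.
    """
    n, m = len(ref), len(sim)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(ref, sim) <= critical


def percent_not_significant(stations):
    """Percentage of stations whose simulation passes the KS test."""
    passed = [no_significant_difference(ref, sim) for ref, sim in stations.values()]
    return 100.0 * sum(passed) / len(passed)


# Synthetic example: station "A" is reproduced exactly, station "B" is shifted
stations = {
    "A": (np.arange(100.0), np.arange(100.0)),
    "B": (np.arange(100.0), np.arange(100.0, 200.0)),
}
result = percent_not_significant(stations)
print(result)
```

In this synthetic case one of the two stations passes the test, so the result is half of the stations.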

summary = 'Perform a kolmogorov smirnoff test'
class gwgen.evaluation.OutputTask(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Task to provide all the data for input and output

Parameters:

  • stations (list) – The list of stations to process
  • config (dict) – The configuration of the experiment
  • project_config (dict) – The configuration of the underlying project
  • global_config (dict) – The global configuration
  • data (pandas.DataFrame) – The data to use. If None, use the setup() method
  • requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters:

*args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments.


dbname str
has_run bool
name str
summary str


setup_from_db(*args, **kwargs)
setup_from_file(*args, **kwargs)
write2file(*args, **kwargs) Not implemented since the output file is generated by the model!
dbname = 'output'
has_run = False
name = 'output'
setup_from_db(*args, **kwargs)[source]
setup_from_file(*args, **kwargs)[source]
summary = 'Load the output of the model'
write2file(*args, **kwargs)[source]

Not implemented since the output file is generated by the model!

class gwgen.evaluation.QuantileEvaluation(*args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Evaluator to evaluate specific quantiles


dbname str
ds The dataset of the quantiles
fmt default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
has_run bool
kwargs default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
name str
names OrderedDict
output output parameterization instance
prepare prepare parameterization instance
setup_requires list
summary str


make_run_config(sp, info)
round_to_ref_prec(ref, sim[, func]) Round one array to the precision of another
setup_from_db(*args, **kwargs)
setup_from_file(*args, **kwargs)
dbname = 'quantile_evaluation'

ds

The dataset of the quantiles

fmt = {'cbar': '', 'yrange': (['minmax', 1], ['minmax', 99]), 'title': '%(pctl)sth percentile', 'xrange': (['minmax', 1], ['minmax', 99]), 'bounds': ['minmax', 11, 0, 99], 'sym_lims': 'max', 'ideal': [0, 1], 'cmap': 'w_Reds', 'xlabel': '%(type)s {desc}', 'legendlabels': ['$R^2$ = %(rsquared)s'], 'ylabel': '%(type)s {desc}', 'id_color': 'r', 'legend': {'loc': 'upper left'}, 'bins': 10}

default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter

has_run = True
kwargs = {'cbar': '', 'yrange': (['minmax', 1], ['minmax', 99]), 'title': '%(pctl)sth percentile', 'xrange': (['minmax', 1], ['minmax', 99]), 'bounds': ['minmax', 11, 0, 99], 'sym_lims': 'max', 'ideal': [0, 1], 'cmap': 'w_Reds', 'xlabel': '%(type)s {desc}', 'legendlabels': ['$R^2$ = %(rsquared)s'], 'ylabel': '%(type)s {desc}', 'id_color': 'r', 'legend': {'loc': 'upper left'}, 'bins': 10}

default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter

make_run_config(sp, info)[source]
name = 'quants'
names = OrderedDict([('prcp', {'units': 'mm', 'long_name': 'Precipitation'}), ('tmin', {'units': 'degC', 'long_name': 'Min. Temperature'}), ('tmax', {'units': 'degC', 'long_name': 'Max. Temperature'}), ('mean_cloud', {'units': '-', 'long_name': 'Cloud fraction'}), ('wind', {'units': 'm/s', 'long_name': 'Wind Speed'})])

output

output parameterization instance

prepare

prepare parameterization instance

static round_to_ref_prec(ref, sim, func=<ufunc 'ceil'>)[source]

Round one array to the precision of another

Parameters:

  • ref (np.ndarray) – The reference array to get the precision from
  • sim (np.ndarray) – The simulated array to round
  • func (function) – The rounding function to use

Returns: Rounded sim
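A minimal sketch of this kind of rounding is given below. It assumes, following the description of the no_rounding option in this module, that the precision of the reference is the smallest difference between two distinct values. The helper names infer_precision and round_to_precision are hypothetical, and the actual gwgen code may differ in detail.

```python
import numpy as np


def infer_precision(ref):
    """Smallest positive difference between two values in ref."""
    values = np.asarray(ref, dtype=float)
    values = np.unique(values[~np.isnan(values)])  # sorted, no duplicates, no NaN
    if values.size < 2:
        raise ValueError("need at least two distinct values to infer a precision")
    return float(np.min(np.diff(values)))


def round_to_precision(ref, sim, func=np.ceil):
    """Round sim to multiples of the precision inferred from ref."""
    precision = infer_precision(ref)
    return func(np.asarray(sim, dtype=float) / precision) * precision


ref = np.array([0.0, 0.5, 1.0, 2.5])   # reference reported in steps of 0.5
sim = np.array([0.2, 0.75, 1.3])
rounded = round_to_precision(ref, sim)
print(rounded)
```

With the default np.ceil, every simulated value is moved up to the next multiple of the reference precision. Passing np.round or np.floor as func changes the rounding direction.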


setup_from_db(*args, **kwargs)[source]
setup_from_file(*args, **kwargs)[source]
setup_requires = ['prepare', 'output']
summary = 'Compare the quantiles of simulation and observation'
class gwgen.evaluation.SimulationQuality(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Evaluator to provide one value characterizing the quality of the experiment

has_run bool
name str
summary str


setup_from_scratch() Only sets an empty dataframe

The applied metric is

\[m = \left<\left\{\left<\{R^2_q\}_{q\in Q}\right>, \left<\{1 - |1 - a_q|\}_{q\in Q}\right>, ks\right\}\right>\]

where \(\left<\right>\) denotes the mean of the enclosed set, \(q\in Q\) are the quantiles from the quantile evaluation, \(R^2_q\) is the corresponding coefficient of determination and \(a_q\) the slope of the regression for quantile \(q\). \(ks\) denotes the fraction of stations that do not differ significantly from the observations according to the Kolmogorov-Smirnov (KS) test.

In other words, this quality estimate is the mean of

  1. the mean coefficient of determination,
  2. the mean closeness to the ideal slope (\(a_q = 1\)), measured as one minus the absolute deviation from it, and
  3. the fraction of stations that do not differ significantly.

Hence, a value of 1 means high quality and a value of 0 means low quality.
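The metric can be written out in a few lines. The following is only a sketch of the formula as stated here, not the code used by SimulationQuality. The function name quality_metric and the numbers in the example are made up, and ks_fraction is assumed to be on a 0 to 1 scale.

```python
def quality_metric(r_squared, slopes, ks_fraction):
    """Mean of the three components of the quality estimate.

    r_squared   -- coefficients of determination, one per quantile
    slopes      -- regression slopes, one per quantile
    ks_fraction -- fraction (0 to 1) of stations without significant difference
    """
    mean_r2 = sum(r_squared) / len(r_squared)
    # 1 - |1 - a_q| equals 1 for the ideal slope and decreases with the deviation
    mean_slope_score = sum(1 - abs(1 - a) for a in slopes) / len(slopes)
    return (mean_r2 + mean_slope_score + ks_fraction) / 3


# Made-up values for two quantiles
m = quality_metric(r_squared=[0.9, 0.8], slopes=[1.1, 0.9], ks_fraction=0.7)
print(round(m, 4))
```

Here the three components are 0.85, 0.9 and 0.7, so the estimate is their mean. A perfect simulation (all \(R^2_q = 1\), all \(a_q = 1\), \(ks = 1\)) yields exactly 1.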

Parameters:

  • stations (list) – The list of stations to process
  • config (dict) – The configuration of the experiment
  • project_config (dict) – The configuration of the underlying project
  • global_config (dict) – The global configuration
  • data (pandas.DataFrame) – The data to use. If None, use the setup() method
  • requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters:

*args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments.

has_run = True
name = 'quality'

setup_from_scratch()

Only sets an empty dataframe

summary = 'Estimate simulation quality using ks and quantile evaluation'
gwgen.evaluation.default_ks_config(no_rounding=False, names=None, transform_wind=False, *args, **kwargs)[source]

The default configuration for KSEvaluation instances. See also the KSEvaluation.default_config attribute

Parameters:

  • no_rounding (bool) – Do not round the simulation to the inferred precision of the reference. The inferred precision is the minimum difference between two values within the entire data
  • names (list of str) – The list of variables to use for calculation. If None, all variables will be used
  • transform_wind (bool) – If True, the square root of the wind is evaluated (as this is also simulated in the weather generator)
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    The method used to set up the instance:

    'scratch': Set up the task from the raw data
    'file': Set up the task from an existing file
    'db': Set up the task from a database
    None: If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
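The fallback behaviour of setup_from when it is None (file first, then database, then scratch) can be sketched as follows. The function resolve_setup_from and its has_database argument are hypothetical and only illustrate the documented order of preference. How gwgen actually detects a file or a database is not specified here.

```python
import os


def resolve_setup_from(setup_from, datafile, has_database):
    """Pick 'file', 'db' or 'scratch' following the documented fallback."""
    if setup_from is not None:
        if setup_from not in ("scratch", "file", "db"):
            raise ValueError("unknown setup_from: %r" % (setup_from,))
        return setup_from  # an explicit choice always wins
    if os.path.exists(datafile):
        return "file"
    if has_database:
        return "db"
    return "scratch"


# No data file and no database available: start from the raw data
choice = resolve_setup_from(None, "no/such/dir/data.csv", has_database=False)
print(choice)
```

An explicit value is returned unchanged, so passing 'scratch' forces a fresh setup even if a data file already exists.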
gwgen.evaluation.default_preparation_config(setup_raw=None, raw2db=False, raw2csv=False, reference=None, input_path=None, *args, **kwargs)[source]

The default configuration for EvaluationPreparation instances. See also the EvaluationPreparation.default_config attribute

Parameters:

  • setup_raw ({ 'scratch' | 'file' | 'db' | None }) –

    The method used to set up the raw data from GHCN and EECRA:

    'scratch': Set up the task from the raw data
    'file': Set up the task from an existing file
    'db': Set up the task from a database
    None: If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
  • raw2db (bool) – If True, the raw data from GHCN and EECRA is stored in a postgres database
  • raw2csv (bool) – If True, the raw data from GHCN and EECRA is stored in a csv file
  • reference (str) – The path of the file where to store the reference data. If None and not already set in the configuration, it will default to 'evaluation/reference.csv'
  • input_path (str) – The path of the file where to store the model input. If None, and not already set in the configuration, it will default to 'inputdir/input.csv' where inputdir is the path to the input directory (by default, input in the experiment directory)
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    The method used to set up the instance:

    'scratch': Set up the task from the raw data
    'file': Set up the task from an existing file
    'db': Set up the task from a database
    None: If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
gwgen.evaluation.default_quality_config(quantiles=None, *args, **kwargs)[source]

The default configuration for SimulationQuality instances. See also the SimulationQuality.default_config attribute

Parameters:

  • quantiles (list of floats) – The quantiles to use for the quality analysis
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    The method used to set up the instance:

    'scratch': Set up the task from the raw data
    'file': Set up the task from an existing file
    'db': Set up the task from a database
    None: If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
gwgen.evaluation.default_quantile_config(quantiles=[1, 5, 10, 25, 50, 75, 90, 95, 99, 100], *args, **kwargs)[source]

The default configuration for QuantileEvaluation instances. See also the QuantileEvaluation.default_config attribute

Parameters:

  • no_rounding (bool) – Do not round the simulation to the inferred precision of the reference. The inferred precision is the minimum difference between two values within the entire data
  • names (list of str) – The list of variables to use for calculation. If None, all variables will be used
  • transform_wind (bool) – If True, the square root of the wind is evaluated (as this is also simulated in the weather generator)
  • quantiles (list of floats) – The quantiles to use for calculating the percentiles
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    The method used to set up the instance:

    'scratch': Set up the task from the raw data
    'file': Set up the task from an existing file
    'db': Set up the task from a database
    None: If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
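To make the role of the quantiles parameter more concrete, the sketch below computes the default percentiles per station for a reference and a simulated series and then fits, for each percentile, a straight line through the station values to obtain a slope and a coefficient of determination. This is an interpretation of the quantile evaluation, not its implementation: the function names and the synthetic station data are assumptions, and the real task works on the prepared and output data of the experiment.

```python
import numpy as np

# Default quantiles from default_quantile_config
QUANTILES = [1, 5, 10, 25, 50, 75, 90, 95, 99, 100]


def station_percentiles(values, quantiles=QUANTILES):
    """Percentiles of one station's series."""
    return np.percentile(np.asarray(values, dtype=float), quantiles)


def slope_and_r2(ref, sim):
    """Least-squares fit sim = a * ref + b; return the slope a and R^2."""
    slope, intercept = np.polyfit(ref, sim, 1)
    residuals = sim - (slope * ref + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((sim - np.mean(sim)) ** 2)
    return float(slope), float(1.0 - ss_res / ss_tot)


# Synthetic data: three stations, the simulation underestimates by 10 %
ref_data = {name: scale * np.arange(1.0, 101.0)
            for name, scale in [("A", 1.0), ("B", 2.0), ("C", 3.0)]}
sim_data = {name: 0.9 * values for name, values in ref_data.items()}

# Arrays of shape (stations, quantiles)
ref_pctl = np.array([station_percentiles(ref_data[s]) for s in ref_data])
sim_pctl = np.array([station_percentiles(sim_data[s]) for s in ref_data])

results = {q: slope_and_r2(ref_pctl[:, i], sim_pctl[:, i])
           for i, q in enumerate(QUANTILES)}
for q, (slope, r2) in results.items():
    print("%3d: slope=%.3f R2=%.3f" % (q, slope, r2))
```

Because the synthetic simulation is an exact rescaling of the reference, every percentile yields the same slope and a perfect fit. With real data, the slopes and coefficients of determination vary per percentile.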