gwgen.evaluation module

Evaluation module of the gwgen package

Classes

EvaluationPreparation(*args, **kwargs) Evaluation task to prepare the evaluation
Evaluator(stations, config, project_config, ...) Abstract base class for evaluation tasks
KSEvaluation(*args, **kwargs) Evaluation using a Kolmogorov-Smirnov test
OutputTask(stations, config, project_config, ...) Task to provide all the data for input and output
QuantileEvaluation(*args, **kwargs) Evaluator to evaluate specific quantiles
SimulationQuality(stations, config, ...[, ...]) Evaluator to provide one value characterizing the quality of the experiment

Functions

default_ks_config([no_rounding, names, ...]) The default configuration for KSEvaluation instances.
default_preparation_config([setup_raw, ...]) The default configuration for EvaluationPreparation instances.
default_quality_config([quantiles]) The default configuration for SimulationQuality instances.
default_quantile_config([quantiles]) The default configuration for QuantileEvaluation instances.
class gwgen.evaluation.EvaluationPreparation(*args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Evaluation task to prepare the evaluation

Methods

__reduce__() Reimplemented to also provide the manager
download_src([force])
init_from_scratch() Initialize the setup via the parameterization classes
setup_from_db(*args, **kwargs)
setup_from_file(*args, **kwargs)
setup_from_scratch()
write2db(*args, **kwargs) Reimplemented to sort the data according to the index
write2file(*args, **kwargs) Reimplemented to sort the data according to the index

Attributes

datafile The paths to reference and input file
dbname
default_config
ghcnd_inventory_file
has_run
http_inventory
input_data The input DataFrame
name
reference_data The reference DataFrame
station_list
summary
__reduce__()[source]

Reimplemented to also provide the manager

datafile

The paths to reference and input file

dbname = ['reference', 'input']
default_config
download_src(force=False)[source]
ghcnd_inventory_file
has_run = False
http_inventory = 'ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-inventory.txt'
init_from_scratch()[source]

Initialize the setup via the parameterization classes

input_data

The input DataFrame

name = 'prepare'
reference_data

The reference DataFrame

setup_from_db(*args, **kwargs)[source]
setup_from_file(*args, **kwargs)[source]
setup_from_scratch()[source]
station_list
summary = 'Prepare the for experiment for evaluation'
write2db(*args, **kwargs)[source]

Reimplemented to sort the data according to the index

write2file(*args, **kwargs)[source]

Reimplemented to sort the data according to the index

class gwgen.evaluation.Evaluator(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)[source]

Bases: gwgen.utils.TaskBase

Abstract base class for evaluation tasks

Evaluation tasks should implement a run method, which is called by the gwgen.main.GWGENOrganizer.evaluate() method.

Parameters:
  • stations (list) – The list of stations to process
  • config (dict) – The configuration of the experiment
  • project_config (dict) – The configuration of the underlying project
  • global_config (dict) – The global configuration
  • data (pandas.DataFrame) – The data to use. If None, use the setup() method
  • requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters:
  • ``*args, **kwargs`` – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments

Attributes

task_data_dir The directory where to store data
task_data_dir

The directory where to store data

class gwgen.evaluation.KSEvaluation(*args, **kwargs)[source]

Bases: gwgen.evaluation.QuantileEvaluation

Evaluation using a Kolmogorov-Smirnov test

Methods

calc(group)
plot_map()
run(info) Run the evaluation
significance_fractions(series) The percentage of stations with no significant difference

Attributes

dbname
default_config
name
output output parameterization instance
prepare prepare parameterization instance
requires
summary
static calc(group)[source]
dbname = 'kolmogorov_evaluation'
default_config
name = 'ks'
output

output parameterization instance

plot_map()[source]
prepare

prepare parameterization instance

requires = ['prepare', 'output']
run(info)[source]

Run the evaluation

Parameters:
  • info (dict) – The configuration dictionary
significance_fractions(series)[source]

The percentage of stations with no significant difference

summary = 'Perform a kolmogorov smirnoff test'
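
The test behind this task can be illustrated with a minimal, NumPy-only sketch. It is not gwgen's implementation: the function names, the large-sample critical value 1.36 (roughly the 5% level) and the sample data are assumptions made for illustration. It computes the two-sample Kolmogorov-Smirnov statistic per station, flags significant differences, and reports the percentage of stations without a significant difference.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max distance between ECDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    pooled = np.concatenate([a, b])
    # Empirical CDFs of both samples evaluated at every pooled value
    cdf_a = np.searchsorted(a, pooled, side="right") / a.size
    cdf_b = np.searchsorted(b, pooled, side="right") / b.size
    return np.abs(cdf_a - cdf_b).max()


def differs_significantly(a, b, c_alpha=1.36):
    """Large-sample test; c_alpha = 1.36 corresponds to roughly the 5% level."""
    n1, n2 = len(a), len(b)
    critical = c_alpha * np.sqrt((n1 + n2) / (n1 * n2))
    return ks_statistic(a, b) > critical


def percent_not_significant(flags):
    """Percentage of stations whose flag is False (no significant difference)."""
    flags = np.asarray(flags, dtype=bool)
    return 100.0 * (1.0 - flags.mean())


# Hypothetical data: station A matches the reference, station B is shifted
reference = np.arange(50.0)
stations = {"A": np.arange(50.0), "B": np.arange(50.0) + 100.0}
flags = [differs_significantly(sim, reference) for sim in stations.values()]
percent = percent_not_significant(flags)
```

In practice a library routine such as scipy.stats.ks_2samp, which also returns a p-value, would usually replace the hand-written statistic.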
class gwgen.evaluation.OutputTask(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Task to provide all the data for input and output

Parameters:
  • stations (list) – The list of stations to process
  • config (dict) – The configuration of the experiment
  • project_config (dict) – The configuration of the underlying project
  • global_config (dict) – The global configuration
  • data (pandas.DataFrame) – The data to use. If None, use the setup() method
  • requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters:
  • ``*args, **kwargs`` – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments

Attributes

datafile
dbname
has_run
name
summary

Methods

setup_from_db(*args, **kwargs)
setup_from_file(*args, **kwargs)
setup_from_scratch()
write2file(*args, **kwargs) Not implemented since the output file is generated by the model!
datafile
dbname = 'output'
has_run = False
name = 'output'
setup_from_db(*args, **kwargs)[source]
setup_from_file(*args, **kwargs)[source]
setup_from_scratch()[source]
summary = 'Load the output of the model'
write2file(*args, **kwargs)[source]

Not implemented since the output file is generated by the model!

class gwgen.evaluation.QuantileEvaluation(*args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Evaluator to evaluate specific quantiles

Attributes

all_variables
dbname
default_config
ds The dataset of the quantiles
fmt default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
has_run
kwargs default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
name
names
output output parameterization instance
prepare prepare parameterization instance
setup_requires
summary

Methods

calc(group)
create_project(ds)
make_run_config(sp, info)
round_to_ref_prec(ref, sim[, func]) Round one array to the precision of another
setup_from_db(*args, **kwargs)
setup_from_file(*args, **kwargs)
setup_from_scratch()
all_variables
calc(group)[source]
create_project(ds)[source]
dbname = 'quantile_evaluation'
default_config
ds

The dataset of the quantiles

fmt = {'cbar': '', 'yrange': (['minmax', 1], ['minmax', 99]), 'title': '%(pctl)sth percentile', 'xrange': (['minmax', 1], ['minmax', 99]), 'bounds': ['minmax', 11, 0, 99], 'sym_lims': 'max', 'ideal': [0, 1], 'cmap': 'w_Reds', 'xlabel': '%(type)s {desc}', 'legendlabels': ['$R^2$ = %(rsquared)s'], 'ylabel': '%(type)s {desc}', 'id_color': 'r', 'legend': {'loc': 'upper left'}, 'bins': 10}

default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter

has_run = True
kwargs = {'cbar': '', 'yrange': (['minmax', 1], ['minmax', 99]), 'title': '%(pctl)sth percentile', 'xrange': (['minmax', 1], ['minmax', 99]), 'bounds': ['minmax', 11, 0, 99], 'sym_lims': 'max', 'ideal': [0, 1], 'cmap': 'w_Reds', 'xlabel': '%(type)s {desc}', 'legendlabels': ['$R^2$ = %(rsquared)s'], 'ylabel': '%(type)s {desc}', 'id_color': 'r', 'legend': {'loc': 'upper left'}, 'bins': 10}

default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter

make_run_config(sp, info)[source]
name = 'quants'
names = OrderedDict([('prcp', {'units': 'mm', 'long_name': 'Precipitation'}), ('tmin', {'units': 'degC', 'long_name': 'Min. Temperature'}), ('tmax', {'units': 'degC', 'long_name': 'Max. Temperature'}), ('mean_cloud', {'units': '-', 'long_name': 'Cloud fraction'}), ('wind', {'units': 'm/s', 'long_name': 'Wind Speed'})])
output

output parameterization instance

prepare

prepare parameterization instance

static round_to_ref_prec(ref, sim, func=<ufunc 'ceil'>)[source]

Round one array to the precision of another

Parameters:
  • ref (np.ndarray) – The reference array to get the precision from
  • sim (np.ndarray) – The simulated array to round
  • func (function) – The rounding function to use
Returns:

Rounded sim

Return type:

np.ndarray
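
As an illustration of the idea, and not the actual gwgen code, the sketch below infers the precision of the reference as the smallest difference between its distinct values (the definition given for the no_rounding option) and rounds the simulation to multiples of that precision. The name round_to_precision_sketch and the sample arrays are hypothetical.

```python
import numpy as np


def round_to_precision_sketch(ref, sim, func=np.ceil):
    """Round `sim` to the precision inferred from `ref` (hypothetical helper)."""
    ref = np.asarray(ref, dtype=float)
    sim = np.asarray(sim, dtype=float)
    # Distinct, sorted reference values; their smallest gap is the precision
    unique_vals = np.unique(ref[~np.isnan(ref)])
    if unique_vals.size < 2:
        return sim.copy()  # precision cannot be inferred, leave sim unchanged
    precision = np.diff(unique_vals).min()
    # Express sim in units of the precision, round, and scale back
    return func(sim / precision) * precision


ref = np.array([0.0, 0.5, 1.0, 2.5])   # inferred precision: 0.5
sim = np.array([0.26, 1.0, 1.74])
rounded_up = round_to_precision_sketch(ref, sim)
rounded_down = round_to_precision_sketch(ref, sim, func=np.floor)
```

With a precision such as 0.1 that has no exact binary representation, the division can land marginally above or below an integer, so a production version would probably need a small tolerance before applying func.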

setup_from_db(*args, **kwargs)[source]
setup_from_file(*args, **kwargs)[source]
setup_from_scratch()[source]
setup_requires = ['prepare', 'output']
summary = 'Compare the quantiles of simulation and observation'
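
The comparison this task performs can be pictured with the following sketch. It assumes, as the 'ideal': [0, 1] and $R^2$ entries in fmt suggest, that for each percentile the simulated station quantiles are regressed on the observed ones, with intercept 0 and slope 1 as the ideal fit. The function name and the data are hypothetical, and gwgen itself delegates the regression and plotting to psyplot.

```python
import numpy as np


def quantile_regression_sketch(obs_by_station, sim_by_station, pctl):
    """Regress simulated on observed percentiles across stations.

    Returns (slope, intercept, rsquared) for one percentile `pctl` (0-100).
    """
    obs_q = np.array([np.percentile(v, pctl) for v in obs_by_station])
    sim_q = np.array([np.percentile(v, pctl) for v in sim_by_station])
    # Least-squares line sim_q = slope * obs_q + intercept
    slope, intercept = np.polyfit(obs_q, sim_q, 1)
    # Coefficient of determination of a simple linear regression
    rsquared = np.corrcoef(obs_q, sim_q)[0, 1] ** 2
    return slope, intercept, rsquared


# Hypothetical data: four stations, simulation underestimates by 10%
observed = [np.linspace(0.0, s, 101) for s in (1.0, 2.0, 3.0, 4.0)]
simulated = [0.9 * v for v in observed]
slope, intercept, rsquared = quantile_regression_sketch(observed, simulated, 90)
```

Here the slope comes out at about 0.9 with an intercept near 0, which is the kind of systematic deviation from the ideal line that the evaluation is meant to reveal.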
class gwgen.evaluation.SimulationQuality(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)[source]

Bases: gwgen.evaluation.Evaluator

Evaluator to provide one value characterizing the quality of the experiment

Attributes

default_config
has_run
name
summary

Methods

run(info)
setup_from_scratch() Only sets an empty dataframe

The applied metric is

\[m = \left<\{\left<\{R^2_q\}_{q\in Q}\right>, \left<\{1 - |1 - a_q|\}_{q\in Q}\right>, \{ks\}\}\right>\]

where \(\left<\right>\) denotes the mean of the enclosed set, \(q\in Q\) are the quantiles from the quantile evaluation, \(R^2_q\) is the corresponding coefficient of determination and \(a_q\) is the slope of the regression line for quantile \(q\). \(ks\) denotes the fraction of stations that do not differ significantly from the observations according to the KS test.

In other words, this quality estimate is the mean of

  1. the coefficients of determination, averaged over the quantiles,
  2. the closeness of the slopes to the ideal slope (\(a_q = 1\)), measured as \(1 - |1 - a_q|\) and averaged over the quantiles, and
  3. the fraction of stations that do not differ significantly.

Hence, a value of 1 means high quality and a value of 0 means low quality.
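
The metric can be written out in a few lines. This is a minimal sketch under stated assumptions rather than the code used by SimulationQuality: the function name is hypothetical, the inputs are taken to be one \(R^2_q\) and one slope \(a_q\) per quantile, and \(ks\) is assumed to be a fraction between 0 and 1.

```python
import numpy as np


def simulation_quality_sketch(rsquared, slopes, ks_fraction):
    """Mean of mean R^2, mean slope score and KS fraction (hypothetical helper)."""
    rsquared = np.asarray(rsquared, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    mean_r2 = rsquared.mean()                         # <{R^2_q}>
    mean_slope_score = (1.0 - np.abs(1.0 - slopes)).mean()  # <{1 - |1 - a_q|}>
    return float(np.mean([mean_r2, mean_slope_score, ks_fraction]))


# Two quantiles with R^2 of 0.9 and 0.8, slopes of 1.1 and 0.9, ks of 0.7
m = simulation_quality_sketch([0.9, 0.8], [1.1, 0.9], 0.7)
# A perfect simulation scores 1
m_perfect = simulation_quality_sketch([1.0, 1.0], [1.0, 1.0], 1.0)
```

For the first call the three components are 0.85, 0.9 and 0.7, so m is their mean, 2.45 / 3.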

Parameters:
  • stations (list) – The list of stations to process
  • config (dict) – The configuration of the experiment
  • project_config (dict) – The configuration of the underlying project
  • global_config (dict) – The global configuration
  • data (pandas.DataFrame) – The data to use. If None, use the setup() method
  • requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters:
  • ``*args, **kwargs`` – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments

default_config
has_run = True
name = 'quality'
run(info)[source]
setup_from_scratch()[source]

Only sets an empty dataframe

summary = 'Estimate simulation quality using ks and quantile evaluation'
gwgen.evaluation.default_ks_config(no_rounding=False, names=None, transform_wind=False, *args, **kwargs)[source]

The default configuration for KSEvaluation instances. See also the KSEvaluation.default_config attribute

Parameters:
  • no_rounding (bool) – Do not round the simulation to the inferred precision of the reference. The inferred precision is the minimum difference between two values within the entire data
  • names (list of str) – The list of variables to use for the calculation. If None, all variables will be used
  • transform_wind (bool) – If True, the square root of the wind is evaluated (as this is also simulated in the weather generator)
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    How to set up the instance:

    'scratch'
    Set up the task from the raw data
    'file'
    Set up the task from an existing file
    'db'
    Set up the task from a database
    None
    If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise set up from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
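
The fallback order described for setup_from=None can be sketched as follows. The helper resolve_setup_from and its arguments are hypothetical and not part of the gwgen API; the sketch only restates the documented order (existing file first, then a database, then scratch).

```python
import os


def resolve_setup_from(setup_from, datafile, database=None):
    """Pick 'file', 'db' or 'scratch' (hypothetical helper, not gwgen API)."""
    if setup_from is not None:
        if setup_from not in ("scratch", "file", "db"):
            raise ValueError("setup_from must be 'scratch', 'file', 'db' or None")
        return setup_from  # an explicit choice always wins
    if os.path.exists(datafile):
        return "file"      # reuse the existing data file
    if database is not None:
        return "db"        # no file, but a database is available
    return "scratch"       # nothing to reuse, start from the raw data


# No file and no database: fall back to the raw data
choice = resolve_setup_from(None, "no_such_dir/quantile_evaluation.csv")
```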
gwgen.evaluation.default_preparation_config(setup_raw=None, raw2db=False, raw2csv=False, reference=None, input_path=None, *args, **kwargs)[source]

The default configuration for EvaluationPreparation instances. See also the EvaluationPreparation.default_config attribute

Parameters:
  • setup_raw ({ 'scratch' | 'file' | 'db' | None }) –

    How to set up the raw data from GHCN and EECRA:

    'scratch'
    Set up the task from the raw data
    'file'
    Set up the task from an existing file
    'db'
    Set up the task from a database
    None
    If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise set up from scratch
  • raw2db (bool) – If True, the raw data from GHCN and EECRA is stored in a postgres database
  • raw2csv (bool) – If True, the raw data from GHCN and EECRA is stored in a csv file
  • reference (str) – The path of the file where to store the reference data. If None and not already set in the configuration, it will default to 'evaluation/reference.csv'
  • input_path (str) – The path of the file where to store the model input. If None, and not already set in the configuration, it will default to 'inputdir/input.csv' where inputdir is the path to the input directory (by default, input in the experiment directory)
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    How to set up the instance:

    'scratch'
    Set up the task from the raw data
    'file'
    Set up the task from an existing file
    'db'
    Set up the task from a database
    None
    If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise set up from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
gwgen.evaluation.default_quality_config(quantiles=None, *args, **kwargs)[source]

The default configuration for SimulationQuality instances. See also the SimulationQuality.default_config attribute

Parameters:
  • quantiles (list of floats) – The quantiles to use for the quality analysis
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    How to set up the instance:

    'scratch'
    Set up the task from the raw data
    'file'
    Set up the task from an existing file
    'db'
    Set up the task from a database
    None
    If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise set up from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
gwgen.evaluation.default_quantile_config(quantiles=[1, 5, 10, 25, 50, 75, 90, 95, 99, 100], *args, **kwargs)[source]

The default configuration for QuantileEvaluation instances. See also the QuantileEvaluation.default_config attribute

Parameters:
  • no_rounding (bool) – Do not round the simulation to the inferred precision of the reference. The inferred precision is the minimum difference between two values within the entire data
  • names (list of str) – The list of variables to use for the calculation. If None, all variables will be used
  • transform_wind (bool) – If True, the square root of the wind is evaluated (as this is also simulated in the weather generator)
  • quantiles (list of floats) – The quantiles to use for calculating the percentiles
  • setup_from ({ 'scratch' | 'file' | 'db' | None }) –

    How to set up the instance:

    'scratch'
    Set up the task from the raw data
    'file'
    Set up the task from an existing file
    'db'
    Set up the task from a database
    None
    If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise set up from scratch
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • to_db (bool) – If True, the data at setup will be written to a database
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end