gwgen.evaluation module
Evaluation module of the gwgen package
Classes
EvaluationPreparation(*args, **kwargs) | Evaluation task to prepare the evaluation
Evaluator(stations, config, project_config, ...) | Abstract base class for evaluation tasks
KSEvaluation(*args, **kwargs) | Evaluation using a Kolmogorov-Smirnov test
OutputTask(stations, config, project_config, ...) | Task to provide all the data for input and output
QuantileEvaluation(*args, **kwargs) | Evaluator to evaluate specific quantiles
SimulationQuality(stations, config, ...[, ...]) | Evaluator to provide one value characterizing the quality of the experiment
Functions
default_ks_config([no_rounding, names, ...]) | The default configuration for KSEvaluation instances.
default_preparation_config([setup_raw, ...]) | The default configuration for EvaluationPreparation instances.
default_quality_config([quantiles]) | The default configuration for SimulationQuality instances.
default_quantile_config([quantiles]) | The default configuration for QuantileEvaluation instances.
-
class gwgen.evaluation.EvaluationPreparation(*args, **kwargs)
Bases: gwgen.evaluation.Evaluator
Evaluation task to prepare the evaluation
Methods
__reduce__() | Reimplemented to also provide the manager
download_src([force]) |
init_from_scratch() | Initialize the setup via the parameterization classes
setup_from_db(*args, **kwargs) |
setup_from_file(*args, **kwargs) |
setup_from_scratch() |
write2db(*args, **kwargs) | Reimplemented to sort the data according to the index
write2file(*args, **kwargs) | Reimplemented to sort the data according to the index
Attributes
datafile | The paths to reference and input file
dbname |
default_config |
ghcnd_inventory_file |
has_run |
http_inventory |
input_data | The input DataFrame
name |
reference_data | The reference DataFrame
station_list |
summary |
-
datafile – The paths to reference and input file
-
dbname = ['reference', 'input']
-
default_config
-
ghcnd_inventory_file
-
has_run = False
-
http_inventory = 'ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-inventory.txt'
-
name = 'prepare'
-
station_list
-
summary = 'Prepare the for experiment for evaluation'
-
-
class gwgen.evaluation.Evaluator(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)
Bases: gwgen.utils.TaskBase
Abstract base class for evaluation tasks
Evaluation tasks should implement a run method that is called by the gwgen.main.GWGENOrganizer.evaluate() method
Parameters:
- stations (list) – The list of stations to process
- config (dict) – The configuration of the experiment
- project_config (dict) – The configuration of the underlying project
- global_config (dict) – The global configuration
- data (pandas.DataFrame) – The data to use. If None, use the setup() method
- requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters: *args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments
Attributes
task_data_dir | The directory where to store data
-
task_data_dir – The directory where to store data
-
class gwgen.evaluation.KSEvaluation(*args, **kwargs)
Bases: gwgen.evaluation.QuantileEvaluation
Evaluation using a Kolmogorov-Smirnov test
Methods
calc(group) |
plot_map() |
run(info) | Run the evaluation
significance_fractions(series) | The percentage of stations with no significant difference
Attributes
dbname |
default_config |
name |
output | output parameterization instance
prepare | prepare parameterization instance
requires |
summary |
-
dbname = 'kolmogorov_evaluation'
-
default_config
-
name = 'ks'
-
output – output parameterization instance
-
prepare – prepare parameterization instance
-
requires = ['prepare', 'output']
-
summary = 'Perform a kolmogorov smirnoff test'
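To make the idea behind this task concrete, here is a minimal standard-library sketch, not gwgen's implementation. It assumes a two-sample Kolmogorov-Smirnov test between the observed and simulated values of each station, uses the large-sample critical value, and reports the percentage of stations without a significant difference. All function names and the example data are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def differs_significantly(a, b, alpha=0.05):
    """Large-sample test: reject if D exceeds c(alpha) * sqrt((n + m) / (n * m))."""
    n, m = len(a), len(b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return ks_statistic(a, b) > c_alpha * math.sqrt((n + m) / (n * m))


def percentage_not_significant(stations, alpha=0.05):
    """Percentage of stations whose simulation does not differ significantly."""
    passed = [
        not differs_significantly(obs, sim, alpha)
        for obs, sim in stations.values()
    ]
    return 100.0 * sum(passed) / len(passed)


# Hypothetical data: station id -> (observed values, simulated values)
stations = {
    "A": (list(range(100)), list(range(100))),      # identical samples
    "B": (list(range(100)), list(range(50, 150))),  # shifted simulation
}
result = percentage_not_significant(stations)
```

In practice a library routine such as scipy.stats.ks_2samp would normally replace the hand-written statistic; the pure-Python version is used here only to keep the sketch self-contained.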
-
-
class gwgen.evaluation.OutputTask(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)
Bases: gwgen.evaluation.Evaluator
Task to provide all the data for input and output
Parameters:
- stations (list) – The list of stations to process
- config (dict) – The configuration of the experiment
- project_config (dict) – The configuration of the underlying project
- global_config (dict) – The global configuration
- data (pandas.DataFrame) – The data to use. If None, use the setup() method
- requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters: *args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments
Attributes
datafile |
dbname |
has_run |
name |
summary |
Methods
setup_from_db(*args, **kwargs) |
setup_from_file(*args, **kwargs) |
setup_from_scratch() |
write2file(*args, **kwargs) | Not implemented since the output file is generated by the model!
-
datafile
-
dbname = 'output'
-
has_run = False
-
name = 'output'
-
summary = 'Load the output of the model'
-
class gwgen.evaluation.QuantileEvaluation(*args, **kwargs)
Bases: gwgen.evaluation.Evaluator
Evaluator to evaluate specific quantiles
Attributes
all_variables |
dbname |
default_config |
ds | The dataset of the quantiles
fmt | default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
has_run |
kwargs | default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
name |
names |
output | output parameterization instance
prepare | prepare parameterization instance
setup_requires |
summary |
Methods
calc(group) |
create_project(ds) |
make_run_config(sp, info) |
round_to_ref_prec(ref, sim[, func]) | Round one array to the precision of another
setup_from_db(*args, **kwargs) |
setup_from_file(*args, **kwargs) |
setup_from_scratch() |
-
all_variables
-
dbname = 'quantile_evaluation'
-
default_config
-
ds – The dataset of the quantiles
-
fmt = {'cbar': '', 'yrange': (['minmax', 1], ['minmax', 99]), 'title': '%(pctl)sth percentile', 'xrange': (['minmax', 1], ['minmax', 99]), 'bounds': ['minmax', 11, 0, 99], 'sym_lims': 'max', 'ideal': [0, 1], 'cmap': 'w_Reds', 'xlabel': '%(type)s {desc}', 'legendlabels': ['$R^2$ = %(rsquared)s'], 'ylabel': '%(type)s {desc}', 'id_color': 'r', 'legend': {'loc': 'upper left'}, 'bins': 10} – default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
-
has_run = True
-
kwargs = {'cbar': '', 'yrange': (['minmax', 1], ['minmax', 99]), 'title': '%(pctl)sth percentile', 'xrange': (['minmax', 1], ['minmax', 99]), 'bounds': ['minmax', 11, 0, 99], 'sym_lims': 'max', 'ideal': [0, 1], 'cmap': 'w_Reds', 'xlabel': '%(type)s {desc}', 'legendlabels': ['$R^2$ = %(rsquared)s'], 'ylabel': '%(type)s {desc}', 'id_color': 'r', 'legend': {'loc': 'upper left'}, 'bins': 10} – default formatoptions for the psyplot.plotter.linreg.DensityRegPlotter plotter
-
name = 'quants'
-
names = OrderedDict([('prcp', {'units': 'mm', 'long_name': 'Precipitation'}), ('tmin', {'units': 'degC', 'long_name': 'Min. Temperature'}), ('tmax', {'units': 'degC', 'long_name': 'Max. Temperature'}), ('mean_cloud', {'units': '-', 'long_name': 'Cloud fraction'}), ('wind', {'units': 'm/s', 'long_name': 'Wind Speed'})])
-
output – output parameterization instance
-
prepare – prepare parameterization instance
-
static round_to_ref_prec(ref, sim, func=<ufunc 'ceil'>)
Round one array to the precision of another
Parameters:
- ref (np.ndarray) – The reference array to get the precision from
- sim (np.ndarray) – The simulated array to round
- func (function) – The rounding function to use
Returns: Rounded sim
Return type: np.ndarray
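The rounding idea can be sketched as follows. This is an illustration under stated assumptions, not the gwgen implementation: following the description of the no_rounding option, the precision of the reference is taken to be the smallest positive difference between two distinct reference values, and the simulation is rounded to multiples of it. The helper names are hypothetical, and plain lists stand in for np.ndarray.

```python
import math


def infer_precision(ref):
    """Smallest positive difference between two distinct reference values."""
    values = sorted(set(ref))
    diffs = [b - a for a, b in zip(values, values[1:])]
    if not diffs:
        raise ValueError("need at least two distinct reference values")
    return min(diffs)


def round_to_precision(ref, sim, func=math.ceil):
    """Round every value of sim to a multiple of the precision of ref."""
    prec = infer_precision(ref)
    return [func(x / prec) * prec for x in sim]


# Reference recorded in steps of 0.5, simulation with arbitrary decimals
ref = [0.0, 0.5, 1.5, 3.0]
sim = [0.2, 1.1, 2.75]
rounded_up = round_to_precision(ref, sim)                 # ceil, as in the default
rounded_down = round_to_precision(ref, sim, math.floor)   # alternative func
```

Passing the rounding function as an argument mirrors the func parameter above, so the same helper covers rounding up, down, or to the nearest multiple.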
-
setup_requires = ['prepare', 'output']
-
summary = 'Compare the quantiles of simulation and observation'
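As a rough illustration of such a comparison, the sketch below computes one percentile per station for the observation and for the simulation, then fits a straight line across stations to obtain a slope and a coefficient of determination for each quantile. It is a standard-library sketch under assumptions, not the gwgen code: the function names, the data layout, and the choice of the observation as the x variable are all hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def slope_and_rsquared(x, y):
    """Least-squares fit y = a * x + b; returns the slope a and R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def evaluate_quantiles(stations, quantiles=(25, 50, 75)):
    """Map each quantile to (slope, R^2) of simulated vs. observed percentiles."""
    results = {}
    for q in quantiles:
        obs_q = [percentile(obs, q) for obs, _ in stations.values()]
        sim_q = [percentile(sim, q) for _, sim in stations.values()]
        results[q] = slope_and_rsquared(obs_q, sim_q)
    return results


# Hypothetical data: station id -> (observed values, simulated values);
# the simulation is exactly twice the observation at both stations
stations = {
    "A": ([0, 1, 2, 3, 4], [0, 2, 4, 6, 8]),
    "B": ([10, 11, 12, 13, 14], [20, 22, 24, 26, 28]),
}
results = evaluate_quantiles(stations)
```

A simulation that reproduces the observed distribution would give a slope close to 1 and an R^2 close to 1 for every quantile; the doubled example data yields a slope of 2 instead.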
-
-
class gwgen.evaluation.SimulationQuality(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)
Bases: gwgen.evaluation.Evaluator
Evaluator to provide one value characterizing the quality of the experiment
Attributes
default_config |
has_run |
name |
summary |
Methods
run(info) |
setup_from_scratch() | Only sets an empty dataframe
The applied metric is the mean
\[m = \left<\{\left<\{R^2_q\}_{q\in Q}\right>, \left<\{1 - |1 - a_q|\}_{q\in Q}\right>, \{ks\}\}\right>\]
where \(\left<\right>\) denotes the mean of the enclosed set, \(q\in Q\) are the quantiles from the quantile evaluation, \(R^2_q\) the corresponding coefficient of determination and \(a_q\) the slope of quantile \(q\). \(ks\) denotes the fraction of stations that do not differ significantly from the observations according to the ks test.
In other words, this quality estimate is the mean of
- the coefficients of determination,
- the closeness of the slopes to the ideal slope (\(a_q = 1\)), measured as \(1 - |1 - a_q|\), and
- the fraction of stations that do not differ significantly
Hence, a value of 1 means high quality and a value of 0 means low quality
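A worked example of this estimate, written as a small sketch rather than the gwgen code: the function name and the input values are hypothetical, and the three terms are assumed to be given as plain Python numbers (one R^2 and one slope per quantile, plus the ks fraction between 0 and 1).

```python
from statistics import mean


def simulation_quality(rsquared, slopes, ks_fraction):
    """Mean of the mean R^2, the mean slope closeness, and the ks fraction."""
    r2_term = mean(rsquared)
    slope_term = mean(1 - abs(1 - a) for a in slopes)
    return mean([r2_term, slope_term, ks_fraction])


# Hypothetical results for two quantiles
quality = simulation_quality(
    rsquared=[1.0, 0.5],   # mean R^2 = 0.75
    slopes=[1.0, 0.5],     # closeness terms 1.0 and 0.5, mean 0.75
    ks_fraction=0.75,      # 75 % of the stations do not differ significantly
)

# A perfect simulation: R^2 = 1, slope = 1 and all stations pass the test
perfect = simulation_quality([1.0, 1.0], [1.0, 1.0], 1.0)
```

Because the slope term uses the absolute deviation from 1, slopes of 0.5 and 1.5 are penalized equally.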
Parameters:
- stations (list) – The list of stations to process
- config (dict) – The configuration of the experiment
- project_config (dict) – The configuration of the underlying project
- global_config (dict) – The global configuration
- data (pandas.DataFrame) – The data to use. If None, use the setup() method
- requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters: *args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments
-
default_config
-
has_run = True
-
name = 'quality'
-
summary = 'Estimate simulation quality using ks and quantile evaluation'
-
gwgen.evaluation.default_ks_config(no_rounding=False, names=None, transform_wind=False, *args, **kwargs)
The default configuration for KSEvaluation instances. See also the KSEvaluation.default_config attribute
Parameters:
- no_rounding (bool) – Do not round the simulation to the inferred precision of the reference. The inferred precision is the minimum difference between two values within the entire data
- names (list of str) – The list of variables to use for the calculation. If None, all variables will be used
- transform_wind (bool) – If True, the square root of the wind is evaluated (as this is also simulated in the weather generator)
- setup_from ({ 'scratch' | 'file' | 'db' | None }) – The method used to set up the instance, one of
'scratch' – Set up the task from the raw data
'file' – Set up the task from an existing file
'db' – Set up the task from a database
None – If the data file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
- to_csv (bool) – If True, the data at setup will be written to a csv file
- to_db (bool) – If True, the data at setup will be written to a database
- remove (bool) – If True and the old data file already exists, remove it before writing
- skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
- plot_output (str) – An alternative path to use for the PDF file of the plot
- nc_output (str) – An alternative path (or several paths, depending on the task) to use for the netCDF file of the plot data
- project_output (str) – An alternative path to use for the psyplot project file of the plot
- new_project (bool) – If True, a new project will be created even if a file in project_output exists already
- project (str) – The path to a psyplot project file to use for this parameterization
- close (bool) – Close the project at the end
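The fallback order documented for setup_from=None can be sketched as a small helper. The function resolve_setup_from and its arguments are hypothetical and only illustrate the decision order; they are not part of the gwgen API.

```python
import os


def resolve_setup_from(setup_from, datafile, has_database):
    """Pick the setup method, applying the documented fallback for None."""
    if setup_from is not None:
        if setup_from not in ("scratch", "file", "db"):
            raise ValueError("setup_from must be 'scratch', 'file', 'db' or None")
        return setup_from
    if os.path.exists(datafile):
        return "file"      # an existing data file takes precedence
    if has_database:
        return "db"        # next choice: a configured database
    return "scratch"       # last resort: start from the raw data


# Hypothetical path that does not exist, with and without a database
choice_db = resolve_setup_from(None, "no/such/dir/quantiles.csv", True)
choice_scratch = resolve_setup_from(None, "no/such/dir/quantiles.csv", False)
```

An explicit value always wins over the fallback, so passing 'scratch' forces a rebuild even when a data file or a database is available.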
-
gwgen.evaluation.default_preparation_config(setup_raw=None, raw2db=False, raw2csv=False, reference=None, input_path=None, *args, **kwargs)
The default configuration for EvaluationPreparation instances. See also the EvaluationPreparation.default_config attribute
Parameters:
- setup_raw ({ 'scratch' | 'file' | 'db' | None }) – The method used to set up the raw data from GHCN and EECRA, one of
'scratch' – Set up the task from the raw data
'file' – Set up the task from an existing file
'db' – Set up the task from a database
None – If the data file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
- raw2db (bool) – If True, the raw data from GHCN and EECRA is stored in a postgres database
- raw2csv (bool) – If True, the raw data from GHCN and EECRA is stored in a csv file
- reference (str) – The path of the file where to store the reference data. If None and not already set in the configuration, it will default to 'evaluation/reference.csv'
- input_path (str) – The path of the file where to store the model input. If None and not already set in the configuration, it will default to 'inputdir/input.csv', where inputdir is the path to the input directory (by default, input in the experiment directory)
- setup_from ({ 'scratch' | 'file' | 'db' | None }) – The method used to set up the instance, one of
'scratch' – Set up the task from the raw data
'file' – Set up the task from an existing file
'db' – Set up the task from a database
None – If the data file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
- to_csv (bool) – If True, the data at setup will be written to a csv file
- to_db (bool) – If True, the data at setup will be written to a database
- remove (bool) – If True and the old data file already exists, remove it before writing
- skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
- plot_output (str) – An alternative path to use for the PDF file of the plot
- nc_output (str) – An alternative path (or several paths, depending on the task) to use for the netCDF file of the plot data
- project_output (str) – An alternative path to use for the psyplot project file of the plot
- new_project (bool) – If True, a new project will be created even if a file in project_output exists already
- project (str) – The path to a psyplot project file to use for this parameterization
- close (bool) – Close the project at the end
-
gwgen.evaluation.default_quality_config(quantiles=None, *args, **kwargs)
The default configuration for SimulationQuality instances. See also the SimulationQuality.default_config attribute
Parameters:
- quantiles (list of floats) – The quantiles to use for the quality analysis
- setup_from ({ 'scratch' | 'file' | 'db' | None }) – The method used to set up the instance, one of
'scratch' – Set up the task from the raw data
'file' – Set up the task from an existing file
'db' – Set up the task from a database
None – If the data file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
- to_csv (bool) – If True, the data at setup will be written to a csv file
- to_db (bool) – If True, the data at setup will be written to a database
- remove (bool) – If True and the old data file already exists, remove it before writing
- skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
- plot_output (str) – An alternative path to use for the PDF file of the plot
- nc_output (str) – An alternative path (or several paths, depending on the task) to use for the netCDF file of the plot data
- project_output (str) – An alternative path to use for the psyplot project file of the plot
- new_project (bool) – If True, a new project will be created even if a file in project_output exists already
- project (str) – The path to a psyplot project file to use for this parameterization
- close (bool) – Close the project at the end
-
gwgen.evaluation.default_quantile_config(quantiles=[1, 5, 10, 25, 50, 75, 90, 95, 99, 100], *args, **kwargs)
The default configuration for QuantileEvaluation instances. See also the QuantileEvaluation.default_config attribute
Parameters:
- no_rounding (bool) – Do not round the simulation to the inferred precision of the reference. The inferred precision is the minimum difference between two values within the entire data
- names (list of str) – The list of variables to use for the calculation. If None, all variables will be used
- transform_wind (bool) – If True, the square root of the wind is evaluated (as this is also simulated in the weather generator)
- quantiles (list of floats) – The quantiles to use for calculating the percentiles
- setup_from ({ 'scratch' | 'file' | 'db' | None }) – The method used to set up the instance, one of
'scratch' – Set up the task from the raw data
'file' – Set up the task from an existing file
'db' – Set up the task from a database
None – If the data file of this task exists, use it; otherwise, if a database is provided, use that; otherwise go from scratch
- to_csv (bool) – If True, the data at setup will be written to a csv file
- to_db (bool) – If True, the data at setup will be written to a database
- remove (bool) – If True and the old data file already exists, remove it before writing
- skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
- plot_output (str) – An alternative path to use for the PDF file of the plot
- nc_output (str) – An alternative path (or several paths, depending on the task) to use for the netCDF file of the plot data
- project_output (str) – An alternative path to use for the psyplot project file of the plot
- new_project (bool) – If True, a new project will be created even if a file in project_output exists already
- project (str) – The path to a psyplot project file to use for this parameterization
- close (bool) – Close the project at the end