gwgen.preproc module

Additional routines for preprocessing

Classes

CloudGHCNMap(*args, **kwargs) A task for computing the EECRA inventory for each station
CloudInventory(*args, **kwargs) A task for computing the EECRA inventory for each station
CloudPreproc(stations, config, ...[, data, ...]) Parameters: stations – The list of stations to process

Functions

default_cloud_ghcn_map_config([max_distance]) Default config for CloudGHCNMap
default_cloud_inventory_config([xstall]) Default config for CloudInventory
class gwgen.preproc.CloudGHCNMap(*args, **kwargs)[source]

Bases: gwgen.preproc.CloudPreproc

A task for computing the EECRA inventory for each station

Attributes

dbname str(object='') -> string
default_config
eecra_inventory eecra_inventory parameterization instance
has_run bool(x) -> bool
name str(object='') -> string
setup_requires list() -> new empty list
summary str(object='') -> string

Methods

init_from_scratch()
run(info)
setup(*args, **kwargs)
setup_from_scratch() Does nothing but initialize an empty data frame.
write2db(*args, **kwargs)
write2file(*args, **kwargs)
dbname = 'eecra_ghcn_map'
default_config
eecra_inventory

eecra_inventory parameterization instance

has_run = True
init_from_scratch()[source]
name = 'eecra_ghcn_map'
run(info)[source]
setup(*args, **kwargs)[source]
setup_from_scratch()[source]

Does nothing but initialize an empty data frame. The real work is done in the run() method.

setup_requires = ['eecra_inventory']
summary = 'Compute the inventory of the EECRA stations'
write2db(*args, **kwargs)[source]
write2file(*args, **kwargs)[source]
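
The setup_requires attribute above names another task ('eecra_inventory') that has to be available before this one is set up. How gwgen resolves such requirements internally is not shown on this page; the following is only a minimal sketch of one way to order tasks by their requirements. The function order_tasks and the requires dictionary are hypothetical and not part of gwgen; only the two task names are taken from the attributes documented here.

```python
def order_tasks(requires):
    """Return task names so that every task comes after its requirements.

    ``requires`` maps a task name to the list of task names it depends on
    (hypothetical structure, modelled on the ``setup_requires`` attribute).
    """
    ordered = []      # resolved task names, requirements first
    visiting = set()  # names currently on the recursion stack

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError("circular requirement at %r" % name)
        visiting.add(name)
        for dep in requires.get(name, []):
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for name in requires:
        visit(name)
    return ordered


# Task names as documented: CloudGHCNMap requires CloudInventory
requires = {"eecra_ghcn_map": ["eecra_inventory"], "eecra_inventory": []}
print(order_tasks(requires))  # ['eecra_inventory', 'eecra_ghcn_map']
```

A depth-first walk keeps the sketch short and reports circular requirements instead of recursing forever.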
class gwgen.preproc.CloudInventory(*args, **kwargs)[source]

Bases: gwgen.preproc.CloudPreproc

A task for computing the EECRA inventory for each station

Attributes

dbname str(object='') -> string
default_config
has_run bool(x) -> bool
http_xstall str(object='') -> string
name str(object='') -> string
setup_parallel
summary str(object='') -> string
xstall_df The dataframe corresponding to the XSTALL stations

Methods

init_from_scratch()
run(info)
setup(*args, **kwargs)
setup_from_db(**kwargs) Set up the task from datatables already created (and avoid locating the stations of this task)
setup_from_file(**kwargs) Set up the task from already stored files (and avoid locating the stations of this task)
setup_from_scratch()
write2db(*args, **kwargs)
write2file(*args, **kwargs)
dbname = 'eecra_inventory'
default_config
has_run = True
http_xstall = 'http://cdiac.ornl.gov/ftp/ndp026c/XSTALL'
init_from_scratch()[source]
name = 'eecra_inventory'
run(info)[source]
setup(*args, **kwargs)[source]
setup_from_db(**kwargs)[source]

Set up the task from datatables already created (and avoid locating the stations of this task)

setup_from_file(**kwargs)[source]

Set up the task from already stored files (and avoid locating the stations of this task)

setup_from_scratch()[source]
setup_parallel
summary = 'Compute the inventory of the EECRA stations'
write2db(*args, **kwargs)[source]
write2file(*args, **kwargs)[source]
xstall_df

The dataframe corresponding to the XSTALL stations
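
The three setup_from_* methods above describe alternative starting points: from scratch, from stored files, or from existing datatables, where the latter two avoid locating the stations again. As a rough illustration of that pattern only, the sketch below uses a hypothetical ToySetupTask class with a hypothetical source argument. It is not gwgen's TaskBase, and the way gwgen actually selects the setup method is not documented on this page.

```python
class ToySetupTask:
    """Hypothetical stand-in for a task; NOT gwgen's TaskBase."""

    def __init__(self, stored=None):
        self.stored = stored or []  # pretend contents of a file or datatable
        self.data = None
        self.located = False        # True once the expensive lookup has run

    def setup_from_scratch(self):
        # Placeholder for the expensive step (locating the stations)
        self.located = True
        self.data = ["result of station lookup"]

    def setup_from_file(self):
        # Reuse previously stored results, skipping the lookup
        self.data = list(self.stored)

    def setup_from_db(self):
        # Same idea, with a datatable as the source
        self.data = list(self.stored)

    def setup(self, source="scratch"):
        # 'source' is an assumed argument used only in this sketch
        handlers = {
            "scratch": self.setup_from_scratch,
            "file": self.setup_from_file,
            "db": self.setup_from_db,
        }
        handlers[source]()
        return self.data


task = ToySetupTask(stored=["station A", "station B"])
task.setup("file")
print(task.data, task.located)
```

Keeping the dispatch in one dictionary makes it obvious which sources exist and leaves each setup path as a small, separately testable method.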

class gwgen.preproc.CloudPreproc(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)[source]

Bases: gwgen.utils.TaskBase

Parameters:
  • stations (list) – The list of stations to process
  • config (dict) – The configuration of the experiment
  • project_config (dict) – The configuration of the underlying project
  • global_config (dict) – The global configuration
  • data (pandas.DataFrame) – The data to use. If None, use the setup() method
  • requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters:
  • *args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments

Attributes

task_data_dir
task_data_dir
gwgen.preproc.default_cloud_ghcn_map_config(max_distance=1000.0, *args, **kwargs)[source]

Default config for CloudGHCNMap

Parameters:
  • max_distance (float) – The maximum distance in meters within which two stations are considered the same station (default: 1000 m)
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
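
To illustrate what a max_distance threshold in meters means in practice, here is a minimal sketch that matches stations by great-circle distance. It is an assumption-laden illustration, not gwgen's implementation: the helper names haversine_m and match_stations, the dictionary layout, and the choice of the haversine formula are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_stations(eecra, ghcn, max_distance=1000.0):
    """Map each EECRA id to the nearest GHCN id within max_distance meters.

    Both inputs map a station id to a (latitude, longitude) tuple in degrees
    (hypothetical layout). Stations without a close enough partner are omitted.
    """
    mapping = {}
    for eid, (elat, elon) in eecra.items():
        best_id, best_dist = None, max_distance
        for gid, (glat, glon) in ghcn.items():
            dist = haversine_m(elat, elon, glat, glon)
            if dist <= best_dist:
                best_id, best_dist = gid, dist
        if best_id is not None:
            mapping[eid] = best_id
    return mapping


# Made-up coordinates: E1 lies about 556 m from G1, E2 is far from everything
eecra = {"E1": (50.0, 8.0), "E2": (10.0, 10.0)}
ghcn = {"G1": (50.005, 8.0), "G2": (52.0, 8.0)}
print(match_stations(eecra, ghcn))
```

The nested loop is quadratic in the number of stations; for large inventories a spatial index would be the usual replacement, but the threshold logic stays the same.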
gwgen.preproc.default_cloud_inventory_config(xstall=True, *args, **kwargs)[source]

Default config for CloudInventory

Parameters:
  • xstall (bool or str) – If True (default), download the XSTALL file from http://cdiac.ornl.gov/ftp/ndp026c/XSTALL. This file contains some estimates of station longitude and latitude. If False or an empty string, the file is not used. Any other string is interpreted as the path to a local copy of the file
  • to_csv (bool) – If True, the data at setup will be written to a csv file
  • remove (bool) – If True and the old data file already exists, remove before writing to it
  • skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
  • plot_output (str) – An alternative path to use for the PDF file of the plot
  • nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
  • project_output (str) – An alternative path to use for the psyplot project file of the plot
  • new_project (bool) – If True, a new project will be created even if a file in project_output exists already
  • project (str) – The path to a psyplot project file to use for this parameterization
  • close (bool) – Close the project at the end
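
The three cases of the xstall parameter (True, False or empty string, path string) can be summarised in a small helper. This is a sketch under the assumption that the value is resolved to a single source string before use; the function resolve_xstall is hypothetical and not part of gwgen. Only the URL is taken from the documentation above.

```python
XSTALL_URL = "http://cdiac.ornl.gov/ftp/ndp026c/XSTALL"


def resolve_xstall(xstall=True):
    """Return the XSTALL source (URL or local path), or None if it is unused."""
    if xstall is True:
        # Default: use the remote file
        return XSTALL_URL
    if not xstall:
        # False or empty string: do not use the XSTALL file at all
        return None
    # Any other string is treated as the path to a local file
    return str(xstall)


for value in (True, False, "", "/data/XSTALL"):
    print(repr(value), "->", resolve_xstall(value))
```

Checking for True with an identity test first keeps a path string from ever being mistaken for the "download" case.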