gwgen.preproc module
Additional routines for preprocessing
Classes
CloudGHCNMap(*args, **kwargs) | A task for computing the EECRA inventory for each station
CloudInventory(*args, **kwargs) | A task for computing the EECRA inventory for each station
CloudPreproc(stations, config, ...[, data, ...]) |
Functions
default_cloud_ghcn_map_config([max_distance]) | Default config for CloudGHCNMap
default_cloud_inventory_config([xstall]) | Default config for CloudInventory
-
class gwgen.preproc.CloudGHCNMap(*args, **kwargs)
Bases: gwgen.preproc.CloudPreproc
A task for computing the EECRA inventory for each station
Attributes
dbname | str(object='') -> string
default_config |
eecra_inventory | eecra_inventory parameterization instance
has_run | bool(x) -> bool
name | str(object='') -> string
setup_requires | list() -> new empty list
summary | str(object='') -> string
Methods
init_from_scratch() |
run(info) |
setup(*args, **kwargs) |
setup_from_scratch() | Does nothing but initialize an empty data frame.
write2db(*args, **kwargs) |
write2file(*args, **kwargs) |
-
dbname = 'eecra_ghcn_map'
-
default_config
-
eecra_inventory
eecra_inventory parameterization instance
-
has_run = True
-
name = 'eecra_ghcn_map'
-
setup_from_scratch()
Does nothing but initialize an empty data frame. The real work is done in the run() method.
-
setup_requires = ['eecra_inventory']
-
summary = 'Compute the inventory of the EECRA stations'
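The setup_requires value above, ['eecra_inventory'], matches the name attribute of CloudInventory, which suggests that tasks refer to their prerequisites by name. How gwgen resolves these requirements is not shown in this reference. The following is only a minimal sketch of ordering tasks by such name lists; ToyInventory, ToyGHCNMap and order_tasks are hypothetical stand-ins and not part of gwgen.

```python
# Hypothetical stand-ins that only mirror the `name` and `setup_requires`
# attributes documented above; they are NOT the gwgen classes.
class ToyInventory:
    name = 'eecra_inventory'
    setup_requires = []


class ToyGHCNMap:
    name = 'eecra_ghcn_map'
    setup_requires = ['eecra_inventory']


def order_tasks(task_classes):
    """Return the classes ordered so that every requirement comes first."""
    by_name = {cls.name: cls for cls in task_classes}
    ordered = []
    visiting = set()  # names currently on the recursion stack

    def visit(cls):
        if cls in ordered:
            return
        if cls.name in visiting:
            raise ValueError('circular requirement at %r' % cls.name)
        visiting.add(cls.name)
        for required in cls.setup_requires:
            visit(by_name[required])  # KeyError if a requirement is missing
        visiting.discard(cls.name)
        ordered.append(cls)

    for cls in task_classes:
        visit(cls)
    return ordered


if __name__ == '__main__':
    print([cls.name for cls in order_tasks([ToyGHCNMap, ToyInventory])])
```

With the two toy classes, the inventory task is placed before the map task regardless of the order in which they are passed in.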
-
-
class gwgen.preproc.CloudInventory(*args, **kwargs)
Bases: gwgen.preproc.CloudPreproc
A task for computing the EECRA inventory for each station
Attributes
dbname | str(object='') -> string
default_config |
has_run | bool(x) -> bool
http_xstall | str(object='') -> string
name | str(object='') -> string
setup_parallel |
summary | str(object='') -> string
xstall_df | The dataframe corresponding to the XSTALL stations
Methods
init_from_scratch() |
run(info) |
setup(*args, **kwargs) |
setup_from_db(**kwargs) | Set up the task from datatables already created (and avoid locating the stations of this task)
setup_from_file(**kwargs) | Set up the task from already stored files (and avoid locating the stations of this task)
setup_from_scratch() |
write2db(*args, **kwargs) |
write2file(*args, **kwargs) |
-
dbname = 'eecra_inventory'
-
default_config
-
has_run = True
-
http_xstall = 'http://cdiac.ornl.gov/ftp/ndp026c/XSTALL'
-
name = 'eecra_inventory'
-
setup_from_db(**kwargs)
Set up the task from datatables already created (and avoid locating the stations of this task)
-
setup_from_file(**kwargs)
Set up the task from already stored files (and avoid locating the stations of this task)
-
setup_parallel
-
summary = 'Compute the inventory of the EECRA stations'
-
xstall_df
The dataframe corresponding to the XSTALL stations
-
-
class gwgen.preproc.CloudPreproc(stations, config, project_config, global_config, data=None, requirements=None, *args, **kwargs)
Bases: gwgen.utils.TaskBase
Parameters: - stations (list) – The list of stations to process
- config (dict) – The configuration of the experiment
- project_config (dict) – The configuration of the underlying project
- global_config (dict) – The global configuration
- data (pandas.DataFrame) – The data to use. If None, use the setup() method
- requirements (list of TaskBase instances) – The required instances. If None, you must call the set_requirements() method later
Other Parameters: *args, **kwargs – The configuration of the task. See the TaskConfig for arguments. Note that if you provide *args, you have to provide all possible arguments
Attributes
task_data_dir
-
task_data_dir
-
gwgen.preproc.default_cloud_ghcn_map_config(max_distance=1000.0, *args, **kwargs)
Default config for CloudGHCNMap
Parameters: - max_distance (float) – The maximum distance in meters for which we consider two stations as equal (Default: 1000m)
- to_csv (bool) – If True, the data at setup will be written to a csv file
- remove (bool) – If True and the old data file already exists, remove before writing to it
- skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
- plot_output (str) – An alternative path to use for the PDF file of the plot
- nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
- project_output (str) – An alternative path to use for the psyplot project file of the plot
- new_project (bool) – If True, a new project will be created even if a file in project_output exists already
- project (str) – The path to a psyplot project file to use for this parameterization
- close (bool) – Close the project at the end
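To illustrate what a max_distance threshold in meters means for matching two stations, here is a minimal sketch that compares two (latitude, longitude) pairs by great-circle distance. The haversine formula, the Earth radius constant, and the names haversine_m and same_station are assumptions of this sketch; this reference does not state how gwgen computes the distance.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumption of this sketch)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def same_station(a, b, max_distance=1000.0):
    """Treat two (lat, lon) pairs as one station if within max_distance meters."""
    return haversine_m(a[0], a[1], b[0], b[1]) <= max_distance


if __name__ == '__main__':
    # 0.005 degrees of latitude is roughly 556 m, 0.02 degrees roughly 2.2 km
    print(same_station((52.0, 13.0), (52.005, 13.0)))
    print(same_station((52.0, 13.0), (52.02, 13.0)))
```

With the default of 1000.0, the first pair is within the threshold and the second is not.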
-
gwgen.preproc.default_cloud_inventory_config(xstall=True, *args, **kwargs)
Default config for CloudInventory
Parameters: - xstall (bool or str) – If True (default), download the XSTALL file from http://cdiac.ornl.gov/ftp/ndp026c/XSTALL. This file contains some estimates of station longitude and latitude. If False or an empty string, the file is not used; otherwise, a non-empty string is interpreted as the path to a local file
- to_csv (bool) – If True, the data at setup will be written to a csv file
- remove (bool) – If True and the old data file already exists, remove before writing to it
- skip_filtering (bool) – If True, skip the filtering for the correct stations in the datafile
- plot_output (str) – An alternative path to use for the PDF file of the plot
- nc_output (str) – An alternative path (or multiples depending on the task) to use for the netCDF file of the plot data
- project_output (str) – An alternative path to use for the psyplot project file of the plot
- new_project (bool) – If True, a new project will be created even if a file in project_output exists already
- project (str) – The path to a psyplot project file to use for this parameterization
- close (bool) – Close the project at the end
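The three cases described for xstall (True, False or empty string, path string) can be sketched as a small resolver. This is an illustration of the documented semantics only; the function resolve_xstall and its return convention are hypothetical and not part of gwgen.

```python
XSTALL_URL = 'http://cdiac.ornl.gov/ftp/ndp026c/XSTALL'  # URL quoted in the docs above


def resolve_xstall(xstall=True):
    """Map the xstall option to a (mode, location) pair.

    Hypothetical helper: returns ('download', url) for True,
    (None, None) for False or an empty string, and
    ('local', path) for any other value, read as a local path.
    """
    if xstall is True:
        return ('download', XSTALL_URL)
    if not xstall:  # False or ''
        return (None, None)
    return ('local', str(xstall))


if __name__ == '__main__':
    for value in (True, False, '', 'data/XSTALL'):
        print(repr(value), '->', resolve_xstall(value))
```

The path 'data/XSTALL' in the usage lines is a made-up example value.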