gwgen preproc

Preprocess the data

usage: gwgen preproc [-h] {select,cloud,test} ...
Sub-commands:
select

Select stations based upon a regular grid

usage: gwgen preproc select [-h] [-g str] [-og str] [-os str] [-k str]
                            [-ok str] [-gdb str] [-sdb str] [-nc]
                            [-f { 'scratch' | 'file' | 'db' }]
                            [-d { 'single' | 'all' }]
Optional Arguments
-g, --grid The path to a csv-file containing a lat and a lon column with the information on the centers of the grid. If None, `igrid_key` must not be None and must point to a key in the configuration (of the experiment, the project, or the global configuration) that specifies the path
-og, --grid-output
 The path to the csv-file where to store the mapping from grid lat-lon to station id.
-os, --stations-output
 The path to the csv-file where to store the mapping from station to grid center point
-k, --igrid-key
 The key in the configuration where to store the path of the `grid` input file
-ok, --grid-key
 The key in the configuration where to store the name of the `grid_output` file.
-gdb, --grid-db
 The name of a data table in which to store the data of `grid_output`
-sdb, --stations-db
 The name of a data table to store the data for `stations_output` in
-nc=False, --no-prcp-check=False
 If True, we will not check for the values between 0.1 and 1.0 for precipitation and save the result in the ``'best'`` column
-f, --from The setup method for the daily data for the prcp check
-d, --download

Handles how to manage missing files for the prcp check. If None (default), a warning is printed and the file is ignored. If ``'single'``, the missing file is downloaded. If ``'all'``, the entire tarball is downloaded (strongly discouraged for this function)

Possible choices: single, all
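
The three behaviours of `--download` can be read as a small dispatch. The following is a minimal sketch of that decision only: `plan_missing_file` and the action strings it returns are hypothetical names chosen for illustration, not part of gwgen, and nothing is actually downloaded.

```python
import warnings


def plan_missing_file(path, download=None):
    """Return the action to take for a missing daily-data file.

    Hypothetical helper (not part of gwgen); it only mirrors the documented
    behaviour of the ``--download`` option.
    """
    if download is None:
        # Default: print a warning and skip the file.
        warnings.warn(f"{path} is missing and will be ignored")
        return "ignore"
    if download == "single":
        # Fetch only the one missing file.
        return "download-file"
    if download == "all":
        # Fetch the entire tarball (discouraged for this command).
        return "download-tarball"
    raise ValueError(f"unknown download mode: {download!r}")
```

Keeping the decision separate from the download itself makes the default (warn and ignore) easy to check without network access.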

cloud

Extract the inventory of EECRA stations

usage: gwgen preproc cloud [-h] [-mf int] {eecra_inventory,eecra_ghcn_map} ...
Optional Arguments
-mf, --max-files
 The maximum number of files to process during one process. If None, it is determined by the global ``'max_stations'`` key
Sub-commands:
eecra_inventory

Default config for :class:`CloudInventory`

usage: gwgen preproc cloud eecra_inventory [-h] [-ido str]
                                           [-xstall bool or str]
                                           [-f { 'scratch' | 'file' | 'db' | None }]
                                           [-to-csv] [-to-db] [-rm] [-sf]
Optional Arguments
-ido, --other_id
 Use the configuration from another experiment
Setup arguments
-xstall=True If True (default), download the XSTALL file from http://cdiac.ornl.gov/ftp/ndp026c/XSTALL. This file contains some estimates of station longitude and latitude. If ``False`` or an empty string, the file is not used. Any other string is interpreted as the path to a local copy of the file
-f, --from

How to set up the instance. One of:
 ``'scratch'``: Set up the task from the raw data
 ``'file'``: Set up the task from an existing file
 ``'db'``: Set up the task from a database
 ``None``: If the file of this task exists, use it; otherwise, if a database is provided, use that; otherwise, set up from scratch

Possible choices: scratch, file, db

-to-csv=False If True, the data at setup will be written to a csv file
-to-db=False If True, the data at setup will be written into a database
-rm=False, --remove=False
 If True and the old data file already exists, remove it before writing to it
-sf=False, --skip-filtering=False
 If True, skip the filtering for the correct stations in the datafile
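
The fallback order described for `--from` when it is left at ``None`` can be sketched as follows. This is an illustration under assumptions, not gwgen's implementation: `resolve_setup_from`, `data_file` and `has_database` are hypothetical names.

```python
import os


def resolve_setup_from(setup_from, data_file, has_database):
    """Resolve the effective setup method.

    Hypothetical helper (not part of gwgen) mirroring the documented rule:
    an explicit choice wins; otherwise prefer an existing file, then a
    database, then a setup from scratch.
    """
    if setup_from is not None:
        if setup_from not in ("scratch", "file", "db"):
            raise ValueError(f"invalid setup method: {setup_from!r}")
        return setup_from
    if data_file and os.path.exists(data_file):
        return "file"
    if has_database:
        return "db"
    return "scratch"
```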
eecra_ghcn_map

Default config for :class:`CloudGHCNMap`

usage: gwgen preproc cloud eecra_ghcn_map [-h] [-ido str] [-md float]
                                          [-to-csv] [-rm] [-sf]
Optional Arguments
-ido, --other_id
 Use the configuration from another experiment
Setup arguments
-md=1000.0, --max-distance=1000.0
 The maximum distance in meters for which we consider two stations as equal (Default: 1000m)
-to-csv=False If True, the data at setup will be written to a csv file
-rm=False, --remove=False
 If True and the old data file already exists, remove it before writing to it
-sf=False, --skip-filtering=False
 If True, skip the filtering for the correct stations in the datafile
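
To give a feel for what a 1000 m `--max-distance` threshold means for station coordinates, here is a minimal sketch that compares two lat/lon pairs with the haversine formula. How gwgen actually measures the distance is not stated here, so the haversine formula, the Earth radius constant, and the names `haversine_m` and `same_station` are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def same_station(a, b, max_distance=1000.0):
    """Treat two (lat, lon) tuples as one station if within max_distance meters."""
    return haversine_m(a[0], a[1], b[0], b[1]) <= max_distance
```

For example, `same_station((0.0, 0.0), (0.0, 0.005))` compares two points roughly 556 m apart at the equator, so they fall within the default threshold, while a longitude difference of 0.02 degrees (about 2.2 km) does not.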
test

Create a test sample for the given GHCN stations

usage: gwgen preproc test [-h] [-nc] [-re float] [-a] str str or list of str
Required Arguments
str The path to the directory containing the test files from Github
str or list of str
 Either a list of GHCN stations to use or the name of a file containing a 1-row table with GHCN stations
Optional Arguments
-nc=False, --no-cloud=False
 If True, no cloud stations are extracted
-re=0, --reduce-eecra=0
 The percentage by which to reduce the EECRA data
-a=False, --keep-all=False
 If True, all years of the EECRA data are used. Otherwise, only the years with complete temperature and cloud data are kept. Note that this only has an effect if `reduce_eecra` is not 0
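
The usage line of `gwgen preproc test` can be mirrored with a small stand-in `argparse` parser to see how the two positional arguments and the flags combine. This is not gwgen's own parser: the destination names `test_dir` and `stations`, the directory `tests/test_data` and the station ids `ST1` and `ST2` are placeholders chosen for illustration.

```python
import argparse


def build_test_parser():
    """Stand-in parser that mirrors the documented usage line only."""
    parser = argparse.ArgumentParser(prog="gwgen preproc test")
    # Positional arguments (names are hypothetical; the help shows only `str`).
    parser.add_argument("test_dir", help="directory containing the test files")
    parser.add_argument("stations", nargs="+",
                        help="GHCN station ids, or a file listing them")
    # Optional arguments with the documented defaults.
    parser.add_argument("-nc", "--no-cloud", action="store_true")
    parser.add_argument("-re", "--reduce-eecra", type=float, default=0)
    parser.add_argument("-a", "--keep-all", action="store_true")
    return parser


# Placeholder directory and station ids, with options given first.
args = build_test_parser().parse_args(
    ["-re", "50", "-a", "tests/test_data", "ST1", "ST2"])
```

Here `args.reduce_eecra` is `50.0` and `args.keep_all` is `True`, which is the combination in which `--keep-all` is documented to have an effect.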