
module TopoPyScale.sim_fsm

Methods to generate the required simulation files and run simulations of various models using tscale forcing. J. Fiddes, February 2022.

TODO:


function fsm_nlst

fsm_nlst(nconfig, metfile, nave)

Function to generate namelist parameter file that is required to run the FSM model. https://github.com/RichardEssery/FSM

Args:

  • nconfig (int): which FSM configuration to run (integer 1-31)
  • metfile (str): path to the input tscale file. Use a short relative path, as Fortran fails with long strings (max 21 chars?)
  • nave (int): number of forcing steps to average the output over, e.g. if the forcing is hourly and daily output is required, then nave = 24

Returns: None (writes out a namelist text file which configures a single FSM run)

Notes:

Constraint: Fortran fails with long strings (max?). Namelist definition: https://github.com/RichardEssery/FSM/blob/master/nlst_CdP_0506.txt

Example nlst:

    &config
    /
    &drive
      met_file = 'data/met_CdP_0506.txt'
      zT = 1.5
      zvar = .FALSE.
    /
    &params
    /
    &initial
      Tsoil = 282.98 284.17 284.70 284.70
    /
    &outputs
      out_file = 'out_CdP_0506.txt'
    /
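
The namelist layout above can be assembled with plain string formatting. The following is a minimal sketch, not the TopoPyScale implementation: `build_nlst_text` and its `out_file` argument are hypothetical, the group names and fixed values are copied from the example above, and placing `nconfig` in `&config` and `Nave` in `&outputs` is an assumption about FSM's namelist groups.

```python
def build_nlst_text(nconfig, metfile, nave, out_file):
    """Return namelist text modelled on the example nlst (layout is an assumption)."""
    if not 1 <= nconfig <= 31:
        raise ValueError("nconfig must be between 1 and 31")
    lines = [
        "&config",
        f"  nconfig = {nconfig}",      # assumed location of the configuration number
        "/",
        "&drive",
        f"  met_file = '{metfile}'",   # keep this path short and relative
        "  zT = 1.5",
        "  zvar = .FALSE.",
        "/",
        "&params",
        "/",
        "&initial",
        "  Tsoil = 282.98 284.17 284.70 284.70",
        "/",
        "&outputs",
        f"  out_file = '{out_file}'",
        f"  Nave = {nave}",            # assumed location of the averaging length
        "/",
    ]
    return "\n".join(lines) + "\n"
```

Calling `build_nlst_text(31, 'data/met_CdP_0506.txt', 24, 'out_CdP_0506.txt')` returns the text as a string; writing it to disk is left to the caller.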


function txt2ds

txt2ds(fname)

Function to read a single FSM text output file as an xarray dataset

Args:

  • fname (str): filename

Returns: xarray dataset of dimension (time, point_id)


function to_netcdf

to_netcdf(fname_fsm_sim, complevel=9)

Function to convert a single FSM simulation output file (.txt) to a compressed netcdf file (.nc)

Args:

  • fname_fsm_sim (str): filename to convert from txt to nc
  • complevel (int): compression level, 1-9. Default = 9

Returns: None (netCDF file written to disk)


function to_netcdf_parallel

to_netcdf_parallel(
    fsm_sims='./fsm_sims/sim_FSM*.txt',
    n_core=6,
    complevel=9,
    delete_txt_files=False
)

Function to convert FSM simulation output (.txt files) to compressed netcdf. This function parallelizes the jobs over n_core cores.

Args:

  • fsm_sims (str or list): file pattern or list of FSM output files. Note that this must be a short relative path! Default = './fsm_sims/sim_FSM*.txt'
  • n_core (int): number of cores. Default = 6
  • complevel (int): compression level, 1-9. Default = 9
  • delete_txt_files (bool): whether to delete the simulation txt files. Default = False

Returns: None (FSM simulation files written to disk)
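
The "file pattern or list" input and the fan-out over workers can be sketched as below. This is an illustration under stated assumptions, not the library's code: `resolve_inputs`, `convert_all` and the `convert_one` callable are hypothetical names, and a thread pool from the standard library stands in for whatever worker pool the module actually uses.

```python
import glob
from concurrent.futures import ThreadPoolExecutor


def resolve_inputs(fsm_sims):
    """Accept a glob pattern (str) or an explicit list of paths; return a sorted list."""
    if isinstance(fsm_sims, str):
        return sorted(glob.glob(fsm_sims))
    return sorted(fsm_sims)


def convert_all(fsm_sims, convert_one, n_workers=6):
    """Apply convert_one (hypothetical per-file converter) to every resolved file."""
    files = resolve_inputs(fsm_sims)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # list() forces evaluation so that any exception surfaces here
        return list(pool.map(convert_one, files))
```

Results come back in the same order as the sorted input files, which makes it easy to pair each output with its source.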


function to_dataset

to_dataset(fname_pattern='sim_FSM_pt*.txt', fsm_path='./fsm_sims/')

Function to read FSM outputs of one simulation into a single dataset.

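Args:

  • fname_pattern (str): file name pattern of the FSM output files to read. Default = 'sim_FSM_pt*.txt'
  • fsm_path (str): directory containing the FSM output files. Default = './fsm_sims/'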

Returns: dataset (xarray)


function read_pt_fsm

read_pt_fsm(fname)

Function to load FSM simulation output into a pandas dataframe

Args:

  • fname (str): path to simulation file to open

Returns: pandas dataframe
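
For readers without the package at hand, reading a whitespace-separated FSM output file can be sketched with the standard library alone. The column order used here (year, month, day, hour, then the six variables named elsewhere in this module) is an assumption, and `parse_fsm_lines` is a hypothetical helper that returns plain records rather than a pandas dataframe.

```python
from datetime import datetime

# Assumed column order after the four date fields.
VARS = ("alb", "rof", "snd", "swe", "gst", "gt50")


def parse_fsm_lines(lines):
    """Parse whitespace-separated FSM output lines into a list of dict records."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        year, month, day = (int(x) for x in fields[:3])
        hour = int(float(fields[3]))  # the hour may be written as a float
        record = {"time": datetime(year, month, day, hour)}
        values = (float(x) for x in fields[4:4 + len(VARS)])
        record.update(zip(VARS, values))
        records.append(record)
    return records
```

Passing an open file object works as well as a list of strings, since both yield one line at a time.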


function fsm_sim_parallel

fsm_sim_parallel(
    fsm_input='outputs/FSM_pt*.txt',
    fsm_nconfig=31,
    fsm_nave=24,
    fsm_exec='./FSM',
    n_core=6,
    n_thread=100,
    delete_nlst_files=True
)

Function to run parallelised simulations of FSM

Args:

  • fsm_input (str or list): file pattern or list of input files for FSM. Note that this must be a short relative path! Default = 'outputs/FSM_pt*.txt'
  • fsm_nconfig (int): FSM configuration number. See the FSM README.md: https://github.com/RichardEssery/FSM. Default = 31
  • fsm_nave (int): number of timesteps to average for outputs, e.g. if the input is hourly and fsm_nave=24, outputs will be daily. Default = 24
  • fsm_exec (str): path to the FSM executable. Default = './FSM'
  • n_core (int): number of cores. Default = 6
  • n_thread (int): number of threads used when creating simulation configuration files. Default = 100
  • delete_nlst_files (bool): whether to delete the simulation configuration files. Default = True

Returns: None (FSM simulation files written to disk)
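
The averaging length follows directly from the two time steps: hourly forcing with daily output gives 24 / 1 = 24. A small sketch of that arithmetic, with a hypothetical helper name and integer hours assumed for both steps:

```python
def steps_per_output(forcing_step_hours, output_step_hours):
    """Number of forcing steps averaged into one output step (the fsm_nave value)."""
    if output_step_hours % forcing_step_hours != 0:
        raise ValueError("output step must be a whole multiple of the forcing step")
    return output_step_hours // forcing_step_hours
```

For 3-hourly forcing and daily output the same helper gives 8.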


function fsm_sim

fsm_sim(nlstfile, fsm_exec, delete_nlst_files=True)

Function to simulate the FSM model https://github.com/RichardEssery/FSM

Args:

  • nlstfile (str): path to the namelist file that configures the FSM run
  • fsm_exec (str): path to the FSM executable
  • delete_nlst_files (bool): whether to delete the namelist file after the run. Default = True

Returns: None (FSM simulation file written to disk)
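
A single run can be sketched with `subprocess`. This assumes that FSM reads its namelist from standard input (as in `./FSM < nlst.txt`); `run_with_namelist` is a hypothetical helper and takes the command as a list so that any executable can be substituted.

```python
import os
import subprocess


def run_with_namelist(cmd, nlstfile, delete_nlst_file=True):
    """Run cmd with the namelist file on standard input; optionally remove the file."""
    with open(nlstfile, "rb") as nlst:
        # check=True raises CalledProcessError if the executable exits with an error
        result = subprocess.run(cmd, stdin=nlst, capture_output=True, check=True)
    if delete_nlst_file:
        os.remove(nlstfile)  # only reached when the run succeeded
    return result
```

A call such as `run_with_namelist(['./FSM'], 'nlst_pt_00.txt')` (file name illustrative) keeps the namelist on disk if the run fails, which helps with debugging.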


function agg_by_var_fsm

agg_by_var_fsm(var='snd', fsm_path='./fsm_sims')

Function to make single-variable, multi-cluster files as a preprocessing step before spatialisation. This is much more efficient than looping over individual simulation files per cluster. For V variables, C clusters and T timesteps, this turns C individual files of dimensions V x T into V individual files of dimensions C x T.

Currently written for FSM files but could be generalised to other models.

Args:

  • var (str): column name of the variable to extract, one of: alb, rof, snd, swe, gst, gt50
      alb - albedo
      rof - runoff
      snd - snow height (m)
      swe - snow water equivalent (mm)
      gst - ground surface temp (10cm depth) degC
      gt50 - ground temperature (50cm depth) degC
  • fsm_path (str): location of simulation files

Returns: dataframe
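
The regrouping described above, from one table per cluster to one table per variable, can be illustrated with nested dictionaries. This is a toy sketch of the reshaping idea only; `regroup_by_variable` and the in-memory layout are hypothetical and do not reflect how the module stores its data.

```python
def regroup_by_variable(sims):
    """Turn {cluster: {var: [values over time]}} into {var: {cluster: [values over time]}}."""
    by_var = {}
    for cluster_id, variables in sims.items():
        for var, series in variables.items():
            # create the per-variable table on first sight, then add this cluster's series
            by_var.setdefault(var, {})[cluster_id] = list(series)
    return by_var
```

After regrouping, extracting one variable for all clusters is a single lookup instead of a pass over every cluster file.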


function agg_by_var_fsm_ensemble

agg_by_var_fsm_ensemble(var='snd', W=1)

Function to make single-variable, multi-cluster files as a preprocessing step before spatialisation. This is much more efficient than looping over individual simulation files per cluster. For V variables, C clusters and T timesteps, this turns C individual files of dimensions V x T into V individual files of dimensions C x T.

Currently written for FSM files but could be generalised to other models.

Args:

  • var (str): column name of the variable to extract, one of: alb, rof, hs, swe, gst, gt50
      alb - albedo
      rof - runoff
      hs - snow height (m)
      swe - snow water equivalent (mm)
      gst - ground surface temp (10cm depth) degC
      gt50 - ground temperature (50cm depth) degC
  • W: weight vector from PBS

Returns: dataframe

ncol:
    4 = rof
    5 = hs
    6 = swe
    7 = gst


function timeseries_means_period

timeseries_means_period(df, start_date, end_date)

Function to extract results vectors from simulation results. This can be the entire time period, some subset, or a single day.

Args:

  • df (dataframe): results df
  • start_date (str): start date of average (can be used to extract single day) '2020-01-01'
  • end_date (str): end date of average (can be used to extract single day) '2020-01-03'

Returns:

  • dataframe: averaged dataframe
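
The period average can be sketched without pandas as follows. Treating both `start_date` and `end_date` as inclusive is an assumption, and `mean_over_period` with its dictionary input (ISO date string mapped to one value per cluster) is a hypothetical stand-in for the results dataframe.

```python
from datetime import date


def mean_over_period(series_by_day, start_date, end_date):
    """Average each cluster's values over the inclusive range start_date..end_date."""
    start, end = date.fromisoformat(start_date), date.fromisoformat(end_date)
    selected = [row for day, row in series_by_day.items()
                if start <= date.fromisoformat(day) <= end]
    if not selected:
        raise ValueError("no rows fall inside the requested period")
    n_clusters = len(selected[0])
    # one mean per cluster (column), taken over the selected days (rows)
    return [sum(row[c] for row in selected) / len(selected) for c in range(n_clusters)]
```

Setting `start_date` equal to `end_date` extracts a single day, matching the use described in the argument list.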

function topo_map

topo_map(df_mean, mydtype, outname='outputmap.tif')

Function to map results to toposub clusters generating map results.

Args:

  • df_mean (dataframe): an array of values to map to the DEM, with the same length as the number of toposub clusters
  • mydtype (str): HS (and maybe GST) needs "float32"; other variables can use "int16" to save space
  • outname (str): name of the output file. Default = 'outputmap.tif'

Here's an approach for arbitrary reclassification of integer rasters that avoids using a million calls to np.where. Rasterio bits taken from @Aaron's answer: https://gis.stackexchange.com/questions/163007/raster-reclassify-using-python-gdal-and-numpy
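
The reclassification idea mentioned above amounts to using the cluster-id raster as an index into a lookup array. The sketch below shows that technique in isolation; it assumes cluster ids run from 0 to C-1 and are stored as integers in the raster, and the names `map_clusters_to_grid` and `landform` are illustrative rather than taken from the module. Reading and writing the GeoTIFF is omitted.

```python
import numpy as np


def map_clusters_to_grid(cluster_values, landform, dtype="float32"):
    """Fill a grid with per-cluster values via integer-array indexing (no np.where loop)."""
    lookup = np.asarray(cluster_values, dtype=dtype)  # position in the array = cluster id
    return lookup[landform]                           # result has the shape of landform


landform = np.array([[0, 1], [2, 1]])                 # toy cluster-id raster
grid = map_clusters_to_grid([0.5, 1.5, 2.5], landform)
```

One indexing operation replaces a loop of one `np.where` call per cluster, so the cost no longer grows with the number of clusters.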


function topo_map_headless

topo_map_headless(df_mean, mydtype, outname='outputmap.tif')

Headless server version of the function to map results to toposub clusters, generating map results.

Args:

  • df_mean (dataframe): an array of values to map to the DEM, with the same length as the number of toposub clusters
  • mydtype (str): HS (and maybe GST) needs "float32"; other variables can use "int16" to save space
  • outname (str): name of the output file. Default = 'outputmap.tif'

Here's an approach for arbitrary reclassification of integer rasters that avoids using a million calls to np.where. Rasterio bits taken from @Aaron's answer: https://gis.stackexchange.com/questions/163007/raster-reclassify-using-python-gdal-and-numpy


function topo_map_forcing

topo_map_forcing(ds_var, n_decimals=2, dtype='float32', new_res=None)

Function to map forcing to toposub clusters generating gridded forcings

Args:

  • ds_var: single variable of ds, e.g. mp.downscaled_pts.t
  • n_decimals (int): number of decimals to round the variable to. Default = 2
  • dtype (str): dtype used to export the raster. Default = 'float32'
  • new_res (float): optional resolution to resample the output to (in units of the projection)

Returns:

  • grid_stack: stack of grids with dimension Time x Y x X

Here's an approach for arbitrary reclassification of integer rasters that avoids using a million calls to np.where. Rasterio bits taken from @Aaron's answer: https://gis.stackexchange.com/questions/163007/raster-reclassify-using-python-gdal-and-numpy
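
For a time series, the same indexing extends along a leading time axis and yields the Time x Y x X stack. The following is a sketch under assumptions: the input is a (time, cluster) array, cluster ids are 0-based integers in the raster, and `map_series_to_stack` is a hypothetical name. Resampling to `new_res` is not shown.

```python
import numpy as np


def map_series_to_stack(values_tc, landform, n_decimals=2, dtype="float32"):
    """Map a (time, cluster) array onto a cluster-id raster, giving (time, y, x)."""
    rounded = np.round(np.asarray(values_tc, dtype="float64"), n_decimals)
    # indexing the cluster axis with a 2D integer array expands it to (y, x)
    return rounded[:, landform].astype(dtype)
```

Rounding before the cast keeps the stored values at the requested precision, which tends to help compression when the stack is written out.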


function topo_map_sim

topo_map_sim(ds_var, n_decimals=2, dtype='float32', new_res=None)

Function to map sim results to toposub clusters generating gridded results

Args:

  • ds_var: single variable of ds, e.g. mp.downscaled_pts.t
  • n_decimals (int): number of decimals to round the variable to. Default = 2
  • dtype (str): dtype used to export the raster. Default = 'float32'
  • new_res (float): optional resolution to resample the output to (in units of the projection)

Returns:

  • grid_stack: stack of grids with dimension Time x Y x X

Here's an approach for arbitrary reclassification of integer rasters that avoids using a million calls to np.where. Rasterio bits taken from @Aaron's answer: https://gis.stackexchange.com/questions/163007/raster-reclassify-using-python-gdal-and-numpy


function write_ncdf

write_ncdf(
    wdir,
    grid_stack,
    var,
    units,
    epsg,
    res,
    mytime,
    lats,
    lons,
    mydtype,
    newfile,
    outname=None
)

function agg_stats

agg_stats(df)

function climatology

climatology(HSdf, fsm_path)

function climatology_plot

climatology_plot(
    var,
    mytitle,
    HSdf_daily_median,
    HSdf_daily_quantiles,
    HSdf_realtime=None,
    plot_show=False
)

function climatology_plot2

climatology_plot2(
    mytitle,
    HSdf_daily_median,
    HSdf_daily_quantiles,
    HSdf_realtime=None,
    plot_show=False
)

function concat_fsm

concat_fsm(mydir)

A small routine to concatenate FSM results from separate years into a single file. This is mainly needed when a big job is split into years for parallel processing on, e.g., a cluster.

rsync -avz --include="" --include="/" --exclude="" @:/path/to/remote/directory/
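
Joining per-year result files can be sketched as a sorted append. This assumes that the yearly files sort chronologically by name and share the same column layout; `concat_text_files` and the file names in the usage note are hypothetical.

```python
import glob


def concat_text_files(pattern, out_path):
    """Append the files matching pattern, in sorted name order, into out_path."""
    parts = sorted(glob.glob(pattern))  # resolved before out_path is created
    with open(out_path, "w") as out:
        for part in parts:
            with open(part) as src:
                for line in src:
                    # make sure the last line of one file does not merge with the next
                    out.write(line if line.endswith("\n") else line + "\n")
    return parts
```

For example, `concat_text_files('sims/sim_*.txt', 'sims/all_years.out')` would join every matching file; choosing an output name that does not match the pattern avoids re-reading an earlier result.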


This file was automatically generated via lazydocs.