module TopoPyScale.topo_da
Data assimilation methods including ensemble simulation, satelite data retrival and processing and data assimilation algorithms J. Fiddes, March 2022
TODO:
- method to mosaic (gdal) multiple tiles before cropping
function lognormDraws_kris
Function to generate n random draws from a log normal distribution
Args:
n: integer number of draws to makemed: median of distribritions: standard deviation of distribution
Returns: Vector of draws
function normDraws
Function to generate n random draws from a normal distribution
Args:
n: integer number of draws to makem: mean of distribritions: standard deviation of distribution
Returns: Vector of draws
function ensemble_pars_gen
Function to generate perturbabion parameters to create an ensemble of meteorological forcings
Args:
n: size of the ensemble (integer)
Returns: array of perturbation factors of dimension n
function ensemble_meteo_gen
Function to perturb downscaled meteorological timeseries
Args:
ds: downscaled_ptsperturb: matrix of perturbation parametersN_ens: ensemble number (integer)ensemb_type: string describing which meteo variables to perturb: "T" = Tair, "P" = Precip, "S" = Swin, "L" = Lwin
Returns: ds:
function construct_HX
Function to generate HX matrix of ensembleSamples x day
Args:
wdir: working directoryDAyear: year of data assimilation yyyystartDA: start of data assimilation window yyyy-mm-ddendDA: end of data assimilation window yyyy-mm-dd
Returns:
ensembleRes: 2D pandas dataframe with dims cols (samples * ensembles) and rows (days / simulation timesteps)
function modis_tile
Function to retrieve MODIS tile indexes based on a lon/lat bbox
DEPRECATED - DEPRECATED! Does not conside rectangle v rhombus spatial footprint intersection.
use get_modis_tile_list
Args:
Returns:
function get_modis_wrapper
Function to iterate through several MODIS tile downloads.
Args:
Returns:
function get_modis_tile_list
Retrieves a list of MODIS tiles that intersect with a given bounding box.
Parameters:
- bbox_n (float): North latitude of the bounding box.
- bbox_s (float): South latitude of the bounding box.
- bbox_e (float): East longitude of the bounding box.
- bbox_w (float): West longitude of the bounding box.
Returns:
- list: List of MODIS tiles (e.g., ['h23v04', 'h24v05']) that intersect with the bounding box.
Usage: tiles_to_get = get_modis_tile_list( 42.305,41.223, 69.950, 71.981)
function pymodis_download
function projFromLandform
function process_modis
Function to convert hdf to tiff, extract layer, and reproject
Args:
modpath: path to 'modis' directorylayer: layer to extract from hdfepsg: reprojected target epsg codebbox: output bounds as list [minX, minY, maxX, maxY] in target
Returns: Spatially subsetted reprojected tiffs of data layer specified (still NDSI)
Docs: https://nsidc.org/sites/nsidc.org/files/technical-references/C6.1_MODIS_Snow_User_Guide.pdf
function process_modis2
Function to convert hdf to tiff, extract layer, and reproject
Args:
modpath: path to 'modis' directorylayer: layer to extract from hdfepsg: reprojected target epsg codebbox: output bounds as list [minX, minY, maxX, maxY] in target
Returns: Spatially subsetted reprojected tiffs of data layer specified (still NDSI)
Docs: https://nsidc.org/sites/nsidc.org/files/technical-references/C6.1_MODIS_Snow_User_Guide.pdf
function extract_fsca_timeseries
Function to extract mean NDSI from domain and converts to fSCA using the empirical formula based on landsat https://modis-snow-ice.gsfc.nasa.gov/uploads/C6_MODIS_Snow_User_Guide.pdf pp.22
Args:
wdir: working directoryplot: booleen whether to plot dataframe
Returns:
df_sort: dataframe of julian dates and fSCA values
function particle_batch_smoother
function EnKA
EnKA: Implmentation of the Ensemble Kalman Analysis Inputs: prior: Prior ensemble matrix (n x N array) obs: Observation vector (m x 1 array) pred: Predicted observation ensemble matrix (m x N array) alpha: Observation error inflation coefficient (scalar) R: Observation error covariance matrix (m x m array, m x 1 array, or scalar) Outputs: post: Posterior ensemble matrix (n x N array) Dimensions: N is the number of ensemble members, n is the number of state variables and/or parameters, and m is the number of boservations.
For now, we have impelemnted the classic (stochastic) version of the Ensemble Kalman analysis that involves perturbing the observations. Deterministic variants (Sakov and Oke, 2008) also exist that may be worth looking into. This analysis step can be used both for filtering, i.e. Ensemble Kalman filter, when observations are assimilated sequentially or smoothing, i.e. Ensemble (Kalman) smoother, when also be carried out in a multiple data assimilation (i.e. iterative) approach.
Since the author rusty in python it is possible that this code can be improved considerably. Below are also some examples of further developments that may be worth looking into:
To do/try list: (i) Using a Bessel correction (Ne-1 normalization) for sample covariances. (ii) Perturbing the predicted observations rather than the observations following van Leeuwen (2020, doi: 10.1002/qj.3819) which is the formally when defining the covariances matrices. (iii) Looking into Sect. 4.5.3. in Evensen (2019, doi: 10.1007/s10596-019-9819-z) (iv) Using a deterministic rather than stochastic analysis step. (v) Augmenting with artificial momentum to get some inertia. (vi) Better understand/build on the underlying tempering/annealing idea in ES-MDA. (vii) Versions that may scale better for large n such as those working in the ensemble subspace (see e.g. formulations in Evensen, 2003, Ocean Dynamics).
Code by: K. Aalstad (10.12.2020) based on an earlier Matlab version.
function PBS
PBS: Implmentation of the Particle Batch Smoother Inputs: obs: Observation vector (m x 1 array) pred: Predicted observation ensemble matrix (m x N array) R: Observation error covariance 'matrix' (m x 1 array, or scalar) Outputs: w: Posterior weights (N x 1 array) Dimensions: N is the number of ensemble members and m is the number of boservations.
Here we have implemented the particle batch smoother, which is a batch-smoother version of the particle filter (i.e. a particle filter without resampling), described in Margulis et al. (2015, doi: 10.1175/JHM-D-14-0177.1). As such, this routine can also be used for particle filtering with sequential data assimilation. This scheme is obtained by using a particle (mixture of Dirac delta functions) representation of the prior and applying this directly to Bayes theorem. In other words, it is just an application of importance sampling with the prior as the proposal density (importance distribution). It is also the same as the Generalized Likelihood Uncertainty Estimation (GLUE) technique (with a formal Gaussian likelihood) which is widely used in hydrology.
This particular scheme assumes that the observation errors are additive Gaussian white noise (uncorrelated in time and space). The "logsumexp" trick is used to deal with potential numerical issues with floating point operations that occur when dealing with ratios of likelihoods that vary by many orders of magnitude.
To do/try list: (i) Implement an option for correlated observation errors if R is specified as a matrix (this would slow down this function). (ii) Consier other likelihoods and observation models. (iii) Look into iterative versions of importance sampling.
Code by: K. Aalstad (14.12.2020) based on an earlier Matlab version.
function da_plots
Function to extract mean fSSCA from domain
Args:
HX1: HX2:
- weights: weights output from PBS (dims 1 x NENS)
- dates: datetime object
- myobs: timeseries of observtions
Returns: plot
function fsca_plots
Function to plot side by side observed fSCA 500m [0-1] and modelled SCA 30 m [0,1]
Args:
wdir: sim directoryplotDay: date to plot as YYYY-mm-dddf_mean: timeseries mean object given by sim.timeseries_means_period()
Returns: plot
Usage: df = sim.agg_by_var_fsm(6) df2 = sim.agg_by_var_fsm_ensemble(6, W) df_mean = sim.timeseries_means_period(df, "2018-06-01", "2018-06-01") da.fsca_plots("/home/joel/sim/topoPyscale_davos/", "2018150", df_mean)
TODO:
- create modelled fSCA option from agreggated 30m to 500m pixels
- accept datetime object instead of julday.
function da_compare_plot
Function to plot side by side observed fSCA 500m [0-1] and modelled SCA 30 m [0,1]
Args:
wdir: sim directoryplotDay: date to plot as YYYY-mm-dddf_mean: timeseries mean object given by sim.timeseries_means_period()
Returns: plot
Usage: df = sim.agg_by_var_fsm(6) df2 = sim.agg_by_var_fsm_ensemble(6, W) df_mean = sim.timeseries_means_period(df, "2018-06-01", "2018-06-01") da.fsca_plots("/home/joel/sim/topoPyscale_davos/", "2018150", df_mean)
TODO:
- create modelled fSCA option from agreggated 30m to 500m pixels
- accept datetime object instead of julday.
function build_modis_cube
function getModisbbox
Function to find obtain bbox of modis pixel CRS is projected UTM
Args:
Npixel: pixel nuber as integer of 0 - (xdim x ydim) (the flattened grid) eg for a 10 x 10 grid Npixel would have a range of 0-100arr: xarray DataArray with dimensions "x" and "y" and attribute "res"
Returns:
bbox: bbox of pixel as list [xmax,xmin, ymax, ymin]
This file was automatically generated via lazydocs.