`module` `TopoPyScale.topo_sub`

Clustering routines for TopoSUB

S. Filhol, Oct 2021

TODO:

explore other clustering methods available in scikit-learn: https://scikit-learn.org/stable/modules/clustering.html
look into DBSCAN and its relatives

`function` `ds_to_indexed_dataframe`

ds_to_indexed_dataframe(ds)

Function to convert dataset to dataframe

See definition of function in topo_utils.py

Args:

ds (dataset): xarray dataset of N 2D DataArrays

Returns:

`function` `scale_df`

scale_df(
    df_param,
    scaler=StandardScaler(),
    features={'x': 1, 'y': 1, 'elevation': 4, 'slope': 1, 'aspect_cos': 1, 'aspect_sin': 1, 'svf': 1}
)

Function to scale features of a pandas dataframe

Args:

df_param (dataframe): features to scale
scaler (scaler object): Default is StandardScaler()
features (dict): dictionary of features to use as predictors with their respective importance, e.g. {'x':1, 'y':1}

Returns:

dataframe: scaled data
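
To illustrate how the `features` weights could act on scaled data, here is a minimal sketch. It assumes the weights are applied by multiplying each standardized column by its importance. `weighted_scale` is a hypothetical helper written for this example, not the TopoPyScale implementation, and the sample data are synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def weighted_scale(df, features):
    """Hypothetical helper: standardize the selected columns, then apply weights."""
    cols = list(features)
    scaler = StandardScaler()
    scaled = scaler.fit_transform(df[cols].to_numpy())
    df_scaled = pd.DataFrame(scaled, columns=cols, index=df.index)
    # Assumption: a feature's importance is a simple multiplicative weight
    for col, weight in features.items():
        df_scaled[col] = df_scaled[col] * weight
    return df_scaled, scaler


# Synthetic terrain parameters, for illustration only
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "elevation": rng.uniform(500, 3000, 50),
    "slope": rng.uniform(0, 45, 50),
})
df_scaled, scaler = weighted_scale(df, {"elevation": 4, "slope": 1})
```

With this convention, a feature with weight 4 spreads four times wider than a feature with weight 1, so it contributes more to the Euclidean distances used by K-means.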

`function` `inverse_scale_df`

inverse_scale_df(
    df_scaled,
    scaler,
    features={'x': 1, 'y': 1, 'elevation': 4, 'slope': 1, 'aspect_cos': 1, 'aspect_sin': 1, 'svf': 1}
)

Function to invert the feature scaling of a pandas dataframe

Args:

df_scaled (dataframe): scaled data to transform back to original (inverse transform)
scaler (scaler object): original scikit-learn scaler
features (dict): dictionary of features to use as predictors with their respective importance, e.g. {'x':1, 'y':1}

Returns:

dataframe: data in original format
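
The round trip can be sketched as follows, assuming that scaling multiplies standardized columns by their weights, so that the inverse divides by the weights before calling the scaler's `inverse_transform`. The variable names and synthetic data are illustrative only and do not come from the library.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

features = {"elevation": 4, "slope": 1}
cols = list(features)
weights = np.array([features[c] for c in cols], dtype=float)

# Synthetic input, for illustration only
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "elevation": rng.uniform(500, 3000, 20),
    "slope": rng.uniform(0, 45, 20),
})

# Forward: standardize, then weight (assumed convention)
scaler = StandardScaler().fit(df[cols].to_numpy())
df_scaled = pd.DataFrame(
    scaler.transform(df[cols].to_numpy()) * weights,
    columns=cols,
    index=df.index,
)

# Inverse: remove the weights first, then undo the standardization
restored = scaler.inverse_transform(df_scaled[cols].to_numpy() / weights)
df_restored = pd.DataFrame(restored, columns=cols, index=df.index)
```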

`function` `kmeans_clustering`

kmeans_clustering(
    df_param,
    n_clusters=100,
    features={'x': 1, 'y': 1, 'elevation': 4, 'slope': 1, 'aspect_cos': 1, 'aspect_sin': 1, 'svf': 1},
    seed=None,
    **kwargs
)

Function to perform K-means clustering

Args:

df_param (dataframe): features
features (dict): dictionary of features to use as predictors with their respective importance, e.g. {'x':1, 'y':1}
n_clusters (int): number of clusters
seed (int): None or int for random seed generator

kwargs:

Returns:

dataframe: df_centers
kmean object: kmeans
dataframe: df_param
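
The three return values correspond to what a fitted scikit-learn `KMeans` model exposes. The sketch below shows that relationship on synthetic, already-scaled data. The column name `cluster_labels` and the variable names are assumptions made for the example, not confirmed details of the library.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic, already-scaled features (illustration only)
rng = np.random.default_rng(42)
df_scaled = pd.DataFrame(
    rng.normal(size=(200, 3)),
    columns=["elevation", "slope", "svf"],
)

# Fit K-means; random_state plays the role of the `seed` argument
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(df_scaled.to_numpy())

# Centroids: one row per cluster, in scaled feature space
df_centers = pd.DataFrame(kmeans.cluster_centers_, columns=df_scaled.columns)

# Input data with the cluster label of each sample attached
df_labeled = df_scaled.copy()
df_labeled["cluster_labels"] = kmeans.labels_
```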

`function` `minibatch_kmeans_clustering`

minibatch_kmeans_clustering(
    df_param,
    n_clusters=100,
    features={'x': 1, 'y': 1, 'elevation': 4, 'slope': 1, 'aspect_cos': 1, 'aspect_sin': 1, 'svf': 1},
    n_cores=4,
    seed=None,
    **kwargs
)

Function to perform mini-batch K-means clustering

Args:

df_param (dataframe): features
n_clusters (int): number of clusters
features (dict): dictionary of features to use as predictors with their respective importance, e.g. {'x':1, 'y':1}
n_cores (int): number of processor cores

kwargs:

Returns:

dataframe: centroids
kmean object: kmean model
dataframe: labels of input data
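
A hedged sketch of mini-batch K-means with scikit-learn follows. The scikit-learn documentation suggests a `batch_size` larger than 256 times the number of cores to use all cores. Tying `batch_size` to `n_cores` this way is an assumption for the example and may not match what this function does internally.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Synthetic, already-scaled feature matrix (illustration only)
rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 3))

n_cores = 4
batch_size = 256 * n_cores  # assumption: batch size derived from core count

model = MiniBatchKMeans(
    n_clusters=8,
    batch_size=batch_size,
    n_init=3,
    random_state=0,
).fit(X)

centroids = model.cluster_centers_  # shape (n_clusters, n_features)
labels = model.labels_              # one cluster id per input row
```

Mini-batch K-means updates centroids from random subsets of the data, which trades a small loss in cluster quality for much shorter run times on large DEMs.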

`function` `search_number_of_clusters`

search_number_of_clusters(
    df_param,
    method='minibatchkmean',
    cluster_range=array([100, 300, 500, 700, 900]),
    features={'x': 1, 'y': 1, 'elevation': 4, 'slope': 1, 'aspect_cos': 1, 'aspect_sin': 1, 'svf': 1},
    scaler_type=StandardScaler(),
    scaler=None,
    seed=2,
    plot=True
)

Function to help identify an optimum number of clusters using the elbow method

Args:

df_param (dataframe): pandas dataframe containing input variable to the clustering method
method (str): method for clustering. Currently available: ['minibatchkmean', 'kmeans']
cluster_range (array int): array of number of clusters to derive scores for
features (dict): dictionary of features to use as predictors with their respective importance, e.g. {'x':1, 'y':1}
scaler_type (scikit_learn obj): type of scaler to use: e.g. StandardScaler() or RobustScaler()
scaler (scikit_learn obj): scaler already fitted to the dataset. Implies that df_param is already scaled
seed (int): random seed for kmeans clustering
plot (bool): plot results or not

Returns:

dataframe: WCSS (within-cluster sum of squares) score, Davies-Bouldin score, Calinski-Harabasz score
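
The three scores can be computed with scikit-learn as sketched below. This is an illustration of the elbow-method idea on synthetic blobs, not the function's actual code. The column names and the small cluster range are assumptions chosen to keep the example quick.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score, davies_bouldin_score

# Synthetic data with four true clusters (illustration only)
X, _ = make_blobs(n_samples=300, centers=4, n_features=3, random_state=0)

rows = []
for k in [2, 4, 6]:
    model = KMeans(n_clusters=k, n_init=10, random_state=2).fit(X)
    rows.append({
        "n_clusters": k,
        "wcss": model.inertia_,  # within-cluster sum of squares
        "davies_bouldin": davies_bouldin_score(X, model.labels_),
        "calinski_harabasz": calinski_harabasz_score(X, model.labels_),
    })

df_scores = pd.DataFrame(rows).set_index("n_clusters")
```

WCSS decreases as clusters are added, so the "elbow" is the point where the decrease flattens. A lower Davies-Bouldin score and a higher Calinski-Harabasz score both indicate better-separated clusters.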

`function` `plot_center_clusters`

plot_center_clusters(
    dem_file,
    ds_param,
    df_centers,
    var='elevation',
    cmap=<matplotlib.colors.ListedColormap object at 0x7f669fe9c8e0>,
    figsize=(14, 10)
)

Function to plot the location of the cluster centroids over the DEM

Args:

dem_file (str): path to dem raster file
ds_param (dataset): topo_param parameters ['elev', 'slope', 'aspect_cos', 'aspect_sin', 'svf']
df_centers (dataframe): containing cluster centroid parameters ['x', 'y', 'elev', 'slope', 'aspect_cos', 'aspect_sin', 'svf']
var (str): variable to plot as background
cmap (pyplot cmap): pyplot colormap to represent the variable.

`function` `write_landform`

write_landform(
    dem_file,
    df_param,
    project_directory='./',
    out_dir: Optional[str, Path] = None,
    out_name: Optional[str] = None
) → Union[str, Path]

Function to write a landform file which maps cluster ids to dem pixels

Args:

dem_file (str): path to dem raster file
ds_param (dataset): topo_param parameters ['elev', 'slope', 'aspect_cos', 'aspect_sin', 'svf']

This file was automatically generated via lazydocs.
