hypertools.describe

hypertools.describe(x, reduce='IncrementalPCA', max_dims=None, show=True, format_data=True)

Create a plot describing covariance as a function of the number of dimensions

This function correlates the raw data with reduced data to get a sense of how well the data can be summarized with n dimensions. Useful for evaluating the quality of dimensionality-reduced plots.
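As a rough illustration of the idea, the sketch below reduces a random matrix with plain PCA and, for each number of dimensions, correlates the pairwise distances in the reduced data with those in the raw data. This is one possible reading of "correlates the raw data with reduced data", not the library's implementation; the helper names (`pairwise_distances`, `pca_reduce`, `correlation_by_dims`) and the choice of pairwise distances are assumptions made for this example.

```python
import numpy as np

def pairwise_distances(x):
    # Euclidean distance between every pair of rows, returned as a flat
    # vector holding the upper triangle (each pair counted once).
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(x.shape[0], k=1)
    return dists[upper]

def pca_reduce(x, ndims):
    # Plain PCA via SVD of the mean-centered data (stand-in for the
    # scikit-learn models the function accepts).
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:ndims].T

def correlation_by_dims(x, max_dims):
    # One correlation value per number of retained dimensions.
    full = pairwise_distances(x)
    corrs = []
    for ndims in range(1, max_dims + 1):
        reduced = pairwise_distances(pca_reduce(x, ndims))
        corrs.append(float(np.corrcoef(full, reduced)[0, 1]))
    return corrs

rng = np.random.default_rng(0)
x = rng.normal(size=(40, 6))   # 40 observations, 6 features
corrs = correlation_by_dims(x, max_dims=6)
```

Keeping all six dimensions only rotates the data, so the last value is 1 up to floating-point error; smaller values for fewer dimensions indicate how much structure the reduction discards.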

Parameters
x : Numpy array, DataFrame or list of arrays/dfs

A Numpy array, a Pandas DataFrame, or a list of Numpy arrays or Pandas DataFrames

reduce : str or dict

Decomposition/manifold learning model to use. Models supported: PCA, IncrementalPCA, SparsePCA, MiniBatchSparsePCA, KernelPCA, FastICA, FactorAnalysis, TruncatedSVD, DictionaryLearning, MiniBatchDictionaryLearning, TSNE, Isomap, SpectralEmbedding, LocallyLinearEmbedding, and MDS. Can be passed as a string, but for finer control of the model parameters, pass as a dictionary, e.g. reduce={'model': 'PCA', 'params': {'whiten': True}}. See the scikit-learn docs for each model for details on the parameters it supports.

max_dims : int

Maximum number of dimensions to consider

show : bool

Plot the result (default: True)

format_data : bool

Whether or not to first call the format_data function (default: True).
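The two accepted forms of the reduce argument (a model name, or a dictionary with 'model' and 'params' keys) can be assembled as in the sketch below. `build_describe_kwargs` and `run_describe` are hypothetical helpers written for this example, not part of hypertools; only the keyword names come from the signature above.

```python
def build_describe_kwargs(model="PCA", params=None, max_dims=None):
    """Assemble keyword arguments for hypertools.describe (hypothetical helper)."""
    # A bare string selects the model with its default parameters; the dict
    # form passes extra parameters through to the underlying model.
    reduce = model if not params else {"model": model, "params": dict(params)}
    return {
        "reduce": reduce,
        "max_dims": max_dims,
        "show": False,        # return the results without plotting
        "format_data": True,
    }

def run_describe(x, **kwargs):
    # Imported lazily so the helper above works without hypertools installed.
    import hypertools as hyp
    return hyp.describe(x, **kwargs)

simple = build_describe_kwargs("IncrementalPCA")
detailed = build_describe_kwargs("PCA", params={"whiten": True}, max_dims=5)
```

With hypertools installed, `run_describe(data, **detailed)` would forward these arguments to `hypertools.describe`, where `data` is any input of the types listed for x.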

Returns
result : dict

A dictionary with the analysis results. 'average' is the correlation by number of components for all data. 'individual' is a list of lists, one per input array, each holding the correlation by number of components for that array.
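A minimal sketch of reading the returned dictionary follows. The `result` values here are invented placeholders with the documented shape, not output from a real run, and `first_index_reaching` is a hypothetical helper. How a list index maps to a number of components is not stated above, so the sketch reports indices only.

```python
# Placeholder with the documented structure; the numbers are invented.
result = {
    "average": [0.62, 0.81, 0.93, 0.98],
    "individual": [
        [0.60, 0.80, 0.92, 0.98],
        [0.64, 0.82, 0.94, 0.98],
    ],
}

def first_index_reaching(corrs, threshold):
    """Return the index of the first correlation >= threshold, or None."""
    for i, value in enumerate(corrs):
        if value >= threshold:
            return i
    return None

# Index at which the pooled data first reaches a correlation of 0.9.
idx_avg = first_index_reaching(result["average"], 0.9)

# The same check for each input array separately.
idx_each = [first_index_reaching(c, 0.9) for c in result["individual"]]
```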