hypertools.describe

hypertools.describe(x, reduce='IncrementalPCA', max_dims=None, show=True, format_data=True)

Create a plot describing covariance as a function of the number of dimensions.

This function correlates the raw data with the reduced data to show how well the data can be summarized with n dimensions. It is useful for evaluating the quality of dimensionality-reduced plots.
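The idea can be sketched without hypertools. The snippet below is a minimal illustration, not the library's implementation: it assumes plain PCA (computed with NumPy's SVD) as the reduction, and it assumes the comparison is a Pearson correlation between the pairwise row distances of the raw and the reduced data. The helper names `pairwise_distances`, `pca_reduce`, and `correlation_by_dims` are hypothetical, and the input is random data.

```python
import numpy as np

def pairwise_distances(a):
    # Euclidean distance between every pair of rows
    diff = a[:, None, :] - a[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pca_reduce(x, n_dims):
    # Project the centered data onto its top n_dims principal axes
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_dims].T

def correlation_by_dims(x, max_dims=None):
    # Assumption: default to the smaller of (samples, features)
    if max_dims is None:
        max_dims = min(x.shape)
    # Compare only the upper triangle, since distance matrices are symmetric
    upper = np.triu_indices(x.shape[0], k=1)
    raw = pairwise_distances(x)[upper]
    corrs = []
    for n in range(1, max_dims + 1):
        reduced = pairwise_distances(pca_reduce(x, n))[upper]
        corrs.append(np.corrcoef(raw, reduced)[0, 1])
    return corrs

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 10))
corrs = correlation_by_dims(data)
print(corrs)
```

With this sketch, the correlation reaches 1 once all principal axes are kept, because a full PCA projection preserves pairwise distances. A curve that rises quickly suggests that a few dimensions already capture most of the structure.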

Parameters

x (Numpy array, DataFrame or list of arrays/dfs): A Numpy array, a Pandas DataFrame, or a list of either.
reduce (str or dict): Decomposition/manifold learning model to use. Models supported: PCA, IncrementalPCA, SparsePCA, MiniBatchSparsePCA, KernelPCA, FastICA, FactorAnalysis, TruncatedSVD, DictionaryLearning, MiniBatchDictionaryLearning, TSNE, Isomap, SpectralEmbedding, LocallyLinearEmbedding, and MDS. Can be passed as a string, but for finer control of the model parameters, pass a dictionary, e.g. reduce={'model': 'PCA', 'params': {'whiten': True}}. See the scikit-learn documentation for each model for details on the parameters it supports.
max_dims (int): Maximum number of dimensions to consider.
show (bool): Plot the result (default: True).
format_data (bool): Whether or not to first call the format_data function (default: True).

Returns

result (dict): A dictionary with the analysis results. 'average' is the correlation by number of components for all data. 'individual' is a list of lists, where each inner list is the correlation-by-number-of-components vector for one input dataset.
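A call that uses the dictionary form of `reduce` and suppresses the plot might look like the sketch below. It assumes hypertools is installed; the wrapper name `describe_without_plot`, the choice of `max_dims=5`, and the random input arrays are illustrative only. The import sits inside the function so that the snippet can be defined even where hypertools is not available.

```python
import numpy as np

def describe_without_plot(arrays, max_dims=5):
    """Run hypertools.describe with a dictionary-style `reduce` and no plot."""
    # Imported lazily: this line requires hypertools to be installed
    import hypertools as hyp

    # Dictionary form of `reduce`: model name plus scikit-learn parameters
    reduce_spec = {'model': 'PCA', 'params': {'whiten': True}}
    return hyp.describe(arrays, reduce=reduce_spec,
                        max_dims=max_dims, show=False)

# Two made-up datasets with the same number of features
rng = np.random.default_rng(0)
arrays = [rng.normal(size=(100, 10)), rng.normal(size=(100, 10))]
```

Calling `result = describe_without_plot(arrays)` should return the dictionary described above, so `result['average']` holds the correlations for all data and `result['individual']` is expected to hold one list per input array.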