hypertools.load

hypertools.load(dataset, reduce=None, ndims=None, align=None, normalize=None, *, legacy=False)

Load a .geo file or example data

Parameters:

dataset : string

The name of the dataset to load. Can be a .geo file or one of the example datasets listed below.

weights is a list of 2 numpy arrays, each containing average brain activity (fMRI) from 18 subjects listening to the same story, fit using Hierarchical Topographic Factor Analysis (HTFA) with 100 nodes. The rows are fMRI measurements and the columns are parameters of the model.

weights_sample is a sample of 3 subjects from that dataset.

weights_avg is the dataset split in half and averaged into two groups.

spiral is a numpy array containing data for a 3D spiral, used to demonstrate the procrustes function.

mushrooms is a numpy array comprising features (columns) of a collection of 8,124 mushroom samples (rows).

sotus is a collection of State of the Union speeches from 1989-2018.

wiki is a collection of Wikipedia pages used to fit wiki-model.

wiki-model is a sklearn Pipeline (CountVectorizer->LatentDirichletAllocation) trained on a sample of Wikipedia articles. It can be used to transform text to topic vectors.

normalize : str or False or None

If set to ‘across’, the columns of the input data will be z-scored across lists (default). That is, the z-scores will be computed with respect to column n across all arrays passed in the list. If set to ‘within’, the columns will be z-scored within each list that is passed. If set to ‘row’, each row of the input data will be z-scored. If set to False, the input data will be returned with no z-scoring.
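The normalization modes described above can be illustrated with a small NumPy sketch. The function name normalize_sketch and the sample arrays are made up for illustration. This is not hypertools' implementation, only one reading of the description, and it assumes every array in the list has the same number of columns.

```python
import numpy as np


def normalize_sketch(arrays, mode="across"):
    """Illustrative z-scoring of a list of 2D arrays (hypothetical helper)."""
    if mode == "across":
        # Column statistics are pooled over every array in the list.
        stacked = np.vstack(arrays)
        mean, std = stacked.mean(axis=0), stacked.std(axis=0)
        return [(a - mean) / std for a in arrays]
    if mode == "within":
        # Column statistics are computed separately for each array.
        return [(a - a.mean(axis=0)) / a.std(axis=0) for a in arrays]
    if mode == "row":
        # Each row is z-scored on its own.
        return [
            (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
            for a in arrays
        ]
    if mode is False:
        # No z-scoring: return the data unchanged.
        return list(arrays)
    raise ValueError(f"unknown mode: {mode!r}")


# Two made-up arrays with different means and spreads.
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(50, 3))
b = rng.normal(5.0, 2.0, size=(50, 3))

across = normalize_sketch([a, b], "across")
within = normalize_sketch([a, b], "within")
by_row = normalize_sketch([a, b], "row")
```

With ‘across’, only the pooled columns have zero mean, so differences between the arrays are preserved. With ‘within’, each array is centered separately and those differences are removed.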

reduce : str or dict

Decomposition/manifold learning model to use. Models supported: PCA, IncrementalPCA, SparsePCA, MiniBatchSparsePCA, KernelPCA, FastICA, FactorAnalysis, TruncatedSVD, DictionaryLearning, MiniBatchDictionaryLearning, TSNE, Isomap, SpectralEmbedding, LocallyLinearEmbedding, and MDS. Can be passed as a string, but for finer control of the model parameters, pass as a dictionary, e.g. reduce={'model': 'PCA', 'params': {'whiten': True}}. See the scikit-learn documentation for each model for details on the parameters it supports.
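The two accepted forms, a plain string or a dictionary with ‘model’ and ‘params’ keys, carry the same information. The sketch below shows one way such a specification could be split into a model name and a parameter dictionary. The helper split_model_spec is hypothetical and is not part of hypertools.

```python
def split_model_spec(spec):
    """Return (model_name, params) for a str or dict spec (hypothetical helper)."""
    if isinstance(spec, str):
        # A bare string names the model and implies default parameters.
        return spec, {}
    if isinstance(spec, dict):
        # A dict names the model under 'model' and may carry 'params'.
        return spec["model"], dict(spec.get("params", {}))
    raise TypeError("spec must be a str or a dict")


simple = split_model_spec("PCA")
detailed = split_model_spec({"model": "PCA", "params": {"whiten": True}})
```

Here simple is the pair ("PCA", {}) and detailed is the pair ("PCA", {"whiten": True}). The same string-or-dictionary pattern applies to the align parameter.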

ndims : int

Number of dimensions to reduce the data to

align : str or dict

If str, either ‘hyper’ or ‘SRM’. If ‘hyper’, the alignment algorithm will be hyperalignment. If ‘SRM’, the alignment algorithm will be the shared response model. You can also pass a dictionary for finer control, where the ‘model’ key is a string that specifies the model and the ‘params’ key is a dictionary of parameter values (default: ‘hyper’).

legacy : bool

Pass legacy=True to load DataGeometry objects created with hypertools<0.8.0

Returns:

data : Numpy Array

Example data
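A minimal usage sketch follows. It uses only the keyword arguments from the signature above. The wrapper load_reduced_example and the LOAD_KWARGS dictionary are illustrative names, and the sketch assumes hypertools is installed and the example data can be fetched. The exact type of the returned object may vary between hypertools versions.

```python
# Keyword arguments taken from the documented signature; the values are examples.
LOAD_KWARGS = {
    "reduce": {"model": "PCA", "params": {"whiten": True}},
    "ndims": 3,
    "align": "hyper",
    "normalize": "across",
}


def load_reduced_example(name="weights_sample"):
    """Load an example dataset with reduction and alignment applied (illustrative)."""
    # Imported lazily so that defining this function does not require hypertools.
    import hypertools as hyp

    return hyp.load(name, **LOAD_KWARGS)
```

Calling load_reduced_example() would request the weights_sample data, reduced to 3 dimensions with PCA and aligned with hyperalignment.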