Dimensionality reduction
========================

The ``reduce`` function reduces the dimensionality of an array or list of
arrays. By default it uses Principal Component Analysis (PCA) to reduce the
data to three dimensions, but a variety of models are supported, and users
may specify a number of dimensions other than three.

Supported models include: PCA, IncrementalPCA, SparsePCA,
MiniBatchSparsePCA, KernelPCA, FastICA, FactorAnalysis, TruncatedSVD,
DictionaryLearning, MiniBatchDictionaryLearning, TSNE, Isomap,
SpectralEmbedding, LocallyLinearEmbedding, MDS and UMAP.

Import Hypertools
-----------------

.. code:: ipython3

    import hypertools as hyp

    %matplotlib inline

Load your data
--------------

First, we'll load one of the sample datasets. This dataset is a list of 2
``numpy`` arrays, each containing average brain activity (fMRI) from 18
subjects listening to the same story, fit using Hierarchical Topographic
Factor Analysis (HTFA) with 100 nodes. The rows are timepoints and the
columns are fMRI components.

See the `full dataset `__ or the `HTFA article `__ for more info on the data
and HTFA, respectively.

.. code:: ipython3

    geo = hyp.load('weights_avg')
    weights = geo.get_data()

Reduce one array
----------------

Let's look at one array from the dataset above.

.. code:: ipython3

    print('Array shape: (%d, %d)' % weights[0].shape)

.. parsed-literal::

    Array shape: (100, 100)

To reduce this array, pass it to ``hyp.reduce``, as below. The data is
reduced from 100 features to 3 features (the default when the desired number
of features is not specified).

.. code:: ipython3

    # A single input array yields a single reduced array
    reduced_array = hyp.reduce(weights[0])
    print('Reduced array shape: (%d, %d)' % reduced_array.shape)

.. parsed-literal::

    Reduced array shape: (100, 3)

Reduce list of arrays
---------------------

A list or numpy array of multiple arrays can also be reduced into a common
space. That is, the data are combined, reduced as a whole, then split back
into individual arrays and returned by ``hyp.reduce``. Here we show this with
the two arrays in the weights dataset: we reduce both arrays at once (by
passing in the whole of the weights data) and examine the shapes of the
results.

.. code:: ipython3

    reduced_arrays = hyp.reduce(weights)
    print('Shape of first reduced array: ', reduced_arrays[0].shape)
    print('Shape of second reduced array: ', reduced_arrays[1].shape)

.. parsed-literal::

    Shape of first reduced array:  (100, 3)
    Shape of second reduced array:  (100, 3)

Each array has been reduced from 100 features to 3 features (the default
when the desired number of features is not specified), with the number of
datapoints unchanged.

Reduce list of arrays (TSNE)
----------------------------

You can also use a different reduction method. In the example below, we
reduce multiple arrays at once using TSNE. The data is reduced to three
dimensions (the default when the desired number of features is not
specified).

.. code:: ipython3

    reduced_TSNE = hyp.reduce(weights, reduce='TSNE')
    print('Shape of first reduced array: ', reduced_TSNE[0].shape)
    print('Shape of second reduced array: ', reduced_TSNE[1].shape)

.. parsed-literal::

    Shape of first reduced array:  (100, 3)
    Shape of second reduced array:  (100, 3)

Reduce to specified number of dimensions
----------------------------------------

You may prefer to reduce to a specific number of features rather than
defaulting to three dimensions. To do so, pass the desired number of
features (as an int) to the ``ndims`` argument, as below.

.. code:: ipython3

    reduced_4 = hyp.reduce(weights, ndims=4)
    print('Shape of first reduced array: ', reduced_4[0].shape)
    print('Shape of second reduced array: ', reduced_4[1].shape)

.. parsed-literal::

    Shape of first reduced array:  (100, 4)
    Shape of second reduced array:  (100, 4)

Reduce list of arrays with specific parameters
----------------------------------------------

For finer control, the ``reduce`` argument also accepts a dictionary that
names the reduction method (``'model'``) and gives a dictionary of model
parameters (``'params'``). See the `scikit-learn `__ model docs for details
on the parameters supported by each model.

Supported models include: PCA, IncrementalPCA, SparsePCA,
MiniBatchSparsePCA, KernelPCA, FastICA, FactorAnalysis, TruncatedSVD,
DictionaryLearning, MiniBatchDictionaryLearning, TSNE, Isomap,
SpectralEmbedding, LocallyLinearEmbedding, and MDS.

The example below reduces to the default of three features, since the
desired number of features is not specified.

.. code:: ipython3

    reduced_params = hyp.reduce(weights, reduce={'model' : 'PCA', 'params' : {'whiten' : True}})
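
To make the "combine, reduce as a whole, then split back" idea concrete, here
is a minimal sketch that uses scikit-learn's ``PCA`` directly with
``whiten=True``. It is an illustration under stated assumptions, not the
``hyp.reduce`` implementation: the data are synthetic random arrays standing
in for the weights dataset, and the names ``arrays``, ``stacked`` and
``split_points`` are hypothetical. Any extra processing that ``hyp.reduce``
may perform is not reproduced here.

.. code:: ipython3

    import numpy as np
    from sklearn.decomposition import PCA

    # Synthetic stand-in for the weights data (assumption): two arrays,
    # each with 100 rows (timepoints) and 100 columns (features)
    rng = np.random.default_rng(0)
    arrays = [rng.normal(size=(100, 100)) for _ in range(2)]

    # Combine: stack all arrays row-wise so one model is fit on everything
    stacked = np.vstack(arrays)

    # Reduce as a whole: whitened PCA down to three components
    model = PCA(n_components=3, whiten=True)
    reduced = model.fit_transform(stacked)

    # Split back: cut the reduced matrix at the original row boundaries
    split_points = np.cumsum([a.shape[0] for a in arrays])[:-1]
    reduced_arrays = np.split(reduced, split_points)

    for i, arr in enumerate(reduced_arrays):
        print('Shape of reduced array %d: %s' % (i, arr.shape))

Because a single model is fit on the stacked data, the reduced arrays share
the same three axes and can be compared directly.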