hypertools.plot

hypertools.plot(x, fmt='-', marker=None, markers=None, linestyle=None, linestyles=None, color=None, colors=None, palette='hls', group=None, hue=None, labels=None, legend=None, title=None, size=None, elev=10, azim=-60, ndims=3, model=None, model_params=None, reduce='IncrementalPCA', cluster=None, align=None, normalize=None, n_clusters=None, save_path=None, animate=False, duration=30, tail_duration=2, rotations=2, zoom=1, chemtrails=False, precog=False, bullettime=False, frame_rate=50, explore=False, show=True, transform=None, vectorizer='CountVectorizer', semantic='LatentDirichletAllocation', corpus='wiki', ax=None)

Plots dimensionality-reduced data and parses plot arguments.

Parameters
x : Numpy array, DataFrame, String, Geo or mixed list

Data for the plot. The form should be samples (rows) by features (cols).

fmt : str or list of strings

A list of format strings. All matplotlib format strings are supported.

linestyle(s) : str or list of str

A list of line styles

marker(s) : str or list of str

A list of marker types

color(s) : str or list of str

A list of colors

palette : str

A matplotlib or seaborn color palette

group : str/int/float or list

A list of group labels. Length must match the number of rows in your dataset. If the data type is numerical, the values will be mapped to RGB values in the specified palette. If the data type is strings, the points will be labeled categorically. To label a subset of points, use None for the unlabeled points (e.g. [‘a’, None, ‘b’, ‘a’]).
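A minimal sketch of the two kinds of group labels described above. The random data, the variable names, and the helper plot_grouped are illustrative assumptions, not part of the original docs; the helper imports hypertools lazily and only forwards arguments listed in the signature.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(6, 10))  # 6 samples (rows) by 10 features (cols)

# Categorical labels: one entry per row; None leaves that point unlabeled.
categorical_group = ['a', None, 'b', 'a', 'b', None]

# Numerical labels: one value per row, to be mapped onto the palette.
numerical_group = list(np.linspace(0.0, 1.0, data.shape[0]))


def plot_grouped(x, group, palette='hls'):
    """Hypothetical helper: plot x as points colored by group."""
    import hypertools as hyp  # imported lazily so the sketch loads without it
    return hyp.plot(x, '.', group=group, palette=palette, show=False)
```

Calling plot_grouped(data, categorical_group) or plot_grouped(data, numerical_group) would then pass either list straight through to the group argument.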

labels : list

A list of labels, one for each point. Its length must match the number of samples (rows) in the data (x). If no label is wanted for a particular point, pass None for that entry.

legend : list or bool

If set to True, legend is implicitly computed from data. Passing a list will add string labels to the legend (one for each list item).

title : str

A title for the plot

size : list

A list of [width, height] in inches to resize the figure

normalize : str or False

If set to ‘across’, the columns of the input data will be z-scored across lists. If set to ‘within’, the columns will be z-scored within each list that is passed. If set to ‘row’, each row of the input data will be z-scored. If set to False or None, the input data will be returned without normalization (default: None, per the signature above).
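To make the three z-scoring modes concrete, here is an illustrative NumPy re-implementation. It is a sketch of what the description says, not the HyperTools code itself; normalize_sketch and zscore are hypothetical names, and the sketch assumes no column or row has zero variance.

```python
import numpy as np


def zscore(a, axis=0):
    """Z-score along an axis (assumes nonzero variance along that axis)."""
    return (a - a.mean(axis=axis, keepdims=True)) / a.std(axis=axis, keepdims=True)


def normalize_sketch(arrays, mode):
    """Illustrative version of the documented modes (not hypertools code)."""
    if mode == 'across':
        # Column statistics come from all arrays stacked together.
        stacked = np.vstack(arrays)
        mean, std = stacked.mean(axis=0), stacked.std(axis=0)
        return [(a - mean) / std for a in arrays]
    if mode == 'within':
        # Column statistics are computed separately for each array.
        return [zscore(a, axis=0) for a in arrays]
    if mode == 'row':
        # Each row (sample) is z-scored on its own.
        return [zscore(a, axis=1) for a in arrays]
    if mode is None or mode is False:
        return list(arrays)
    raise ValueError(f"unknown mode: {mode!r}")


rng = np.random.default_rng(0)
arrays = [rng.normal(loc=0, size=(20, 4)), rng.normal(loc=5, size=(30, 4))]
across = normalize_sketch(arrays, 'across')
within = normalize_sketch(arrays, 'within')
rows = normalize_sketch(arrays, 'row')
```

With ‘across’, the offset between the two arrays survives normalization; with ‘within’, each array is centered on its own, so that offset is removed.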

reduce : str or dict

Decomposition/manifold learning model to use. Models supported: PCA, IncrementalPCA, SparsePCA, MiniBatchSparsePCA, KernelPCA, FastICA, FactorAnalysis, TruncatedSVD, DictionaryLearning, MiniBatchDictionaryLearning, TSNE, Isomap, SpectralEmbedding, LocallyLinearEmbedding, and MDS. Can be passed as a string, but for finer control of the model parameters, pass as a dictionary, e.g. reduce={‘model’ : ‘PCA’, ‘params’ : {‘whiten’ : True}}. See scikit-learn specific model docs for details on parameters supported for each model.
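The string and dictionary forms can be compared side by side. In this sketch, split_model_spec is a hypothetical helper written only to show how the two forms relate; it is not a HyperTools function.

```python
# String form: use the named model with its default parameters.
reduce_as_string = 'PCA'

# Dictionary form: name the model and pass scikit-learn parameters for it.
reduce_as_dict = {'model': 'PCA', 'params': {'whiten': True}}


def split_model_spec(spec):
    """Hypothetical helper: return (model_name, params) for either form."""
    if isinstance(spec, str):
        return spec, {}
    return spec['model'], dict(spec.get('params', {}))


name_a, params_a = split_model_spec(reduce_as_string)
name_b, params_b = split_model_spec(reduce_as_dict)
```

Either value can be passed as the reduce argument, for example reduce=reduce_as_dict.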

ndims : int

The number of dimensions to reduce the data x to (default in the signature above: 3). If ndims > 3, the data will be plotted in 3 dimensions but the higher-dimensional data will be returned. If ndims is None, the data will be plotted in 3 dimensions and returned with their original number of dimensions, possibly normalized and/or aligned according to the normalize/align kwargs.

align : str or dict or False/None

If str, either ‘hyper’ or ‘SRM’. If ‘hyper’, the alignment algorithm will be hyperalignment. If ‘SRM’, the alignment algorithm will be the shared response model. You can also pass a dictionary for finer control, where the ‘model’ key is a string that specifies the model and the ‘params’ key is a dictionary of parameter values (default : ‘hyper’).

cluster : str or dict or False/None

If cluster is passed, HyperTools will perform clustering using the specified clustering model. Supported algorithms are: KMeans, MiniBatchKMeans, AgglomerativeClustering, Birch, FeatureAgglomeration, SpectralClustering and HDBSCAN (default: None). Can be passed as a string, but for finer control of the model parameters, pass as a dictionary, e.g. cluster={‘model’ : ‘KMeans’, ‘params’ : {‘max_iter’ : 100}}. See scikit-learn specific model docs for details on parameters supported for each model. If no parameters are specified, a default set of parameters will be used.

n_clusters : int

If n_clusters is passed, HyperTools will perform k-means clustering with the k parameter set to n_clusters. The resulting clusters will be plotted in different colors according to the color palette.
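The n_clusters shorthand and the dictionary form of cluster can express the same request. The sketch below assumes that entries under ‘params’ are handed to the scikit-learn model, as the description of cluster suggests; plot_clusters is a hypothetical wrapper that imports hypertools lazily.

```python
# Shorthand: k-means with k set by n_clusters.
kmeans_shorthand = {'n_clusters': 3}

# Explicit form: the same k, plus one extra scikit-learn KMeans parameter.
kmeans_explicit = {
    'cluster': {'model': 'KMeans',
                'params': {'n_clusters': 3, 'max_iter': 100}},
}


def plot_clusters(x, **cluster_kwargs):
    """Hypothetical wrapper: plot x as points colored by cluster."""
    import hypertools as hyp  # imported lazily so the sketch loads without it
    return hyp.plot(x, '.', show=False, **cluster_kwargs)
```

For example, plot_clusters(data, **kmeans_shorthand) and plot_clusters(data, **kmeans_explicit) would both request three k-means clusters.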

save_path : str

Path to save the image/movie. Must include the file extension in the save path (e.g. save_path=’/path/to/file/image.png’). NOTE: If saving an animation, FFMPEG must be installed (this is a matplotlib requirement). FFMPEG can be installed on a Mac via Homebrew (brew install ffmpeg) or on Linux via apt-get (apt-get install ffmpeg). If you don’t have Homebrew (Mac only), you can install it like this: /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)".
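Both requirements above (a file extension, and FFMPEG for animations) can be checked before plotting. The helper check_save_path below is a hypothetical pre-flight check using only the Python standard library; it is not part of HyperTools.

```python
import shutil
from pathlib import Path


def check_save_path(save_path, animate=False):
    """Hypothetical check; returns a list of problems (empty if none found)."""
    problems = []
    if Path(save_path).suffix == '':
        problems.append('save_path has no file extension')
    # Animations are written through matplotlib, which needs ffmpeg on PATH.
    if animate and shutil.which('ffmpeg') is None:
        problems.append('ffmpeg was not found on PATH')
    return problems


static_problems = check_save_path('/path/to/file/image.png')
missing_ext_problems = check_save_path('/path/to/file/image')
```

For a static image only the extension is checked; passing animate=True adds the FFMPEG lookup.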

animate : bool, ‘parallel’ or ‘spin’

If True or ‘parallel’, plots the data as an animated trajectory, with each dataset plotted simultaneously. If ‘spin’, all the data is plotted at once but the camera spins around the plot (default: False).

duration (animation only) : float

Length of the animation in seconds (default: 30 seconds)

tail_duration (animation only) : float

Sets the length of the tail of the data (default: 2 seconds)

rotations (animation only) : float

Number of rotations around the box (default: 2)

zoom (animation only) : float

How far to zoom into the plot. Positive numbers zoom in (default: 1).

chemtrails (animation only) : bool

A low-opacity trail is left behind the trajectory (default: False).

precog (animation only) : bool

A low-opacity trail is plotted ahead of the trajectory (default: False).

bullettime (animation only) : bool

A low-opacity trail is plotted both ahead of and behind the trajectory (default: False).

frame_rate (animation only) : int or float

Frame rate for animation (default: 50)
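The timing parameters are related by simple arithmetic. This sketch assumes only that the number of rendered frames is the duration multiplied by the frame rate; how HyperTools distributes the data across those frames internally is not covered here.

```python
duration = 30       # seconds (documented default)
tail_duration = 2   # seconds (documented default)
frame_rate = 50     # frames per second (documented default)

total_frames = int(duration * frame_rate)       # frames in the whole animation
tail_frames = int(tail_duration * frame_rate)   # frames spanned by the tail
frame_interval_ms = 1000 / frame_rate           # time between frames

print(total_frames, tail_frames, frame_interval_ms)  # 1500 100 20.0
```

Halving frame_rate to 25 would halve the frame count (and the rendering work) while keeping the animation 30 seconds long.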

explore : bool

Displays user-defined labels on hover. If no labels are passed, the point index and coordinate will be shown. To use, set explore=True. Note: Explore mode is currently only supported for 3D static plots, and is an experimental feature (i.e. it may not yet work properly).

show : bool

If set to False, the figure will not be displayed, but the figure, axis and data objects will still be returned (default: True).

transform : list of numpy arrays or None

Previously transformed data. If this is set, the transformation steps are bypassed (default : None).

vectorizer : str, dict, class or class instance

The vectorizer to use. Built-in options are ‘CountVectorizer’ or ‘TfidfVectorizer’. To change default parameters, set to a dictionary e.g. {‘model’ : ‘CountVectorizer’, ‘params’ : {‘max_features’ : 10}}. See http://scikit-learn.org/stable/modules/classes.html#module-sklearn.feature_extraction.text for details. You can also specify your own vectorizer model as a class, or class instance. With either option, the class must have a fit_transform method (see here: http://scikit-learn.org/stable/data_transforms.html). If a class, pass any parameters as a dictionary to vectorizer_params. If a class instance, no parameters can be passed.

semantic : str, dict, class or class instance

Text model to use to transform text data. Built-in options are ‘LatentDirichletAllocation’ or ‘NMF’ (default: LDA). To change default parameters, set to a dictionary e.g. {‘model’ : ‘NMF’, ‘params’ : {‘n_components’ : 10}}. See http://scikit-learn.org/stable/modules/classes.html#module-sklearn.decomposition for details on the two model options. You can also specify your own text model as a class, or class instance. With either option, the class must have a fit_transform method (see here: http://scikit-learn.org/stable/data_transforms.html). If a class, pass any parameters as a dictionary to text_params. If a class instance, no parameters can be passed.
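The only stated requirement for a custom vectorizer or text model is a fit_transform method. The sketch below shows that interface check with a toy class; ToyCountVectorizer and has_fit_transform are hypothetical names, and the toy class illustrates the interface only (a real model would normally return a matrix as scikit-learn models do).

```python
class ToyCountVectorizer:
    """Toy stand-in for a custom model: exposes only fit_transform."""

    def fit_transform(self, documents):
        # Build a sorted vocabulary, then count each word per document.
        vocab = sorted({w for d in documents for w in d.lower().split()})
        return [[d.lower().split().count(w) for w in vocab] for d in documents]


def has_fit_transform(model):
    """True if a class or a class instance exposes a callable fit_transform."""
    return callable(getattr(model, 'fit_transform', None))


counts = ToyCountVectorizer().fit_transform(['a b', 'b b'])
```

The check accepts both ToyCountVectorizer (a class) and ToyCountVectorizer() (an instance), matching the two options described above.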

corpus : list (or list of lists) of text samples or ‘wiki’, ‘nips’, ‘sotus’

Text to use to fit the semantic model (optional). If set to ‘wiki’, ‘nips’ or ‘sotus’ and the default semantic and vectorizer models are used, a pretrained model will be loaded which can save a lot of time.

ax : matplotlib.Axes

Axis handle on which to plot the figure

Returns
geo : hypertools.DataGeometry

A new data geometry object
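A short sketch of using the returned object without displaying a figure. It assumes hypertools is installed and that the DataGeometry object exposes the transformed data as xform_data; reduce_without_showing and the random data are hypothetical, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 10))  # 50 samples (rows) by 10 features (cols)


def reduce_without_showing(x, ndims=2):
    """Hypothetical helper: run the plot pipeline but keep the figure hidden."""
    import hypertools as hyp  # imported lazily so the sketch loads without it
    geo = hyp.plot(x, '.', ndims=ndims, show=False)
    # Assumption: xform_data holds the reduced data on the returned object.
    return geo.xform_data
```

Calling reduce_without_showing(data) would return the reduced data while show=False keeps the figure from being displayed.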