hypertools.cluster
- hypertools.cluster(x, cluster='KMeans', n_clusters=3, ndims=None, format_data=True)
Performs clustering analysis and returns a list of cluster labels.
- Parameters:
- x : NumPy array, pandas DataFrame, or list of arrays/DataFrames
The data to be clustered. You can pass a single array/DataFrame or a list of them. If a list is passed, the arrays are stacked and the clustering is performed across the combined data (i.e., not within each list element).
- cluster : str or dict
Model used to discover clusters. Supported algorithms are: KMeans, MiniBatchKMeans, AgglomerativeClustering, Birch, FeatureAgglomeration, SpectralClustering and HDBSCAN (default: KMeans). Can be passed as a string; for finer control of the model parameters, pass a dictionary, e.g. cluster={'model': 'KMeans', 'params': {'max_iter': 100}}. See the scikit-learn documentation for each model for details on the parameters it supports.
- n_clusters : int
Number of clusters to discover. Not required for HDBSCAN.
- format_data : bool
Whether or not to first call the format_data function (default: True).
- ndims : None
Deprecated argument. Please use the new analyze function to perform combinations of transformations.
- Returns:
- cluster_labels : list
A list of cluster labels.