{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Normalizing your features\n\nOften times its useful to normalize (z-score) you features before plotting, so\nthat they are on the same scale. Otherwise, some features will be weighted more\nheavily than others when doing PCA, and that may or may not be what you want.\nThe `normalize` kwarg can be passed to the plot function. If `normalize` is\nset to 'across', the zscore will be computed for the column across all of the\nlists passed. Conversely, if `normalize` is set to 'within', the z-score will\nbe computed separately for each column in each list. Finally, if `normalize` is\nset to 'row', each row of the matrix will be zscored. Alternatively, you can use\nthe normalize function found in tools (see the third example).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Code source: Andrew Heusser\n# License: MIT\n\n# import\nimport hypertools as hyp\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n# simulate data\ncluster1 = np.random.multivariate_normal(np.zeros(3), np.eye(3), size=100)\ncluster2 = np.random.multivariate_normal(np.zeros(3)+10, np.eye(3), size=100)\ndata = [cluster1, cluster2]\n\n# plot normalized across lists\nhyp.plot(data, '.', normalize='across', title='Normalized across datasets')\n\n# plot normalized within list\nhyp.plot(data, '.', normalize='within', title='Normalized within dataset')\n\n# normalize by row\nnormalized_row = hyp.normalize(data, normalize='row')\n\n# plot normalized by row\nhyp.plot(normalized_row, '.', title='Normalized across row')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.4" } }, "nbformat": 4, "nbformat_minor": 0 }