This article discusses how to speed up Pandas operations by vectorizing them with NumPy. It covers alternatives to the apply() method on larger dataframes and gives examples of NumPy's lesser-known functions, such as np.where and np.select, for handling complex if/then/else conditions efficiently.
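As a minimal sketch of the idea, the snippet below replaces a row-wise if/then/else with np.where (two branches) and np.select (several branches). The dataframe, the column names (score, passed, band), and the thresholds are hypothetical and chosen only for illustration; they do not come from the article.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column names and thresholds are illustrative only.
df = pd.DataFrame({"score": [35, 58, 72, 91]})

# Two-way branch: np.where(condition, value_if_true, value_if_false)
# evaluates the whole column at once instead of calling a function per row.
df["passed"] = np.where(df["score"] >= 60, "yes", "no")

# Multi-way branch: np.select takes parallel lists of conditions and choices.
# For each row it uses the choice of the first condition that is True,
# and falls back to `default` when none match.
conditions = [df["score"] >= 90, df["score"] >= 60]
choices = ["high", "medium"]
df["band"] = np.select(conditions, choices, default="low")

print(df)
```

Because np.select uses the first matching condition, the conditions should be listed from most to least specific, much as the branches of an if/elif chain would be.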
>>> from sklearn.neighbors import NearestCentroid
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = NearestCentroid()
>>> clf.fit(X, y)
NearestCentroid()
>>> print(clf.predict([[-0.8, -1]]))
[1]