8.1.5. sklearn.cluster.MeanShift
- class sklearn.cluster.MeanShift(bandwidth=None, seeds=None, bin_seeding=False, cluster_all=True)
MeanShift clustering
Parameters : bandwidth: float, optional :
Bandwidth used in the RBF kernel. If not set, the bandwidth is estimated; see clustering.estimate_bandwidth.
seeds: array [n_samples, n_features], optional :
Seeds used to initialize kernels. If not set, the seeds are calculated by clustering.get_bin_seeds with bandwidth as the grid size and default values for other parameters.
cluster_all: boolean, default True :
If true, then all points are clustered, even those orphans that are not within any kernel. Orphans are assigned to the nearest kernel. If false, then orphans are given cluster label -1.
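The effect of cluster_all can be sketched on a toy data set. This is a minimal illustration, assuming a recent scikit-learn in which MeanShift is importable from sklearn.cluster; the data, the bandwidth value and the explicit seeds are made up for the example. Seeds are passed explicitly so that the far-away point does not start a kernel of its own.

```python
import numpy as np
from sklearn.cluster import MeanShift

# Two tight groups plus one far-away point (illustrative data only).
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [50.0, 50.0],
])

# One seed per group, so no kernel is started at the far-away point.
seeds = np.array([[0.0, 0.0], [5.0, 5.0]])

# cluster_all=True: the orphan is assigned to the nearest kernel.
ms_all = MeanShift(bandwidth=1.0, seeds=seeds, cluster_all=True)
ms_all.fit(X)
labels_all = ms_all.labels_

# cluster_all=False: the orphan keeps the label -1.
ms_orphans = MeanShift(bandwidth=1.0, seeds=seeds, cluster_all=False)
ms_orphans.fit(X)
labels_orphans = ms_orphans.labels_

print(labels_all)
print(labels_orphans)
```

The integer assigned to each group is not something to rely on; only the grouping itself and the -1 marker for orphans are meaningful.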
Notes
Scalability:
Because this implementation uses a flat kernel and a Ball Tree to look up members of each kernel, the complexity tends to O(T*n*log(n)) in lower dimensions, with n the number of samples and T the number of points. In higher dimensions the complexity will tend towards O(T*n^2).
Scalability can be improved by using fewer seeds, for example by using a higher value of min_bin_freq in the get_bin_seeds function.
Note that the estimate_bandwidth function is much less scalable than the mean shift algorithm and will be the bottleneck if it is used.
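The two scalability notes above can be sketched as follows. This is a hedged example, assuming a recent scikit-learn where estimate_bandwidth and get_bin_seeds are importable from sklearn.cluster and where estimate_bandwidth accepts n_samples to work on a subsample; the synthetic data and all numeric values are illustrative choices, not recommendations.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth, get_bin_seeds

# Synthetic data: two Gaussian blobs (illustrative only).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(200, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.3, size=(200, 2)),
])

# Estimate the bandwidth on a subsample to limit the cost of this step.
bandwidth = estimate_bandwidth(X, quantile=0.3, n_samples=100, random_state=0)

# A higher min_bin_freq discards sparsely populated bins, leaving fewer seeds.
seeds_loose = get_bin_seeds(X, bin_size=bandwidth, min_bin_freq=1)
seeds_strict = get_bin_seeds(X, bin_size=bandwidth, min_bin_freq=10)
print(len(seeds_loose), len(seeds_strict))

# Run mean shift from the reduced seed set.
ms = MeanShift(bandwidth=bandwidth, seeds=seeds_strict)
ms.fit(X)
print(ms.cluster_centers_)
```

Fewer seeds means fewer kernels to iterate, at the price of possibly missing small or sparse clusters.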
References
Dorin Comaniciu and Peter Meer, “Mean Shift: A robust approach toward feature space analysis”. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002. pp. 603-619.
Attributes
cluster_centers_ : array, [n_clusters, n_features]
Coordinates of cluster centers
labels_ :
Labels of each point
Methods
fit(X) Compute MeanShift
get_params([deep]) Get parameters for the estimator
set_params(**params) Set the parameters of the estimator.
- __init__(bandwidth=None, seeds=None, bin_seeding=False, cluster_all=True)
- fit(X)
Compute MeanShift
Parameters : X : array [n_samples, n_features]
Input points
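A minimal sketch of calling fit and reading the fitted attributes, assuming a recent scikit-learn; the eight input points and the bandwidth of 2 are invented for illustration.

```python
import numpy as np
from sklearn.cluster import MeanShift

# Two unit squares of points, far apart from each other (illustrative data).
square = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
X = np.vstack([square, square + 10.0])

ms = MeanShift(bandwidth=2.0)
ms.fit(X)

# cluster_centers_ has shape [n_clusters, n_features];
# labels_ holds one label per input point.
centers = ms.cluster_centers_
labels = ms.labels_
print(centers.shape, labels.shape)
print(centers)
```

With this bandwidth every kernel started inside a square covers the whole square, so each square collapses to a single center at its mean.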
- get_params(deep=True)
Get parameters for the estimator
Parameters : deep: boolean, optional :
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- set_params(**params)
Set the parameters of the estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns : self :
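The parameter accessors can be sketched on a plain estimator and on a nested one. This assumes a recent scikit-learn with sklearn.pipeline.Pipeline and sklearn.preprocessing.StandardScaler available; the step names "scale" and "cluster" are hypothetical names chosen for this example.

```python
from sklearn.cluster import MeanShift
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are addressed by their plain names.
ms = MeanShift(bandwidth=2.0)
shallow = ms.get_params(deep=False)
ms.set_params(bandwidth=3.0, cluster_all=False)

# Nested object: "scale" and "cluster" are hypothetical step names.
pipe = Pipeline([("scale", StandardScaler()), ("cluster", MeanShift())])

# <component>__<parameter> reaches into the named step.
pipe.set_params(cluster__bandwidth=1.5)

# deep=True also lists the parameters of contained estimators.
nested = pipe.get_params(deep=True)
print(nested["cluster__bandwidth"])
```

Because set_params returns the estimator itself, calls can be chained, for example `MeanShift().set_params(bandwidth=1.0).fit(X)` for some input array `X`.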