
sklearn.cluster.mean_shift

sklearn.cluster.mean_shift(X, bandwidth=None, seeds=None, bin_seeding=False, min_bin_freq=1, cluster_all=True, max_iterations=300)

Perform mean shift clustering of data using a flat kernel.

Seeds can optionally be chosen with a binning technique (see bin_seeding) for scalability.

Parameters :

X : array-like, shape=[n_samples, n_features]

Input data.

bandwidth : float, optional

Kernel bandwidth.

If bandwidth is not given, it is determined using a heuristic based on the median of all pairwise distances. This will take quadratic time in the number of samples. The sklearn.cluster.estimate_bandwidth function can be used to do this more efficiently.
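To make the quadratic cost concrete, here is a minimal standard-library sketch of a median-of-pairwise-distances heuristic as described above. The function name `median_pairwise_distance` is hypothetical and this is not the library's own code; it only illustrates why the work grows with the square of the number of samples.

```python
import math
import statistics
from itertools import combinations


def median_pairwise_distance(points):
    """Median Euclidean distance over all unordered pairs of points.

    Hypothetical helper for illustration only. It visits
    n * (n - 1) / 2 pairs, hence the quadratic running time.
    """
    distances = [math.dist(a, b) for a, b in combinations(points, 2)]
    return statistics.median(distances)


sample = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
bandwidth_guess = median_pairwise_distance(sample)
```

For larger data sets, passing an explicit bandwidth (for instance one obtained from sklearn.cluster.estimate_bandwidth on a subsample) avoids this step.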

seeds : array [n_seeds, n_features]

Points used as initial kernel locations.

bin_seeding : boolean

If true, initial kernel locations are not the locations of all points but the locations of a discretized version of the points, where points are binned onto a grid whose coarseness corresponds to the bandwidth. Setting this option to True speeds up the algorithm because fewer seeds are initialized. Default value: False. Ignored if the seeds argument is not None.

min_bin_freq : int, optional

To speed up the algorithm, accept only those bins with at least min_bin_freq points as seeds. Defaults to 1.
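The binning idea behind bin_seeding and min_bin_freq can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not the library's implementation: the helper name `bin_seeds` is hypothetical, and the grid is assumed to snap each coordinate to the nearest multiple of the bin size.

```python
from collections import Counter


def bin_seeds(points, bin_size, min_bin_freq=1):
    """Return one seed per sufficiently populated grid cell.

    Hypothetical helper: each point is snapped to the nearest grid
    node (spacing = bin_size), nodes are counted, and only nodes
    holding at least min_bin_freq points are kept as seeds.
    """
    counts = Counter(
        tuple(round(coord / bin_size) for coord in point) for point in points
    )
    return [
        tuple(index * bin_size for index in bin_index)
        for bin_index, count in counts.items()
        if count >= min_bin_freq
    ]


points = [(0.1, 0.2), (0.3, -0.1), (4.1, 3.9), (9.4, 9.4)]
all_seeds = bin_seeds(points, bin_size=2.0)
dense_seeds = bin_seeds(points, bin_size=2.0, min_bin_freq=2)
```

Here four points collapse to three seeds, and raising min_bin_freq to 2 keeps only the cell that contains two points, which is how the option reduces the number of kernels to iterate.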

Returns :

cluster_centers : array [n_clusters, n_features]

Coordinates of cluster centers.

labels : array [n_samples]

Cluster labels for each point.
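A short usage sketch, assuming scikit-learn and NumPy are installed. The toy data and the bandwidth value of 2.0 are made up for illustration; only the X and bandwidth arguments from the signature above are used.

```python
import numpy as np
from sklearn.cluster import mean_shift

# Two small, well-separated groups of 2-D points (illustrative data).
X = np.array([
    [0.0, 0.0], [0.5, 0.2], [0.1, 0.6],
    [10.0, 10.0], [10.4, 9.8], [9.9, 10.5],
])

# An explicit bandwidth skips the automatic estimation step.
centers, labels = mean_shift(X, bandwidth=2.0)

print(centers.shape)  # (n_clusters, n_features)
print(labels)         # one integer label per row of X
```

The order of the rows in the returned centers, and therefore the integer assigned to each group, should not be relied on; compare labels for equality instead.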

Notes

See examples/cluster/plot_meanshift.py for an example.
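For intuition about the flat kernel mentioned in the summary, the following standard-library sketch moves a single seed to the mean of the points within one bandwidth of it and repeats until the seed stops moving. The function name `shift_seed` and the stopping tolerance `tol` are assumptions made for this illustration, not part of the library's API.

```python
import math


def shift_seed(seed, points, bandwidth, max_iterations=300, tol=1e-3):
    """Iterate one seed toward a local density mode (flat kernel).

    Hypothetical helper: with a flat kernel every point within
    `bandwidth` of the current center has equal weight, so each update
    is the plain mean of those neighbors.
    """
    center = tuple(seed)
    for _ in range(max_iterations):
        neighbors = [p for p in points if math.dist(p, center) <= bandwidth]
        if not neighbors:
            break  # no points inside the kernel; nothing to average
        new_center = tuple(sum(axis) / len(neighbors) for axis in zip(*neighbors))
        moved = math.dist(new_center, center)
        center = new_center
        if moved < tol:
            break  # converged: the center barely moved
    return center


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
mode = shift_seed((0.0, 0.0), points, bandwidth=2.0)
```

The distant point at (10, 10) never falls inside the kernel, so it has no influence on where this seed ends up.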
