scikit-learn: machine learning in Python
Easy-to-use and general-purpose machine learning in Python
scikit-learn is a Python module that integrates classic machine learning algorithms into the tightly-knit world of scientific Python packages (numpy, scipy, matplotlib). It aims to provide simple and efficient solutions to learning problems that are accessible to everybody and reusable in various contexts: machine learning as a versatile tool for science and engineering.
License: Open source, commercially usable: 3-clause BSD license
Documentation for scikit-learn version 0.11-git. For other versions and printable format, see Documentation resources.
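Every estimator in scikit-learn shares the same basic interface: construct it, call fit on training data, then call predict on new data. The short sketch below illustrates that interface with a support vector classifier on the bundled iris toy dataset; the choice of estimator and dataset is purely illustrative.

    # Minimal sketch of the common estimator interface (fit / predict).
    from sklearn import datasets, svm

    # Load a small toy dataset that ships with scikit-learn.
    iris = datasets.load_iris()
    X, y = iris.data, iris.target

    # Any classifier follows the same pattern: construct, fit, predict.
    clf = svm.SVC()
    clf.fit(X, y)
    predictions = clf.predict(X[:5])  # predicted class labels for the first 5 samples
    print(predictions)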
User Guide
- 1. Installing scikit-learn
- 2. Tutorials: From the bottom up with scikit-learn
- 3. Supervised learning
  - 3.1. Generalized Linear Models
  - 3.2. Support Vector Machines
  - 3.3. Stochastic Gradient Descent
  - 3.4. Nearest Neighbors
  - 3.5. Gaussian Processes
  - 3.6. Partial Least Squares
  - 3.7. Naive Bayes
  - 3.8. Decision Trees
  - 3.9. Ensemble methods
  - 3.10. Multiclass and multilabel algorithms
  - 3.11. Feature selection
  - 3.12. Semi-Supervised
  - 3.13. Linear and Quadratic Discriminant Analysis
- 4. Unsupervised learning
- 5. Model Selection
- 6. Dataset transformations
- 7. Dataset loading utilities
  - 7.1. General dataset API
  - 7.2. Toy datasets
  - 7.3. Sample images
  - 7.4. Sample generators
  - 7.5. Datasets in svmlight / libsvm format
  - 7.6. The Olivetti faces dataset
  - 7.7. The 20 newsgroups text dataset
  - 7.8. Downloading datasets from the mldata.org repository
  - 7.9. The Labeled Faces in the Wild face recognition dataset
- 8. Reference
  - 8.1. sklearn.cluster: Clustering
  - 8.2. sklearn.covariance: Covariance Estimators
  - 8.3. sklearn.cross_validation: Cross Validation
  - 8.4. sklearn.datasets: Datasets
  - 8.5. sklearn.decomposition: Matrix Decomposition
  - 8.6. sklearn.ensemble: Ensemble Methods
  - 8.7. sklearn.feature_extraction: Feature Extraction
  - 8.8. sklearn.feature_selection: Feature Selection
  - 8.9. sklearn.gaussian_process: Gaussian Processes
  - 8.10. sklearn.grid_search: Grid Search
  - 8.11. sklearn.hmm: Hidden Markov Models
  - 8.12. sklearn.kernel_approximation: Kernel Approximation
  - 8.13. sklearn.semi_supervised: Semi-Supervised Learning
  - 8.14. sklearn.lda: Linear Discriminant Analysis
  - 8.15. sklearn.linear_model: Generalized Linear Models
  - 8.16. sklearn.manifold: Manifold Learning
  - 8.17. sklearn.metrics: Metrics
  - 8.18. sklearn.mixture: Gaussian Mixture Models
  - 8.19. sklearn.multiclass: Multiclass and multilabel classification
  - 8.20. sklearn.naive_bayes: Naive Bayes
  - 8.21. sklearn.neighbors: Nearest Neighbors
  - 8.22. sklearn.pls: Partial Least Squares
  - 8.23. sklearn.pipeline: Pipeline
  - 8.24. sklearn.preprocessing: Preprocessing and Normalization
  - 8.25. sklearn.qda: Quadratic Discriminant Analysis
  - 8.26. sklearn.svm: Support Vector Machines
  - 8.27. sklearn.tree: Decision Trees
  - 8.28. sklearn.utils: Utilities
Example Gallery
- Examples
  - General examples
  - Examples based on real world datasets
  - Clustering
  - Covariance estimation
  - Decomposition
  - Ensemble methods
  - Gaussian Process for Machine Learning
  - Generalized Linear Models
  - Manifold learning
  - Gaussian Mixture Models
  - Nearest Neighbors
  - Semi Supervised Classification
  - Support Vector Machines
  - Decision Trees