4. Dataset transformations¶
- 4.1. Feature extraction
- 4.1.1. Loading features from dicts
- 4.1.2. Feature hashing
- 4.1.3. Text feature extraction
- 4.1.3.1. The Bag of Words representation
- 4.1.3.2. Sparsity
- 4.1.3.3. Common Vectorizer usage
- 4.1.3.4. Tf–idf term weighting
- 4.1.3.5. Decoding text files
- 4.1.3.6. Applications and examples
- 4.1.3.7. Limitations of the Bag of Words representation
- 4.1.3.8. Vectorizing a large text corpus with the hashing trick
- 4.1.3.9. Performing out-of-core scaling with HashingVectorizer
- 4.1.3.10. Customizing the vectorizer classes
- 4.1.4. Image feature extraction
- 4.2. Preprocessing data
- 4.3. Kernel Approximation
- 4.4. Random Projection
- 4.5. Pairwise metrics, Affinities and Kernels