K-means Clustering - Applications

Theory

K-means clustering is used across many domains, including image compression, customer segmentation, document classification by topic, and anomaly detection. It is most useful when you need to group similar data points and have no pre-labeled categories, provided you can choose the number of clusters k in advance.

Visualization

[Figure: visualization of K-means clustering applications]

Mathematical Formulation
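
For reference, every application listed below minimizes the same standard K-means objective, the within-cluster sum of squared distances. The notation is chosen here for illustration and does not come from the original text: x_i is a data point, C_j is the set of points assigned to cluster j, and \mu_j is that cluster's center (the mean of its points).

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

In image compression, for instance, x_i is a pixel's color vector and J is the total squared color error introduced by replacing each pixel with its cluster center.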

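Applications:
• Image Compression: Reduce the color palette by clustering pixel colors and replacing each pixel with its cluster center
• Customer Segmentation: Group customers by behavior patterns
• Document Classification: Organize unlabeled documents into topic groups
• Anomaly Detection: Identify outliers as points that lie far from every cluster center
• Feature Engineering: Use cluster labels or distances to cluster centers as cluster-based features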
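
Two of the applications above, customer segmentation and anomaly detection, can be sketched together on synthetic data. This is a minimal illustration, not a production recipe: the feature names, the two synthetic customer groups, the injected outlier, and the 99th-percentile threshold are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical customer features: [monthly visits, average spend]
group_a = rng.normal(loc=[2.0, 20.0], scale=[0.5, 3.0], size=(100, 2))
group_b = rng.normal(loc=[10.0, 80.0], scale=[1.0, 5.0], size=(100, 2))
outlier = np.array([[30.0, 300.0]])  # one injected anomaly (row index 200)
X = np.vstack([group_a, group_b, outlier])

# Standardize so the larger-valued spend column does not dominate distances
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Segmentation: each customer gets the label of its nearest cluster center
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)
segments = kmeans.labels_

# Anomaly detection: transform() returns the distance to every center,
# so the row-wise minimum is the distance to the assigned center
dist_to_center = kmeans.transform(X_scaled).min(axis=1)

# Assumed rule: flag points beyond the 99th percentile of that distance
threshold = np.percentile(dist_to_center, 99)
anomalies = np.where(dist_to_center > threshold)[0]

print("Segment sizes:", np.bincount(segments))
print("Flagged row indices:", anomalies)
```

The same distance matrix from transform() can also serve as the cluster-based features mentioned in the last bullet, with one column per cluster center.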

Code Example

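from sklearn.cluster import KMeans
import numpy as np
from PIL import Image

# Image Compression Example
# Convert to RGB so grayscale, palette, and RGBA inputs all yield an (h, w, 3) array
img = Image.open('image.jpg').convert('RGB')
img_array = np.array(img)
h, w, c = img_array.shape

# Reshape to one row per pixel; use floats for the distance computations
pixels = img_array.reshape(-1, c).astype(np.float64)

# Compress to k colors
k = 16
# n_init is set explicitly because its default differs across scikit-learn versions
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
kmeans.fit(pixels)

# Replace each pixel with its cluster center, rounding before the uint8 cast
compressed = kmeans.cluster_centers_[kmeans.labels_]
compressed_img = np.clip(np.round(compressed), 0, 255).reshape(h, w, c).astype(np.uint8)

# Save as PNG (lossless) so the file keeps at most k colors;
# JPEG re-encoding would introduce new intermediate colors
Image.fromarray(compressed_img).save('compressed.png')

print(f"Original colors: {len(np.unique(img_array.reshape(-1, c), axis=0))}")
print(f"Compressed to: {len(np.unique(compressed_img.reshape(-1, c), axis=0))} colors (k = {k})")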
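
To get a feel for how the choice of k trades color fidelity against palette size, the sketch below measures the quantization error for a few values of k. It is only an illustration: the random 32x32 array is a hypothetical stand-in for a real photo, and the helper name quantization_mse and the tested k values are assumptions, not part of the example above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical stand-in for a real image: 32x32 RGB pixels with random colors
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
pixels = image.reshape(-1, 3).astype(np.float64)


def quantization_mse(pixels, k):
    """Mean squared error per channel value after reducing to k colors."""
    km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(pixels)
    # inertia_ is the sum of squared distances from each pixel to its center
    return km.inertia_ / pixels.size


errors = {k: quantization_mse(pixels, k) for k in (2, 8, 32)}
for k, mse in errors.items():
    print(f"k={k:>2}  MSE per channel value = {mse:.1f}")
```

The error falls as k grows, while the cost of storing the result as a palette image rises: each pixel needs about log2(k) bits for its cluster index (4 bits for k = 16, versus 24 bits for raw RGB), plus k palette entries.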