Introduction
Linear Discriminant Analysis (LDA) is a supervised learning algorithm used for classification and dimensionality reduction. It projects high-dimensional data onto a lower-dimensional space so that the separability between classes is maximized.
Step-by-Step LDA Calculation for the Given Dataset
Step 1: Define the datasets
Class X1:
(4, 2), (2, 2), (3, 2), (3, 5), (3, 4)
Class X2:
(8, 7), (9, 6), (7, 7), (9, 8), (10, 9)
Step 2: Compute the mean of each class
Mean of X1 (μ1):
μ1 = [(4+2+3+3+3)/5 , (2+2+2+5+4)/5] = (15/5, 15/5) = (3, 3)
Mean of X2 (μ2):
μ2 = [(8+9+7+9+10)/5 , (7+6+7+8+9)/5] = (43/5, 37/5) = (8.6, 7.4)
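The two class means can be checked with a short NumPy sketch (NumPy is assumed to be available; the names `X1`, `X2`, `mu1`, `mu2` are just illustrative):

```python
import numpy as np

# Class samples as rows of (x, y) coordinates
X1 = np.array([[4, 2], [2, 2], [3, 2], [3, 5], [3, 4]], dtype=float)
X2 = np.array([[8, 7], [9, 6], [7, 7], [9, 8], [10, 9]], dtype=float)

# Per-class means: average over the rows (axis=0)
mu1 = X1.mean(axis=0)  # -> [3.0, 3.0]
mu2 = X2.mean(axis=0)  # -> [8.6, 7.4]
print(mu1, mu2)
```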
Step 3: Compute the Within-Class Scatter Matrix (SW)
This is calculated as the sum of the scatter matrices of each class:
SW = S1 + S2
For each point, compute the outer product of its deviation from the class mean:

(x − μ)(x − μ)^T

and sum over all points in each class. Using the deviations from μ1 = (3, 3) and μ2 = (8.6, 7.4), this gives:

S1 = [ 2.0   0.0 ]
     [ 0.0   8.0 ]

S2 = [ 5.2   2.8 ]
     [ 2.8   5.2 ]

SW = S1 + S2 = [ 7.2    2.8 ]
               [ 2.8   13.2 ]
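As a sketch (NumPy assumed; the data and class means are restated for completeness), the scatter matrices can be accumulated via outer products of the deviations:

```python
import numpy as np

X1 = np.array([[4, 2], [2, 2], [3, 2], [3, 5], [3, 4]], dtype=float)
X2 = np.array([[8, 7], [9, 6], [7, 7], [9, 8], [10, 9]], dtype=float)
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)

def scatter(X, mu):
    """Sum of (x - mu)(x - mu)^T over the rows of X."""
    D = X - mu        # deviations from the class mean
    return D.T @ D    # equivalent to summing the per-point outer products

S1 = scatter(X1, mu1)
S2 = scatter(X2, mu2)
SW = S1 + S2
print(SW)
```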
Step 4: Compute the Optimal Projection Vector (w)
This is given by:
w = SW^(-1) (μ1 − μ2)

With μ1 − μ2 = (−5.6, −4.4) and SW = [7.2, 2.8; 2.8, 13.2] from Step 3 (so det(SW) = 7.2 × 13.2 − 2.8² = 87.2), solving gives:

w = (1/87.2) (−61.6, −16.0) ≈ (−0.706, −0.183)

Any nonzero scalar multiple of w defines the same direction, onto which the data should be projected.
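Rather than forming the inverse explicitly, the projection vector can be obtained by solving the linear system SW w = μ1 − μ2 (a minimal sketch; NumPy assumed, data restated for completeness):

```python
import numpy as np

X1 = np.array([[4, 2], [2, 2], [3, 2], [3, 5], [3, 4]], dtype=float)
X2 = np.array([[8, 7], [9, 6], [7, 7], [9, 8], [10, 9]], dtype=float)
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
D1, D2 = X1 - mu1, X2 - mu2
SW = D1.T @ D1 + D2.T @ D2   # within-class scatter matrix

# Solve SW w = (mu1 - mu2); numerically preferable to np.linalg.inv(SW) @ ...
w = np.linalg.solve(SW, mu1 - mu2)
print(w)  # -> approximately [-0.706, -0.183]
```

Solving the system directly is both faster and more numerically stable than computing the inverse, especially in higher dimensions.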
Step 5: Project data points
Project each point in both classes onto vector w using:
y = w · x

This gives a single scalar for each point. On this 1-D axis the two classes do not overlap, so a simple threshold can separate them.
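The projection step can be sketched as a matrix-vector product (NumPy assumed; data and the solve from the previous step restated so the snippet is self-contained):

```python
import numpy as np

X1 = np.array([[4, 2], [2, 2], [3, 2], [3, 5], [3, 4]], dtype=float)
X2 = np.array([[8, 7], [9, 6], [7, 7], [9, 8], [10, 9]], dtype=float)
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
D1, D2 = X1 - mu1, X2 - mu2
SW = D1.T @ D1 + D2.T @ D2
w = np.linalg.solve(SW, mu1 - mu2)

# Project every point onto w: one scalar per sample
y1 = X1 @ w
y2 = X2 @ w

# The projected classes occupy disjoint intervals on the 1-D axis,
# so a single threshold between them classifies the training data.
print(sorted(y1), sorted(y2))
```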
Conclusion
Linear Discriminant Analysis reduces dimensionality while preserving as much class-discriminatory information as possible. In this case, the classes X1 and X2 can be projected onto a line on which they are linearly separable, improving classification performance.