Fisher’s Linear Discriminant
Training a classifier consists of finding the weight vector that best separates the data. How you define this separation is what distinguishes classifiers from one another. A least-squares classifier finds the vector that minimizes the mean squared error, typically via an optimization algorithm such as stochastic gradient descent. Fisher’s linear discriminant instead attempts to separate the data based on the class distributions, rather than adapting the weight vector with each data point. Fisher’s linear discriminant can be used as a supervised learning classifier: given labeled data, it finds a set of weights that draw a decision boundary, classifying the data. It does so by finding the vector that maximizes the separation between the classes of the projected data. Since maximizing “separation” is ambiguous on its own, Fisher’s linear discriminant makes it precise with two criteria: maximize the distance between the projected means and minimize the projected within-class variance.
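To make these criteria concrete, here is the standard formulation (the notation is assumed here, not spelled out in the original post): for a weight vector w, let m1 and m2 be the projected class means and s1², s2² the projected within-class variances. Fisher’s criterion maximizes

```latex
J(\mathbf{w}) = \frac{(m_2 - m_1)^2}{s_1^2 + s_2^2},
\qquad
\mathbf{w}^{*} \propto \mathbf{S}_W^{-1}\,(\boldsymbol{\mu}_2 - \boldsymbol{\mu}_1)
```

where S_W is the within-class scatter matrix and μ1, μ2 are the class means; the closed-form maximizer on the right is the classical textbook result.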
Our mission
Let’s set up a problem; if you can solve it, you’ll understand Fisher’s linear discriminant!
Here are two bivariate Gaussians with identical covariance matrices and distinct means. We want to find the vector that best separates the projections of the data. Let’s draw a random vector and plot the projections.
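Here is a minimal Python sketch of this setup (the means, covariance, and sample sizes are illustrative assumptions; the post does not give its exact parameters):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Two bivariate Gaussians with identical covariance matrices and distinct means
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=200)  # class 1
X2 = rng.multivariate_normal([3.0, 3.0], cov, size=200)  # class 2

# Draw a random unit vector and project every point onto it
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
p1, p2 = X1 @ w, X2 @ w  # dot products: 1-D projections of the data

plt.hist(p1, bins=30, alpha=0.5, label="class 1")
plt.hist(p2, bins=30, alpha=0.5, label="class 2")
plt.legend()
plt.title("Projections onto a random vector")
plt.show()
```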
Remember, we are looking at projections of the data onto the vector (the dot product of the weight vector with the data matrix), not at a decision boundary. The projections of the data onto this random weight vector can be plotted as a histogram (image on the right). As you can see, when we project the data onto the vector and draw the histogram, the two classes are not well separated. The goal is to find the direction that best separates the two distributions in the image on the right. To separate them, we could first try to maximize the distance between the projected means, so that the distributions are, on average, as far from each other as possible. Let’s draw a line between the two means and plot the histogram of the projections onto that line.
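Continuing the sketch above, projecting onto the (normalized) difference of the sample means looks like this:

```python
# Unit vector pointing from the mean of class 1 to the mean of class 2
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
w_means = (m2 - m1) / np.linalg.norm(m2 - m1)

plt.hist(X1 @ w_means, bins=30, alpha=0.5, label="class 1")
plt.hist(X2 @ w_means, bins=30, alpha=0.5, label="class 2")
plt.legend()
plt.title("Projections onto the line between the means")
plt.show()
```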
That’s quite a bit better, but the projections of the data are not fully separated yet. To fully separate them, Fisher’s linear discriminant minimizes the within-class variance of the projections while simultaneously maximizing the distance between the projected means. It pushes the projected means apart, as discussed before, to separate the classes, but it also makes each class’s distribution as tight as possible. This allows for better separation, as you’ll see below.
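A sketch of the full criterion, reusing the variables above (this implements the standard closed-form solution w ∝ S_W⁻¹(m2 − m1), not code from the original post):

```python
# Within-class scatter matrix: each class's scatter around its own mean
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

# Fisher's direction: proportional to S_W^{-1} (m2 - m1)
w_fisher = np.linalg.solve(S_W, m2 - m1)
w_fisher /= np.linalg.norm(w_fisher)

plt.hist(X1 @ w_fisher, bins=30, alpha=0.5, label="class 1")
plt.hist(X2 @ w_fisher, bins=30, alpha=0.5, label="class 2")
plt.legend()
plt.title("Projections onto Fisher's direction")
plt.show()
```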
As you can see, the projections of the data are now well separated. We can take a vector orthogonal to the weight vector to create a decision boundary: data on either side of the boundary are predicted to belong to one class or the other. For multivariate Gaussian distributions with identical covariance matrices, this yields an optimal classifier.
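One simple way to turn the direction into a classifier (a sketch assuming equal class priors, consistent with the equal-covariance setting of this post) is to threshold the projection halfway between the projected class means; the decision boundary is then the line through that point orthogonal to the weight vector:

```python
# Threshold halfway between the projected class means
c = 0.5 * (m1 + m2) @ w_fisher

def predict(X):
    # Class 2 if the projection falls past the threshold, else class 1
    return np.where(X @ w_fisher > c, 2, 1)

print(predict(np.array([[0.5, 0.5], [2.5, 3.0]])))  # -> [1 2]
```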
Conclusion
Fisher’s linear discriminant is a supervised learning classifier that aims to separate data based on their underlying distributions. It finds a vector that maximizes the separation between classes by maximizing the distance between the projected means and minimizing the projected within-class variance; minimizing the variance tightens the distributions, allowing for even better separation between the classes. Once the projections are well separated, a vector orthogonal to the weight vector can be used to create a decision boundary, which predicts the class of new data points on either side. Fisher’s linear discriminant is particularly effective for multivariate Gaussian distributions with identical covariance matrices, where it yields an optimal classifier. By understanding and applying these principles, we can improve our ability to classify data based on their underlying distributions.
References: https://towardsdatascience.com/fishers-linear-discriminant-intuitively-explained-52a1ba79e1bb
An ISME student doing an internship with Hunnarvi Technologies Pvt Ltd under the guidance of Nanobi data and analytics. Views are personal.
#FisherLinearDiscriminant
#SupervisedLearning #DataClassification #SeparationOfDistributions
#MaximizingMeans #MinimizingVariance #OptimalClassifier #DecisionBoundary
#MultivariateGaussian #DataProjection #InternationalSchoolofManagementExcellence
#NanobiDataandAnalytics #hunnarvi