To better understand Multiple Discriminant Analysis, let’s first understand Discriminant Analysis. Discriminant Analysis is a statistical classification technique used to determine which group (for example, happy or unhappy) a piece of data or an object (for example, a citizen) belongs to. It is thus a statistician’s technique for classifying and analyzing a pool of data based on measured characteristics of the objects.
Now, when we have only two groups or categories for allocating our objects/data, we call the technique a Two-Group Discriminant Analysis. But, if we have more than two groups or categories, we call the technique Multiple Discriminant Analysis.
How Does the Technique Work?
How do we allocate a piece of data or an object (citizen in our example), to a group? With the help of a function. And our objective in Discriminant Analysis is just that: to establish a function that can ‘discriminate’ or differentiate the objects (citizens) and allocate them to one of the groups (happy or unhappy). We establish this function by undertaking some measurements or observations of the objects (the health of a citizen, his income, quality of education, etc).
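As a minimal sketch of what such a function can look like in the two-group case, the snippet below scores a citizen with a weighted sum of measurements and compares the score with a cutoff. The weights, the cutoff, and the sample citizens are hypothetical values chosen for illustration; in a real analysis the weights are estimated from data.

```python
# Hypothetical weights and cutoff, for illustration only.
# In a real analysis these are estimated from observed data.
WEIGHTS = {"health": 0.5, "income": 0.3, "education": 0.2}
CUTOFF = 5.0  # assumed score that separates the two groups


def discriminant_score(citizen):
    """Return the weighted sum of the citizen's measurements."""
    return sum(WEIGHTS[name] * citizen[name] for name in WEIGHTS)


def allocate(citizen):
    """Assign the citizen to a group by comparing the score with the cutoff."""
    return "happy" if discriminant_score(citizen) >= CUTOFF else "unhappy"


# Made-up measurements on a 0-10 scale.
alice = {"health": 8, "income": 6, "education": 7}
bob = {"health": 3, "income": 4, "education": 2}

print(allocate(alice))  # happy
print(allocate(bob))    # unhappy
```

The single score plays the role of the one discriminant function in the two-group case.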
How Many Functions are there in Multiple Discriminant Analysis?
While in Two-Group Discriminant Analysis only one function is needed to categorize the objects, in Multiple Discriminant Analysis more than one function is generally needed. The number of functions in a multiple discriminant analysis equals either (g-1), where g is the number of groups or categories, or k, the number of variables in the analysis, whichever is smaller. Let’s make this simple with the help of an example:
Suppose you have to identify the type of apparel in your stock and you have four categories, including:
- Mass Market
- Value (good value for money)
You also have five variables or predictors such as:
- Quality of Fabric
- Relevance to the prevailing taste
Here, the number of discriminant functions is three, since (g-1) = 4-1 = 3, which is less than k = 5.
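The rule can be written as a one-line calculation. The sketch below is only an illustration; the function name is a hypothetical one, not part of any library.

```python
def number_of_discriminant_functions(groups, variables):
    """Return min(g - 1, k) for g groups and k predictor variables."""
    if groups < 2 or variables < 1:
        raise ValueError("need at least two groups and one variable")
    return min(groups - 1, variables)


# Apparel example: four categories, five predictors.
print(number_of_discriminant_functions(4, 5))  # 3

# Two-group case: a single function, whatever the number of predictors.
print(number_of_discriminant_functions(2, 5))  # 1
```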
Multiple Discriminant Analysis Application in Finance
This technique is widely used in financing and investment decisions. Let’s understand its application in finance with the help of an example. Sam is a beginner in investing. He wants to invest in shares of companies for potential capital gains, but he wants to do so carefully. He wants shares that carry a certain degree of risk, show a certain degree of promise, and have performed favorably in the past. In short, he has some variables and wants to profile different stocks based on them. Using multiple discriminant analysis, he assigns numerical scores for risk, promise, and past performance, and groups the stocks that meet his desired levels as ‘appropriate for investment’.
In companies, it is very common for finance professionals to consider a number of variables before investing money. As shown in the above example, investors and professionals can group potential avenues of investments based on the variables to be considered and then make their decision.
Some Important Points About Multiple Discriminant Analysis
- We know that Multiple Discriminant Analysis has more than one discriminant function. Of all the functions in this analysis, the first one is the most relevant one for discriminating across groups, the second one is the second most relevant function to discriminate, and so on.
- The discriminant functions in the analysis are independent of one another. This means that the score an object gets from the first discriminant function is uncorrelated with the score it gets from the second discriminant function.
- This analysis also gives a Canonical Correlation for each discriminant function. It measures the strength of the association between the scores on that function and group membership (the groups in which the objects have been put).
- Another important term in the analysis report is Centroid. A centroid is the mean of the discriminant scores of the objects in a group. Let’s make this simpler to understand. We know that in multiple discriminant analysis we assign each object to a group or category. We do so by first giving the object a score. The mean of all these scores in a group is called the centroid. Each group has one centroid, so there are as many centroids as there are groups.
- We also have a classification matrix or confusion matrix as part of the analysis. This matrix contains the number of correctly classified cases as well as the number of misclassified cases.
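The centroid and the classification matrix can be illustrated with a small sketch. It assumes each object already has a single discriminant score (the two-group case) and that an object is assigned to the group whose centroid is nearest to its score. The scores below are made up for illustration. With several discriminant functions, each object has one score per function and the distance is computed over all of them.

```python
from collections import defaultdict

# Made-up (actual group, discriminant score) pairs, for illustration only.
scored = [
    ("happy", 7.2), ("happy", 6.1), ("happy", 4.0),
    ("unhappy", 3.1), ("unhappy", 2.4), ("unhappy", 5.5),
]


def centroids(rows):
    """Return the mean discriminant score of each group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, score in rows:
        totals[group] += score
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def nearest_group(score, cents):
    """Return the group whose centroid is closest to the score."""
    return min(cents, key=lambda group: abs(score - cents[group]))


def classification_matrix(rows, cents):
    """Count objects for every (actual group, predicted group) pair."""
    matrix = defaultdict(int)
    for actual, score in rows:
        matrix[(actual, nearest_group(score, cents))] += 1
    return dict(matrix)


cents = centroids(scored)
matrix = classification_matrix(scored, cents)
for (actual, predicted), count in sorted(matrix.items()):
    print(f"actual={actual:8s} predicted={predicted:8s} count={count}")
```

Entries where the actual and predicted groups match are the correctly classified cases. The remaining entries are the misclassified ones.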