Comparison of Principal Component Analysis and Maximum Likelihood Factor Analysis in Bank Health Ratio

Factor analysis is widely used to reduce the dimension of a set of variables and has been applied in many disciplines. Its two best-known extraction methods are principal component analysis and maximum likelihood. This study compares the two methods, including by their constructed correlation matrices, in an application to bank financial ratios. The study starts from an initial set of 22 ratios of banks indexed as healthy, and the bank financial data are used to identify the structure of the financial ratios of these banks. Ten variables satisfy the criteria of factor analysis and are retained in the analysis. Both principal component analysis and maximum likelihood suggest three factors to represent the 10 variables.


INTRODUCTION
Factor analysis is widely used to reduce the dimension of a set of variables and has been applied in many disciplines. Its essential purpose is to describe the covariance relationships among many variables in terms of a few underlying but unobservable random quantities called factors (Johnson & Wichern, 2007). The factor model is motivated by the argument that variables can be grouped by their correlations. Suppose the variables within a group are highly correlated among themselves but have relatively low correlations with the variables in other groups. It is then plausible that each group represents a single underlying construct, or factor, that is responsible for the observed correlations (Everitt, 2005; Härdle & Simar, 2019).
Several extraction methods are available in statistical software, such as Principal Components, Generalized Least Squares, Unweighted Least Squares, Principal Axis Factoring, Maximum Likelihood, Alpha Factoring and Image Factoring. Among these, the principal component (PC) method and the maximum likelihood (ML) method are the most popular. It is advisable to try more than one method of solution: if the factor model fits the problem at hand, the different methods should give consistent solutions (Rencher, 1998).
The comparison of these two extraction methods is the main focus of this study. To this end, we apply both methods to bank financial ratios. Many researchers have shown that PC and/or ML factor analysis can be applied to financial ratios. Ratio analysis is a powerful analytical tool for measuring the performance of an organization. The purpose of financial ratios is to support forecasts of the business and to characterize the company's financial health (Joliffe, 2010).
Previous papers that discuss principal component analysis and/or maximum likelihood address: selecting significant factors by a noise addition method (Dable & Booksh, 2001); estimating population density from chord-length delay (Grover et al., 2009); estimating frequency response functions for single-input single-output systems in the presence of additive noise (White et al., 2006); analyzing incomplete multivariate data (Ho et al., 2001); comparing PCA and ICA (Bugli & Lambert, 2007); and modeling individual growth (Lehmann et al., 2010).

FACTOR ANALYSIS
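The central idea of factor analysis is that the observed random variables $X_1, X_2, \ldots, X_p$ can be expressed as linear functions of $m$ common factors, with $m < p$. If $X_1, X_2, \ldots, X_p$ are the variables and $F_1, F_2, \ldots, F_m$ are the factors, then (Everitt, 2005; Härdle & Simar, 2019; Rencher, 1998)
$$X_i - \mu_i = \ell_{i1}F_1 + \ell_{i2}F_2 + \cdots + \ell_{im}F_m + e_i, \qquad i = 1, 2, \ldots, p, \tag{1}$$
where the $\ell_{ij}$ are constants known as factor loadings, the $e_i$ are error terms, sometimes called unique factors (because $e_i$ is 'unique' to $X_i$, while the $F_j$ are 'common' to several $X_i$), and $\mu_i$ is the mean of $X_i$. Equation (1) can be written in matrix form as
$$X - \mu = LF + e. \tag{2}$$
A number of assumptions accompany the factor model: $F$ and $e$ are independent, with
$$E(F) = 0, \quad \operatorname{Cov}(F) = I, \quad E(e) = 0, \quad \operatorname{Cov}(e) = \Psi, \quad \operatorname{Cov}(e, F) = 0, \tag{3}$$
where $\Psi$ is a diagonal matrix. The orthogonal factor model implies a covariance structure for $X$. From the model in (2), the covariance matrix $\Sigma$ of $X$ is
$$\Sigma = \operatorname{Cov}(X) = LL' + \Psi. \tag{4}$$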
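As a quick numerical check of the covariance structure in (4), the following sketch simulates the orthogonal factor model and compares the sample covariance matrix with LL' + Psi. The loadings, specific variances, sample size, and seed are illustrative assumptions and are not taken from the bank data of this study.

```python
import numpy as np

def simulate_factor_model(L, psi, n, seed=0):
    """Draw n mean-zero observations from X = L F + e with
    F ~ N(0, I) and e ~ N(0, diag(psi)), F and e independent."""
    rng = np.random.default_rng(seed)
    p, m = L.shape
    F = rng.standard_normal((n, m))                 # common factors
    e = rng.standard_normal((n, p)) * np.sqrt(psi)  # unique factors
    return F @ L.T + e

# Hypothetical loadings for p = 4 variables and m = 2 factors.
L = np.array([[0.9, 0.0],
              [0.8, 0.0],
              [0.0, 0.7],
              [0.0, 0.6]])
psi = 1.0 - np.sum(L**2, axis=1)   # chosen so every variable has unit variance

sigma_implied = L @ L.T + np.diag(psi)   # right-hand side of (4)
X = simulate_factor_model(L, psi, n=200_000)
S = np.cov(X, rowvar=False)              # sample covariance matrix

max_gap = float(np.abs(S - sigma_implied).max())
print(f"largest |S - (LL' + Psi)| entry: {max_gap:.4f}")
```

With a large simulated sample the gap should be small, which is all the identity claims: the common factors account for the off-diagonal covariances, and the specific variances only enter the diagonal.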

PRINCIPAL COMPONENT ANALYSIS
From a random sample $x_1, x_2, \ldots, x_n$ we obtain the sample covariance matrix $S$ and then seek estimators $\hat{L}$ and $\hat{\Psi}$ that approximate the fundamental expression (4) with $S$ in place of $\Sigma$ (Everitt, 2005; Härdle & Simar, 2019; Joliffe, 2010; Rencher, 1998):
$$S \cong \hat{L}\hat{L}' + \hat{\Psi}. \tag{5}$$
In the principal component approach, we neglect $\hat{\Psi}$ and factor $S$ into $S = \hat{L}\hat{L}'$. To factor $S$, we use the spectral decomposition
$$S = CDC', \tag{6}$$
where $D$ is a diagonal matrix with the eigenvalues $\theta_1, \theta_2, \ldots, \theta_p$ of $S$ on the diagonal and $C$ is an orthogonal matrix whose columns are the normalized eigenvectors ($c_i'c_i = 1$) of $S$. To finish factoring $S$ into the form $\hat{L}\hat{L}'$, note that the eigenvalues $\theta_i$ of the positive semidefinite matrix $S$ are all positive or zero, so we can factor $D$ into $D^{1/2}D^{1/2}$. With this factoring of $D$, we can rewrite $S$ as
$$S = CDC' = CD^{1/2}D^{1/2}C' = (CD^{1/2})(CD^{1/2})'. \tag{7}$$
Equation (7) is of the form $S = \hat{L}\hat{L}'$, but we do not define $\hat{L}$ to be $CD^{1/2}$, because $CD^{1/2}$ is $p \times p$ and we seek an $\hat{L}$ that is $p \times m$ with $m < p$. We therefore define $D_1 = \operatorname{diag}(\theta_1, \theta_2, \ldots, \theta_m)$ with the $m$ largest eigenvalues $\theta_1 > \theta_2 > \cdots > \theta_m$ and $C_1 = (c_1, c_2, \ldots, c_m)$ containing the corresponding eigenvectors. We then estimate $L$ by the first $m$ columns of $CD^{1/2}$:
$$\hat{L} = C_1 D_1^{1/2}. \tag{8}$$
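The extraction just described can be sketched in a few lines of NumPy. This is a minimal illustration, not the SPSS routine used in this study: the function name `pc_loadings` and the 3 x 3 matrix are made up for the example, and taking the specific variances as the diagonal of S minus the fitted common part is the customary follow-up step, assumed here.

```python
import numpy as np

def pc_loadings(S, m):
    """Principal component estimate of the loadings:
    the first m columns of C D^{1/2} from the spectral decomposition of S."""
    theta, C = np.linalg.eigh(S)         # eigh returns eigenvalues in ascending order
    order = np.argsort(theta)[::-1]      # reorder so the largest eigenvalue comes first
    theta, C = theta[order], C[:, order]
    L_hat = C[:, :m] * np.sqrt(theta[:m])    # scale each eigenvector by sqrt(eigenvalue)
    psi_hat = np.diag(S - L_hat @ L_hat.T)   # assumed choice for the specific variances
    return L_hat, psi_hat, theta

# Hypothetical correlation matrix, for illustration only.
S = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

L_full, _, theta = pc_loadings(S, 3)   # m = p reproduces S exactly
L_one, psi_one, _ = pc_loadings(S, 1)  # m = 1 keeps only the largest eigenvalue
print(np.round(L_one, 3))
print(np.round(psi_one, 3))
```

Keeping all p columns reproduces S exactly, while truncating to m columns leaves a residual whose diagonal is absorbed by the specific variances. The sign of each column is arbitrary, as it is for any eigenvector.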

MAXIMUM LIKELIHOOD
The maximum likelihood method estimates the factor loadings and specific variances under the additional assumption that the common factors $F$ and the specific factors $e$ are normally distributed. Since $F$ and $e$ are jointly normal, the observations $x_1, x_2, \ldots, x_n$ are also normal, and the likelihood is (Härdle & Simar, 2019; Rencher, 1998)
$$\mathcal{L}(\mu, \Sigma) = (2\pi)^{-np/2}\,|\Sigma|^{-n/2} \exp\left\{ -\tfrac{1}{2} \operatorname{tr}\left[ \Sigma^{-1} \left( \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' + n(\bar{x} - \mu)(\bar{x} - \mu)' \right) \right] \right\}, \tag{9}$$
which depends on $L$ and $\Psi$ through $\Sigma = LL' + \Psi$. The model is still not well defined, because orthogonal transformations allow a multiplicity of choices for $L$. $L$ can be made well defined by imposing the computationally convenient uniqueness condition that $L'\Psi^{-1}L$ is a diagonal matrix. The maximum likelihood estimates $\hat{L}$ and $\hat{\Psi}$ are obtained by numerical maximization of (9). Efficient computer programs are now available that make these estimates fairly easy to obtain (Härdle & Simar, 2019; Joliffe, 2010; Rencher, 1998).
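To make "numerical maximization" concrete, here is a minimal sketch that uses the EM algorithm for the factor model, working directly on a covariance matrix S. EM is one of several possible optimizers and is swapped in here for illustration; it is not claimed to be the routine behind the SPSS results of this study. The function names, starting values, iteration count, and test matrix are all assumptions. The quantity `ml_discrepancy` is log|Sigma| + tr(S Sigma^{-1}) - log|S| - p with Sigma = LL' + Psi, and minimizing it is equivalent to maximizing the normal likelihood once the mean is estimated by the sample mean.

```python
import numpy as np

def ml_discrepancy(S, L, psi):
    """log|Sigma| + tr(S Sigma^{-1}) - log|S| - p, with Sigma = LL' + diag(psi)."""
    Sigma = L @ L.T + np.diag(psi)
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(np.linalg.solve(Sigma, S)) - logdet_s - p

def pc_start(S, m):
    """Starting values: principal component loadings and their residual diagonal."""
    theta, C = np.linalg.eigh(S)
    order = np.argsort(theta)[::-1][:m]
    L = C[:, order] * np.sqrt(theta[order])
    psi = np.clip(np.diag(S - L @ L.T), 1e-6, None)
    return L, psi

def ml_factor_em(S, m, n_iter=3000):
    """EM iterations for the m-factor model fitted to the covariance matrix S."""
    L, psi = pc_start(S, m)
    for _ in range(n_iter):
        Sigma = L @ L.T + np.diag(psi)
        beta = np.linalg.solve(Sigma, L).T               # L' Sigma^{-1}, shape (m, p)
        Czz = np.eye(m) - beta @ L + beta @ S @ beta.T   # expected second moment of the factors
        Cxz = S @ beta.T                                 # expected cross moment of data and factors
        L = Cxz @ np.linalg.inv(Czz)                     # M-step for the loadings
        psi = np.clip(np.diag(S - L @ Cxz.T), 1e-6, None)  # M-step for the specific variances
    return L, psi

# Hypothetical two-factor covariance structure, for illustration only.
L_true = np.array([[0.90, 0.00], [0.80, 0.00], [0.70, 0.00],
                   [0.00, 0.85], [0.00, 0.75], [0.00, 0.65]])
psi_true = 1.0 - np.sum(L_true**2, axis=1)
S = L_true @ L_true.T + np.diag(psi_true)

L0, psi0 = pc_start(S, 2)
L_hat, psi_hat = ml_factor_em(S, 2)
f_start = ml_discrepancy(S, L0, psi0)
f_end = ml_discrepancy(S, L_hat, psi_hat)
print(f"discrepancy at PC start: {f_start:.4f}, after EM: {f_end:.6f}")
```

The fitted loadings are determined only up to an orthogonal rotation, which is the indeterminacy the uniqueness condition removes; the product of the loadings with their transpose, and therefore the fitted covariance matrix, does not depend on that choice.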

BANK HEALTH
The purpose of the bank health assessment is to determine whether a bank is in a very healthy, healthy, fairly healthy, or unhealthy condition. This assessment is done by examining financial ratios. Financial ratio analysis compares two elements of the financial statements and thereby reflects the bank's financial health at a given point in time.
Journal homepage: https://jurnal.unej.ac.id/index.php/JID
A company's health is assessed from five major aspects: capital, asset quality, management, earnings, and liquidity. Some authors propose special ratios depending on the type of organization. Several financial ratios are derived from each major aspect. For example, the ratios for the capital aspect can include the capital adequacy ratio, core capital to total capital, and capital to total assets (Ardiningsih, 2001).

VARIABLES AND DATA
The variables in this study are 22 financial ratios used to measure the health of banks. The data are secondary data from a 2016 study of Indonesian bank health conducted by the research institute PT. Bali Data Analysis. The financial ratios were used to index banks as very healthy, healthy, fairly healthy, or unhealthy. PT. Bali Data Analysis conducted its research using CAMELS analysis, which is widely used among economic researchers.
The sample consists of quarterly financial ratios from 31 banks indexed as very healthy or healthy in 2016, giving a total of 124 observations (31 banks × 4 quarters). The financial ratio variables are shown in Table 1. The measure of sampling adequacy (MSA) is an index of how well a variable can be predicted by the other variables. It is computed by comparing the correlations between the observed variable and the other variables with the corresponding partial correlations. The MSA ranges from 0 to 1, with 0.5 as the cut-off value. It is computed for each variable and indicates whether the data are adequate for that variable to be treated as interrelated with the others. While the MSA assesses the adequacy of each variable individually, Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure assess the set of variables as a whole.
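These screening statistics can be sketched from a correlation matrix R alone. The code below uses the usual textbook definitions: partial correlations taken from the inverse of R, KMO and MSA as ratios of squared correlations to squared correlations plus squared partial correlations, and Bartlett's statistic -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. The function names and the equicorrelated 3 x 3 matrix are illustrative assumptions, and the numbers are not those of this study.

```python
import numpy as np

def partial_correlations(R):
    """Partial correlation of each pair given all other variables, from the inverse of R."""
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    Q = -R_inv / np.outer(d, d)
    np.fill_diagonal(Q, 0.0)          # only off-diagonal entries are used
    return Q

def kmo_and_msa(R):
    """Overall KMO and the per-variable MSA values."""
    R_off = R - np.diag(np.diag(R))   # off-diagonal correlations
    r2 = R_off**2
    q2 = partial_correlations(R)**2
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))
    kmo = r2.sum() / (r2.sum() + q2.sum())
    return float(kmo), msa

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for H0: R is the identity matrix."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return float(chi2), df

# Hypothetical equicorrelated matrix (all correlations 0.5) with n = 124 observations.
R = np.full((3, 3), 0.5)
np.fill_diagonal(R, 1.0)

kmo, msa = kmo_and_msa(R)
chi2, df = bartlett_sphericity(R, n=124)
print(f"KMO = {kmo:.4f}, MSA = {np.round(msa, 4)}, chi2 = {chi2:.2f}, df = {df}")
```

For p = 10 variables the same degrees-of-freedom formula gives 10 × 9 / 2 = 45.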

METHODS
In this study we retain the factors whose eigenvalues are greater than 1. We then compare the two extraction methods by their communalities, the number of factors retained, the variance explained, and the constructed correlation matrices.
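The first three comparison measures can be computed directly from a correlation matrix and a loading matrix. The sketch below assumes standardized variables, so that the total variance equals the number of variables p; the function names, the correlation matrix, and the loadings are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def kaiser_count(R):
    """Number of eigenvalues of the correlation matrix R that exceed 1."""
    return int(np.sum(np.linalg.eigvalsh(R) > 1.0))

def communalities(L):
    """Communality of each variable: the row sums of squared loadings."""
    return np.sum(L**2, axis=1)

def variance_explained(L):
    """Proportion of total variance explained by each factor (column sums of
    squared loadings divided by p, valid for standardized variables)."""
    return np.sum(L**2, axis=0) / L.shape[0]

# Hypothetical correlation matrix with two clearly separated pairs of variables.
R = np.array([[1.0, 0.8, 0.0, 0.0],
              [0.8, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.8],
              [0.0, 0.0, 0.8, 1.0]])

# Hypothetical loadings for a two-factor solution.
L = np.array([[0.9, 0.0],
              [0.9, 0.0],
              [0.0, 0.9],
              [0.0, 0.9]])

m = kaiser_count(R)
h2 = communalities(L)
prop = variance_explained(L)
print(m, np.round(h2, 2), np.round(prop, 3), round(float(prop.sum()), 3))
```

The cumulative proportion of variance explained is the sum of the per-factor proportions, which is how the per-factor percentages and the totals reported for each method relate to one another.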

RESULTS AND DISCUSSION
First, the data are screened by checking the MSA values, which are read from the diagonal of the anti-image correlation matrix. Of the 22 observed variables, only 10 satisfy the MSA criterion; the other 12 do not and are excluded from the subsequent analysis. Further criteria are the KMO value and Bartlett's test of sphericity. Both were obtained with SPSS and are shown in Figure 1. For the 10 retained variables, the KMO value is 0.686 and the Bartlett's test statistic is 569.858, which allows us to proceed with factor analysis.
Table 2 shows the communalities obtained by both methods. A communality is the part of the variance of an observed variable that is explained by the extracted factors. For seven of the ten variables, the communality under the principal component method is larger than under maximum likelihood. By this comparison, therefore, the PC method performs better than the ML method.
Table 3 shows three eigenvalues greater than 1 under both extraction methods, so three factors are retained in this study. With three factors, PC explains 67.65% of the total variance, with Factor 1, Factor 2, and Factor 3 explaining 37.77%, 19.87%, and 10.00%, respectively. ML explains 58.97% of the total variance, with Factor 1, Factor 2, and Factor 3 explaining 24.29%, 18.73%, and 15.95%, respectively.
The last criterion used to compare PC and ML is the constructed correlation matrix. The method whose constructed correlation matrix is more similar to the original correlation matrix R is the better extraction method. Equivalently, the method with the larger p-value in a test of equality between the constructed matrix and the original one is the better extraction method.
Tests of the equality of correlation matrices have been proposed by several authors (Brien, 1988; Grover et al., 2009; Ho et al., 2001; Modarres & Jernigan, 1992; Taylor & Jennrich, 2012; White et al., 2006).
The reported result of the test against R is a chi-square statistic of 14.23 with 45 degrees of freedom and a p-value of 0.99999. By this criterion, therefore, we prefer the ML extraction method to the PC extraction method.
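The idea behind this last criterion can be illustrated with a simpler descriptive measure. The sketch below builds the constructed (reproduced) correlation matrix from a set of loadings and specific variances and summarizes its distance from R by the root mean square of the off-diagonal residuals. This measure is swapped in for illustration and is not the chi-square test of equality used in this study; the function names, loadings, and the deliberately worse second solution are hypothetical.

```python
import numpy as np

def constructed_corr(L, psi):
    """Correlation matrix implied by the factor solution: LL' + diag(psi)."""
    return L @ L.T + np.diag(psi)

def rms_offdiag_residual(R, R_hat):
    """Root mean square of the residuals R - R_hat over the distinct off-diagonal entries."""
    upper = np.triu_indices(R.shape[0], k=1)
    return float(np.sqrt(np.mean((R - R_hat)[upper] ** 2)))

def n_distinct_correlations(p):
    """Number of distinct off-diagonal correlations among p variables."""
    return p * (p - 1) // 2

# Hypothetical "true" solution and the correlation matrix it implies.
L_a = np.array([[0.9, 0.0],
                [0.8, 0.0],
                [0.0, 0.7],
                [0.0, 0.6]])
psi_a = 1.0 - np.sum(L_a**2, axis=1)
R = constructed_corr(L_a, psi_a)

# A deliberately worse solution: all loadings shrunk by 10%.
L_b = 0.9 * L_a
psi_b = 1.0 - np.sum(L_b**2, axis=1)

rms_a = rms_offdiag_residual(R, constructed_corr(L_a, psi_a))
rms_b = rms_offdiag_residual(R, constructed_corr(L_b, psi_b))
print(f"RMS residual, solution a: {rms_a:.4f}; solution b: {rms_b:.4f}")
print("distinct correlations for p = 10:", n_distinct_correlations(10))
```

A smaller residual means the constructed matrix is closer to R. For p = 10 variables there are 45 distinct correlations, which coincides with the 45 degrees of freedom reported for the test.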

CONCLUSION
Principal component factoring explains a larger cumulative proportion of the total sample variance than maximum likelihood factoring. This is expected, because this criterion typically favors principal component factoring: its loadings are related to the principal components, which have, by design, a variance optimizing property (Bugli & Lambert, 2007). The PC extraction method is preferred in terms of the cumulative proportion of sample variance explained and the communalities, while the ML extraction method is better in terms of the constructed correlation matrices.