When
considering all six variables, the correlation plot appears inconsistent,
with some variables showing minimal to no correlation. From this behaviour
we conclude that some variables are negligible, so we generate a Pareto
chart in MATLAB to decide how many principal components should be
considered in the analysis.
The Pareto chart shows that the first four principal components together
account for close to 100% of the variance. We therefore neglect the
remaining two and focus on these four principal components for
Principal Component Analysis (PCA).
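The component-selection step can be sketched in MATLAB roughly as follows. This is a minimal illustration, not the code used for the results above: the data matrix `X` is synthetic, the 95% threshold is an assumed cut-off, and the variable names are placeholders. It relies on `zscore`, `pca`, and `pareto` (the first two from the Statistics and Machine Learning Toolbox).

```matlab
% Hypothetical data: 200 observations of six variables
% (a stand-in for the real dataset, which is not shown in this post).
rng(0);                                  % fixed seed so the sketch is repeatable
latentFactors = randn(200, 4);           % four underlying sources of variation
X = latentFactors * randn(4, 6) + 0.05 * randn(200, 6);

% Standardise so every variable contributes on the same scale.
Z = zscore(X);

% pca returns loadings, scores, eigenvalues and the percentage of
% variance explained by each principal component.
[coeff, score, latent, ~, explained] = pca(Z);

% Pareto chart: bars for each component plus the cumulative curve.
figure;
pareto(explained);
xlabel('Principal component');
ylabel('Variance explained (%)');

% Smallest number of components whose cumulative share reaches the threshold.
threshold = 95;                          % assumed cut-off, in percent
numComponents = find(cumsum(explained) >= threshold, 1);
fprintf('Components needed for %d%% of the variance: %d\n', ...
        threshold, numComponents);
```

Note that `pareto` by default draws only the bars needed to reach 95% of the cumulative total, so the chart may show fewer bars than there are components.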
A biplot is a type of plot that combines information from both the observations (rows) and the variables (columns) in a multivariate dataset. It is particularly useful for visualizing relationships and patterns in high-dimensional data. The biplot displays points for each observation and vectors for each variable. The position of an observation point relative to a variable vector provides insights into the relationships between variables and observations. The length and direction of the vectors indicate the strength and direction of the variable's influence.
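A biplot of the first two principal components could be produced along these lines. Again this is only a sketch under stated assumptions: the data are synthetic and the labels `V1` to `V6` are hypothetical placeholders for the six variables.

```matlab
% Hypothetical data: 100 observations of six correlated variables.
rng(0);                                  % fixed seed so the sketch is repeatable
X = randn(100, 6) * randn(6, 6);
varNames = {'V1', 'V2', 'V3', 'V4', 'V5', 'V6'};   % placeholder labels

% PCA on the standardised data: coeff holds the loadings (one row per
% variable), score holds the observations in component space.
[coeff, score] = pca(zscore(X));

% Biplot of the first two components: points are observations,
% vectors are variables.
figure;
biplot(coeff(:, 1:2), 'Scores', score(:, 1:2), 'VarLabels', varNames);
xlabel('Component 1');
ylabel('Component 2');
```

Passing `coeff(:, 1:3)` and `score(:, 1:3)` instead gives a three-dimensional biplot; a biplot cannot show all four retained components at once.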
Once Principal Component Analysis has been carried out and the four principal components analysed, the correlation plot takes on a clearer structure and shows consistent correlation between the variables.
Author: Shaad Akbar & Andrew Clinton