Principal Component Analysis for Clustering Stock Portfolios

Authors: Taylor Agarwal (University of Arizona), Henk Quelle (University of Arizona), Cooper Ryan (University of Arizona)


Motivation: Because the stock market is very large and depends on many factors, a catch-all method of determining stock performance continues to elude the public. Recent developments in machine learning have opened the door to new possibilities for a predictive algorithm. Principal Component Analysis (PCA) is one such tool that has allowed for the discovery of hidden interconnectivity in large data sets. We use PCA alongside k-means clustering to obtain groups of stocks with similar historical structure, which could assist in predictive stock management.

Results: In just over five seconds, our algorithm groups one hundred stocks from the New York Stock Exchange, which have a mean correlation of 0.2483, into fifty portfolios with a mean correlation of 0.4299. On average, the stocks in these fifty portfolios experienced price increases and decreases on the same day 65.34% of the time in the sample time frame of 4/17/2000 to 11/10/2017. The algorithm can be extended to encompass more stocks.
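The general idea of pairing PCA with k-means can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function names, the choice of three components and three clusters, the use of standardized returns, and the definition of "mean correlation" as the average pairwise correlation within each cluster are all assumptions made for the example.

```python
import numpy as np

def pca_loadings(returns, n_components):
    """One row of PCA loadings per stock; columns of `returns` are stocks."""
    # Standardize each stock so PCA acts on the correlation structure (assumption).
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    # Rows of vt are the principal directions in stock space.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    return (vt[:n_components] * s[:n_components, None]).T

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster ends up empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def mean_pairwise_corr(returns):
    """Average of the off-diagonal entries of the correlation matrix."""
    corr = np.corrcoef(returns, rowvar=False)
    return corr[np.triu_indices_from(corr, k=1)].mean()

def mean_within_cluster_corr(returns, labels):
    """Average the pairwise correlation over clusters with at least two stocks."""
    per_cluster = [
        mean_pairwise_corr(returns[:, labels == j])
        for j in np.unique(labels)
        if np.sum(labels == j) >= 2
    ]
    return float(np.mean(per_cluster))

# Synthetic daily returns: 12 hypothetical stocks driven by 3 hidden factors.
rng = np.random.default_rng(42)
factors = rng.normal(size=(500, 3))
group = np.repeat(np.arange(3), 4)
returns = 0.01 * (factors[:, group] + 0.5 * rng.normal(size=(500, 12)))

labels = kmeans(pca_loadings(returns, n_components=3), k=3)
overall = float(mean_pairwise_corr(returns))
within = mean_within_cluster_corr(returns, labels)
print(f"overall mean correlation: {overall:.4f}")
print(f"within-cluster mean correlation: {within:.4f}")
```

On data with real factor structure, the within-cluster figure should exceed the overall figure, which is the kind of improvement the Results paragraph reports.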

Implications: This algorithm provides a means to identify stocks with similar structure in both the short and long term. Stocks belonging to the same portfolio after clustering are shown to have positive and negative returns at the same time within the user-defined periods of time.
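One way to quantify how often stocks in the same portfolio move in the same direction is sketched below. The function name, the tiny example data, and the exact definition (the fraction of days on which two stocks' returns share a sign, averaged over all within-portfolio pairs, with a zero return counted as agreeing only with another zero) are assumptions for illustration; the paper's own measure may differ.

```python
import numpy as np
from itertools import combinations

def same_direction_rate(returns, labels):
    """Average fraction of days on which pairs of stocks in the same
    cluster have returns with the same sign."""
    signs = np.sign(returns)
    rates = []
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        for i, j in combinations(members, 2):
            rates.append(np.mean(signs[:, i] == signs[:, j]))
    return float(np.mean(rates))

# Four days of returns for three hypothetical stocks (columns).
example_returns = np.array([
    [ 0.01,  0.02, -0.01],
    [-0.02, -0.01,  0.03],
    [ 0.01, -0.01,  0.02],
    [-0.03, -0.02, -0.01],
])
example_labels = np.array([0, 0, 1])  # stocks 0 and 1 share a portfolio

rate = same_direction_rate(example_returns, example_labels)
print(rate)  # 0.75: the pair agrees on three of the four days
```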

How to Cite:

Agarwal, T. & Quelle, H. & Ryan, C., (2021) “Principal Component Analysis for Clustering Stock Portfolios”, Arizona Journal of Interdisciplinary Studies 7, p.64-75.



Published on
07 May 2021
Peer Reviewed