Portfolio Optimization Using Deep Reinforcement Learning Based on Modern Portfolio Theory
Keywords:
Levenberg-Marquardt algorithm, deep reinforcement learning, modern portfolio theory, stock portfolio optimization
Abstract
This study applies deep reinforcement learning to investment portfolio optimization and integrates it with modern portfolio theory. Modern portfolio theory is a mathematical framework for maximizing expected return at a given level of risk, but its limitations, such as the assumption of normally distributed returns and the neglect of transaction costs, point to the need for more adaptable methods in complex, dynamic financial markets. The research shows that, with deep reinforcement learning, investors can draw on real-time data and dynamic decision-making to build more efficient and robust investment strategies. It also examines challenges such as data quality, computational complexity, and the interpretability of deep reinforcement learning models. A Levenberg–Marquardt neural network algorithm for deep reinforcement learning is proposed for portfolio optimization based on historical data, using data from 10 highly liquid companies listed on the Tehran Stock Exchange over the period 2011 to 2021. The findings indicate that reinforcement learning algorithms can yield a 15% increase in cumulative return compared with traditional portfolio selection methods. The article further recommends that analysts and investors employ advanced techniques to enhance corporate performance stability and support better investment decisions. Finally, it outlines practical directions for future research in this domain.
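As a rough illustration of the modern portfolio theory baseline that the abstract contrasts with reinforcement learning, the sketch below computes fully invested mean-variance weights and the cumulative return they produce. It is not the study's method: the function names, the risk-aversion value, and the synthetic return data are all assumptions made for illustration, short positions are allowed, and transaction costs are ignored.

```python
import numpy as np


def mean_variance_weights(mu, cov, risk_aversion=5.0):
    """Maximize w'mu - (risk_aversion / 2) * w'cov w subject to sum(w) == 1.

    Closed-form solution with one Lagrange multiplier (eta) for the budget
    constraint. Short positions are allowed (an illustrative simplification).
    """
    ones = np.ones(len(mu))
    inv_mu = np.linalg.solve(cov, mu)      # cov^-1 mu
    inv_ones = np.linalg.solve(cov, ones)  # cov^-1 1
    eta = (ones @ inv_mu - risk_aversion) / (ones @ inv_ones)
    return (inv_mu - eta * inv_ones) / risk_aversion


def cumulative_return(returns, weights):
    """Cumulative return when the portfolio is rebalanced to `weights` each period."""
    portfolio_returns = returns @ weights
    return float(np.prod(1.0 + portfolio_returns) - 1.0)


# Synthetic daily returns for 10 hypothetical assets (not the Tehran Stock Exchange data).
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.0005, scale=0.01, size=(500, 10))

mu = returns.mean(axis=0)
cov = np.cov(returns, rowvar=False)

w_mv = mean_variance_weights(mu, cov)
w_eq = np.full(10, 0.1)  # equal-weight benchmark

print("mean-variance weights:", np.round(w_mv, 3))
print("cumulative return (mean-variance):", cumulative_return(returns, w_mv))
print("cumulative return (equal weight): ", cumulative_return(returns, w_eq))
```

Because the weights here are estimated and evaluated on the same sample, the printed comparison is in-sample only; a fair comparison against a learned policy would need a held-out evaluation period.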
Copyright (c) 2025 Yaser Rezaei Piteh Novi, Mehdi Safari Graili, Mohammad Taghi Kabiri, Meysam Arabzadeh (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.