Adaptive and Secure IAM Policy Optimization in AWS Using Reinforcement Learning

Authors

  • Abdualrahman Mohammed Talib, Department of Computer Science, College of Science, University of Diyala, Diyala, Iraq
  • Ghassan Sabeeh Mahmood, Department of Computer Science, College of Science, University of Diyala, Diyala, Iraq
  • Hazim Noman Abed, College of Graduate Studies, Universiti Tenaga Nasional (UNITEN), Malaysia

DOI:

https://doi.org/10.29304/jqcsm.2026.18.22654

Keywords:

Cloud Security, Reinforcement Learning, Proximal Policy Optimization (PPO), Adaptive Policy Management

Abstract

Cloud computing environments such as Amazon Web Services (AWS) require accurate and dependable Identity and Access Management (IAM) policies to ensure secure and uninterrupted service operation. Traditional static IAM management practices adapt poorly to evolving workloads, while learning-based approaches often introduce uncertainty and operational risk. This research proposes an adaptive IAM policy optimization framework based on reinforcement learning, specifically Proximal Policy Optimization (PPO), combined with deterministic safety guardrails that preserve business continuity. The framework was evaluated on 1,378 real IAM policy files and 23,125 real CloudTrail logs obtained from a managed AWS environment. Empirical results show that the proposed method achieves 96.5% precision in identifying high-risk permissions and 100% recall on essential services, and reduces unnecessary privileges by 78.9% as measured by the Least Privilege Reduction Score (LPRS). These results support the view that adaptive IAM optimization is safe enough to run in production cloud environments when deterministic safety enforcement is in place.
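
As a rough illustration of how a deterministic guardrail can sit between a learned agent's proposal and the final policy, consider the minimal Python sketch below. It is not the authors' implementation. The function names, the sample permissions, and the LPRS formula (taken here as the share of originally granted actions that end up removed) are all assumptions made for illustration, since the abstract does not define them. The PPO agent is replaced by a simple stand-in that proposes removing every action not observed in the usage logs.

```python
from typing import Set


def apply_guardrail(proposed_removals: Set[str], protected: Set[str]) -> Set[str]:
    """Deterministic guardrail (hypothetical): never remove a protected action."""
    return proposed_removals - protected


def lprs(granted: Set[str], retained: Set[str]) -> float:
    """Assumed LPRS: fraction of originally granted actions that were removed."""
    if not granted:
        return 0.0
    return (len(granted) - len(granted & retained)) / len(granted)


# Hypothetical sample data, not taken from the paper's dataset.
granted = {
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteBucket",
    "iam:PassRole",
    "ec2:TerminateInstances",
}
used = {"s3:GetObject", "s3:PutObject"}   # actions seen in usage logs
protected = {"iam:PassRole"}              # actions essential services rely on

# Stand-in for the learned agent: propose removing everything unused.
proposed = granted - used

# The guardrail vetoes any removal that touches a protected action.
safe_removals = apply_guardrail(proposed, protected)
retained = granted - safe_removals
score = lprs(granted, retained)

print(sorted(retained))
print(f"LPRS = {score:.1%}")
```

In this sketch the guardrail is a plain set difference, so its behaviour does not depend on anything the agent has learned. That is the property the abstract attributes to deterministic safety enforcement.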

Author Biographies

Ghassan Sabeeh Mahmood, Department of Computer Science, College of Science, University of Diyala, Diyala, Iraq

Ghassan Sabeeh Mahmood is a faculty member at the Department of Computer Science, University of Diyala, Iraq. His research interests include cloud computing, cybersecurity, and intelligent systems.

Hazim Noman Abed, College of Graduate Studies, Universiti Tenaga Nasional (UNITEN), Malaysia

Hazim Noman Abed is a postgraduate researcher at Universiti Tenaga Nasional (UNITEN), Malaysia. His research focuses on cloud security, reinforcement learning applications, and secure distributed systems.

Published

2026-06-27

How to Cite

Mohammed Talib, Abdualrahman, Mahmood, G. S., & Abed, H. N. (2026). Adaptive and Secure IAM Policy Optimization in AWS Using Reinforcement Learning. Journal of Al-Qadisiyah for Computer Science and Mathematics, 18(2), Comp 168–182. https://doi.org/10.29304/jqcsm.2026.18.22654

Section

Computer Articles