AI-Driven Resource Scaling in Cloud Computing for Unpredictable Workloads

TLDR

This paper proposes an intelligent resource-scaling model that uses attention mechanisms and the Transformer architecture to predict future workload demands and dynamically adjust resources in large-scale distributed systems. Simulations demonstrate its potential to optimize cloud resources, enhance scalability, and reduce costs.

Abstract

In cloud computing environments, the dynamic nature of workloads makes resource allocation and scaling difficult. This paper proposes an intelligent resource-scaling model that uses attention mechanisms and the Transformer architecture to predict future workload demands and dynamically adjust resources in large-scale distributed systems. The model takes historical workload data (CPU, memory, disk, and network usage) and applies preprocessing steps, namely normalization, missing-value imputation, and time-series representation, to ensure high-quality inputs for the predictive model. The core of the model combines self-attention, multi-head attention, and positional encoding to capture temporal dependencies and long-term relationships in the data. The Transformer-based model predicts resource demands at future time steps, and a dynamic resource allocation algorithm uses these predictions to scale resources efficiently. The proposed approach is validated through simulations, which demonstrate its potential to optimize cloud resources, enhance scalability, and reduce costs. Results show that the model outperforms traditional approaches, with a 15% reduction in resource wastage and a 20% improvement in resource utilization. The model can be applied to a variety of cloud services to improve performance and cost-efficiency in distributed computing environments.
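The preprocessing and allocation steps named in the abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which imputation or normalization method is used, so forward-fill and min-max scaling are stand-ins, and every function name, the `headroom` margin, and the `capacity_per_instance` parameter are hypothetical. The Transformer predictor itself is omitted; the scaling rule simply consumes whatever demand value such a predictor would return.

```python
import math
from typing import List, Optional, Tuple


def forward_fill(series: List[Optional[float]]) -> List[float]:
    """Impute missing samples (None) with the last observed value.

    Assumption: forward-fill; leading gaps take the first observed value.
    """
    first = next((v for v in series if v is not None), None)
    if first is None:
        raise ValueError("series has no observed values")
    filled, last = [], first
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def min_max_normalize(series: List[float]) -> List[float]:
    """Rescale a metric to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def sliding_windows(
    series: List[float], window: int, horizon: int = 1
) -> List[Tuple[List[float], float]]:
    """Turn a series into (history, target) pairs for a sequence model.

    Each history holds `window` consecutive samples; the target is the
    sample `horizon` steps after the end of that history.
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        target_index = start + window + horizon - 1
        pairs.append((series[start:start + window], series[target_index]))
    return pairs


def instances_needed(
    predicted_demand: float,
    capacity_per_instance: float,
    headroom: float = 0.2,
    min_instances: int = 1,
) -> int:
    """Hypothetical scaling rule: cover predicted demand plus a safety margin."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    required = predicted_demand * (1.0 + headroom) / capacity_per_instance
    return max(min_instances, math.ceil(required))


if __name__ == "__main__":
    cpu = [None, 40.0, None, 60.0, 80.0]   # raw CPU usage with gaps
    clean = min_max_normalize(forward_fill(cpu))
    print(sliding_windows(clean, window=3))
    print(instances_needed(330.0, capacity_per_instance=100.0))
```

In this sketch a predicted demand of 330 units with 100 units per instance and 20% headroom yields 4 instances. The headroom trades some wastage for protection against under-prediction; a real allocator would likely also add cooldown periods and an upper bound on the instance count.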