The Vertical Pod Autoscaler (VPA) in Kubernetes automatically adjusts the CPU and memory requests and limits of the containers within a pod based on historical and real-time resource usage. In this scenario, where a single-replica stateful application needs more CPU during peak times, VPA can dynamically increase the CPU allocated to the pod when needed and potentially decrease it during off-peak periods to optimize resource utilization and cost efficiency.
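As a sketch, a VPA object targeting a single-replica stateful workload might look like the following (the names `my-app-vpa` and `my-stateful-app` and the resource bounds are illustrative placeholders, not from the question):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: my-stateful-app    # placeholder workload name
  updatePolicy:
    updateMode: "Auto"       # VPA evicts the pod and recreates it with updated requests
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 250m
      maxAllowed:            # cap recommendations so the pod still fits on a node
        cpu: "4"
        memory: 8Gi
```

With `updateMode: "Auto"`, GKE applies the recommended CPU and memory requests automatically; `"Off"` can be used instead to only surface recommendations without restarting the pod.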
Option A: Cluster autoscaling adds or removes nodes in your GKE cluster based on the resource requests of your pods. While it can help with overall cluster capacity, it doesn't directly address the need for more CPU for a specific pod.
Option C: Horizontal Pod Autoscaler (HPA) scales the number of pod replicas based on observed CPU utilization or other select metrics. Since the application can only have one replica, HPA is not suitable.
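For contrast, an HPA manifest makes the mismatch visible: it scales the replica *count*, which the single-replica constraint rules out (names and thresholds below are placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app           # placeholder workload name
  minReplicas: 2           # HPA adds/removes replicas -- no help if only one replica is allowed
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```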
Option D: Node auto-provisioning is similar to cluster autoscaling, automatically creating and deleting node pools based on workload demands. It doesn't directly manage the resources of individual pods.
Reference to Google Cloud Certified - Associate Cloud Engineer Documents:
The functionality and use cases of the Vertical Pod Autoscaler (VPA) are detailed in the Google Kubernetes Engine documentation, specifically within the resource management and autoscaling sections. Understanding how VPA can dynamically adjust pod resources is relevant to the Associate Cloud Engineer certification.
QUESTION NO: 48 of 50
You host your website on Compute Engine. The number of global users visiting your website is rapidly expanding. You need to minimize latency and support user growth in multiple geographical regions. You also want to follow Google-recommended practices and minimize operational costs. Which two actions should you take?
(Choose 2 answers)
A. Deploy all of your VMs in a single Google Cloud region with the largest available CIDR range.
B. Deploy your VMs in multiple Google Cloud regions closest to your users’ geographical locations.
C. Use an external Application Load Balancer in Regional mode.
D. Use an external Application Load Balancer in Global mode.
E. Use a Network Load Balancer.
Answer: BD
To minimize latency for a global user base, it's crucial to serve users from regions geographically close to them. Deploying VMs in multiple Google Cloud regions (Option B) achieves this by reducing the network distance and thus the round-trip time for requests.
To support user growth and provide a single point of entry with global reach, a global external Application Load Balancer (Option D) is the recommended choice for web applications. It distributes traffic to backend instances across multiple regions based on user proximity, capacity, and health. Application Load Balancers also offer features like SSL termination, content-based routing, and security policies, which are important for modern web applications.
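A minimal sketch of wiring multi-region backends into a global external Application Load Balancer with `gcloud` might look like this (all resource names, regions, and the pre-existing instance groups and health check are placeholders for illustration):

```shell
# Assumes managed instance groups web-ig-us and web-ig-eu already serve the site,
# and a health check named web-health-check exists.
gcloud compute backend-services create web-backend \
    --global \
    --load-balancing-scheme=EXTERNAL_MANAGED \
    --protocol=HTTP \
    --health-checks=web-health-check

# Attach one backend per region; the load balancer routes users to the nearest healthy one.
gcloud compute backend-services add-backend web-backend \
    --global \
    --instance-group=web-ig-us \
    --instance-group-region=us-central1
gcloud compute backend-services add-backend web-backend \
    --global \
    --instance-group=web-ig-eu \
    --instance-group-region=europe-west1

# URL map, proxy, and global forwarding rule complete the frontend.
gcloud compute url-maps create web-map --default-service=web-backend
gcloud compute target-http-proxies create web-proxy --url-map=web-map
gcloud compute forwarding-rules create web-rule \
    --global \
    --load-balancing-scheme=EXTERNAL_MANAGED \
    --target-http-proxy=web-proxy \
    --ports=80
```

Because the forwarding rule is global, users in every region hit one anycast IP, and traffic is steered to the closest region with capacity, which is what minimizes latency here.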
* Option A: Deploying in a single region, regardless of the CIDR range, will result in high latency for users far from that region.
* Option C: A regional external Application Load Balancer only distributes traffic within a single region, not across multiple global regions, thus not effectively minimizing latency for all global users.
* Option E: Network Load Balancers operate at Layer 4 and lack the application-level routing and features of an Application Load Balancer (SSL termination, content-based routing, security policies), which are generally preferred for web traffic. While proxy Network Load Balancers can be global, an Application Load Balancer is better suited to this HTTP(S) scenario.
Reference to Google Cloud Certified - Associate Cloud Engineer Documents:
The concepts of multi-region deployments for low latency and the use of global load balancers (specifically Application Load Balancers for web traffic) for global reach and traffic management are core topics in the Compute Engine and Load Balancing sections of the Google Cloud documentation, which are essential for the Associate Cloud Engineer certification. The best practices for global application deployment are emphasized.