{"id":9548,"date":"2022-09-30T14:28:20","date_gmt":"2022-09-30T07:28:20","guid":{"rendered":"https:\/\/gcloudvn.com\/?p=9548"},"modified":"2023-07-12T14:18:18","modified_gmt":"2023-07-12T07:18:18","slug":"introducing-kubernetes-control-plane-metrics-in-gke","status":"publish","type":"post","link":"https:\/\/gcloudvn.com\/en\/kienthuc\/introducing-kubernetes-control-plane-metrics-in-gke\/","title":{"rendered":"Introducing Kubernetes Control Plane metrics in GKE"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">An essential aspect of operating any application is the ability to observe the health and performance of that application and of the underlying infrastructure in order to quickly resolve issues as they arise. <a href=\"https:\/\/gcloudvn.com\/en\/google-kubernetes-engine-gke\/\">Google Kubernetes Engine<\/a> (GKE) already provides audit logs, operations logs, and metrics, along with out-of-the-box dashboards and automatic error reporting, to facilitate running reliable applications at scale. 
Using these logs and metrics, Cloud Operations provides alerts, monitoring dashboards, and a Logs Explorer to quickly detect, troubleshoot, and resolve issues.<\/span><\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_80 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" 
href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/introducing-kubernetes-control-plane-metrics-in-gke\/#Gioi_thieu_so_lieu_Control_Plane_Kubernetes_va_ly_do_chung_quan_trong\" >Introducing Kubernetes control plane metrics and why they matter<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/introducing-kubernetes-control-plane-metrics-in-gke\/#Hien_thi\" >Displayed in context<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/introducing-kubernetes-control-plane-metrics-in-gke\/#PromQL_compatible\" >PromQL compatible<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/introducing-kubernetes-control-plane-metrics-in-gke\/#Ho_tro_cua_ben_thu_ba\" >Third-party support<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/introducing-kubernetes-control-plane-metrics-in-gke\/#Gia_ca\" >Pricing<\/a><\/li><\/ul><\/nav><\/div>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Gioi_thieu_so_lieu_Control_Plane_Kubernetes_va_ly_do_chung_quan_trong\"><\/span><b>Introducing Kubernetes control plane metrics and why they matter<\/b><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">In addition to these existing sources of telemetry data, we are excited to announce that we are now exposing Kubernetes control plane metrics, which are now Generally Available. 
With GKE, Google fully manages the Kubernetes control plane; however, when troubleshooting issues it can be helpful to have access to certain metrics emitted by the Kubernetes control plane.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">As part of our vision to make Kubernetes easier to use and easier to operate, these control plane metrics are directly integrated with Cloud Monitoring, so you don't need to manage any metric collection or scrape config.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">For example, to understand the health of the API server, you can use metrics like apiserver_request_total and apiserver_request_duration_seconds to track the load the API server is experiencing, the fraction of API server requests that return errors, and the response latency for requests received by the API server. In addition, apiserver_storage_objects can be very useful for monitoring the saturation of the API server, especially if you are using a custom controller. Break this metric down by the resource label to find out which Kubernetes custom resources or controllers are problematic.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">When a pod is created, it is initially placed in a \"pending\" state, indicating that it hasn't yet been scheduled on a node. In a healthy cluster, pending pods are scheduled on a node relatively quickly, providing the workload with the resources it needs to run. However, a sustained increase in the number of pending pods may indicate a problem scheduling those pods, which may be caused by insufficient resources or inappropriate configuration. Metrics like scheduler_pending_pods, scheduler_schedule_attempts_total, scheduler_preemption_attempts_total, scheduler_preemption_victims, and scheduler_scheduling_attempt_duration_seconds can alert you to potential scheduling issues, so you can act quickly to ensure sufficient resources are available for your pods. 
Using these metrics in combination will help you better understand the health of your cluster. For instance, if scheduler_preemption_attempts_total goes up, it means that there are higher priority pods waiting to be scheduled and the scheduler is preempting some running pods. However, if the value of scheduler_pending_pods is also increasing, this may indicate that you don't have enough resources to accommodate the higher priority pods.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">If the Kubernetes scheduler is still unable to find a suitable node for a pod, then the pod will eventually be marked as unschedulable. Kubernetes control plane metrics give you visibility into pod scheduling errors and unschedulable pods. A spike in either means that the Kubernetes scheduler isn't able to find an appropriate node on which to run many of your pods, which may ultimately impair the performance of your application. In many cases, a high rate of unschedulable pods will not resolve itself until you take some action to address the underlying cause. A good place to start troubleshooting the issue is to look for recent FailedScheduling events. (If you have GKE system logs enabled, then all Kubernetes events are available in Cloud Logging.) These FailedScheduling events include a message (for instance, \"0\/6 nodes are available: 6 Insufficient cpu.\") that describes exactly why the pod could not be scheduled on any node, giving you guidance on how to address the problem.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">A final example: if you see that scheduling jobs is very slow, one possible cause is that a third-party webhook is introducing significant latency, causing the API server to take a long time to schedule a job. 
Kubernetes control plane metrics such as apiserver_admission_webhook_admission_duration_seconds can expose the admission webhook latency, helping you identify the root cause of slow job scheduling and mitigate the issue.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Hien_thi\"><\/span><b>Displayed in context<\/b><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Not only are we making these additional Kubernetes control plane metrics available, we're also excited to announce that all of these metrics are displayed in the Kubernetes Engine section of the Cloud Console, making it easy to identify and investigate issues in-context as you're managing your GKE clusters.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">To view these control plane metrics, go to the Kubernetes clusters section of the Cloud Console, select the \"Observability\" tab, and select \"Control plane\":<\/span><\/p>\n<p style=\"text-align: justify;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-9540\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/09\/control_plane_metrics_screenshot.max-2200x2200-1-1024x499.png\" alt=\"Control Plane Kubernetes metrics in Google Kubernetes Engine (GKE) 1\" width=\"600\" height=\"292\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/09\/control_plane_metrics_screenshot.max-2200x2200-1-1024x499.png 1024w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/09\/control_plane_metrics_screenshot.max-2200x2200-1-300x146.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/09\/control_plane_metrics_screenshot.max-2200x2200-1-768x374.png 768w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/09\/control_plane_metrics_screenshot.max-2200x2200-1-1536x748.png 1536w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/09\/control_plane_metrics_screenshot.max-2200x2200-1-2048x997.png 
2048w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/09\/control_plane_metrics_screenshot.max-2200x2200-1-18x9.png 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Since all Kubernetes control plane metrics are ingested into Cloud Monitoring, you can create alerting policies in Cloud Alerting so you're notified as soon as something needs your attention.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"PromQL_compatible\"><\/span><b>PromQL compatible<\/b><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">When you enable Kubernetes control plane metrics for your GKE clusters, all metrics are collected using Google Cloud Managed Service for Prometheus. This means the metrics are sent to Cloud Monitoring in the same GCP project as your Kubernetes cluster and can be queried using PromQL via the Cloud Monitoring API and Metrics Explorer.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">For example, you can monitor any spike in 99th percentile API server response latency using this PromQL query:<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">sum by (instance, verb) (histogram_quantile(0.99, rate(apiserver_request_duration_seconds_bucket{cluster=\"cluster-name\"}[5m])))<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Ho_tro_cua_ben_thu_ba\"><\/span><b>Third-party support<\/b><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">If you monitor your GKE cluster with a third-party observability tool, that tool can ingest these Kubernetes control plane metrics using the Cloud Monitoring API.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 
Example: If you are a Datadog">
400;\">For example, if you are a Datadog customer with Kubernetes control plane metrics enabled for your GKE cluster, Datadog provides enhanced visualizations that include Kubernetes control plane metrics from the API server, scheduler, and controller manager.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Gia_ca\"><\/span><b>Pricing<\/b><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">All Kubernetes control plane metrics are charged at the standard price for metrics ingested from Google Cloud Managed Service for Prometheus.<\/span><\/p>\n<p style=\"text-align: justify;\">If your business is interested in\u00a0<a href=\"https:\/\/gcloudvn.com\/en\/google-cloud-platform\/\">Google Cloud<\/a>\u00a0Platform, you can contact Gimasys, a Google Cloud Premier Partner, for consulting on solutions tailored to the unique needs of your business. Contact us now:<\/p>\n<ul style=\"text-align: justify;\">\n<li aria-level=\"1\"><b>Gimasys \u2013 Google Cloud Premier Partner<\/b><\/li>\n<li aria-level=\"1\"><b>Hotline:\u00a0<\/b>Ha Noi:\u00a00987 682 505\u00a0\u2013 Ho Chi Minh:\u00a00974 417 099<\/li>\n<li aria-level=\"1\"><b>Email:\u00a0<\/b>gcp@gimasys.com<\/li>\n<\/ul>\n<p style=\"text-align: right;\"><strong>Source: <a href=\"https:\/\/gcloudvn.com\/en\/\">Gimasys<\/a><\/strong><\/p>","protected":false},"excerpt":{"rendered":"<p>An essential aspect of operating any application is the ability to observe the health and performance of that application and of the underlying infrastructure in order to
quickly&hellip;<\/p>","protected":false},"author":2,"featured_media":9544,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1,135],"tags":[],"class_list":["post-9548","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-kienthuc","category-google-cloud-platform","entry","has-media"],"_links":{"self":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/9548","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/comments?post=9548"}],"version-history":[{"count":0,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/9548\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media\/9544"}],"wp:attachment":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media?parent=9548"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/categories?post=9548"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/tags?post=9548"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}