{"id":10793,"date":"2023-02-16T15:57:19","date_gmt":"2023-02-16T08:57:19","guid":{"rendered":"https:\/\/gcloudvn.com\/?p=10793"},"modified":"2023-09-13T16:24:06","modified_gmt":"2023-09-13T09:24:06","slug":"automate-data-governance-extend-your-data-fabric-with-dataplex-biglake-integration","status":"publish","type":"post","link":"https:\/\/gcloudvn.com\/en\/kienthuc\/automate-data-governance-extend-your-data-fabric-with-dataplex-biglake-integration\/","title":{"rendered":"Automate data governance, extend your data fabric with Dataplex-BigLake integration"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Unlocking the full potential of data requires breaking down the silos between open-source data formats and data warehouses. At the same time, it is critical to enable <\/span><span style=\"font-weight: 400;\">data governance<\/span><span style=\"font-weight: 400;\"> teams to apply policies regardless of where the data resides, whether in file storage or columnar storage.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Today, data governance teams have to become subject-matter experts on every storage system where the company's data resides. Since February 2022, Google Dataplex has provided a unified place to apply policies, which are propagated across both raw storage and data warehouses in <a href=\"https:\/\/gcloudvn.com\/en\/google-cloud-platform\/\">GCP<\/a>. 
Instead of specifying policies in multiple places and carrying the cognitive load of translating policies from \u201cwhat you want the storage system to do\u201d to \u201chow your data will behave\u201d, you can use Dataplex as a single point for clear policy management. Now, Google is making it easier for you with BigLake.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Earlier this year, Google made BigLake generally available. BigLake unifies the data fabric between data lakes and data warehouses by extending <\/span><a href=\"https:\/\/gcloudvn.com\/en\/bigquery\/\"><span style=\"font-weight: 400;\">BigQuery<\/span><\/a><span style=\"font-weight: 400;\"> storage to open file formats. Today, we announce BigLake integration with <\/span><span style=\"font-weight: 400;\">Dataplex<\/span><span style=\"font-weight: 400;\"> (available in preview). This integration eliminates the configuration steps for admins taking advantage of BigLake and lets them manage policies across GCS and BigQuery from a unified console.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Previously, you could point Dataplex at a <\/span><span style=\"font-weight: 400;\">Google Cloud Storage (GCS)<\/span><span style=\"font-weight: 400;\"> bucket, and Dataplex would <\/span><span style=\"font-weight: 400;\">discover<\/span><span style=\"font-weight: 400;\"> and extract all metadata from the data lake and register this metadata in BigQuery (and Dataproc Metastore, Data Catalog) for analysis and search. 
With the BigLake integration, we are building on this capability by allowing an \u201cupgrade\u201d of a bucket asset: instead of just creating external tables in BigQuery for analysis, Dataplex will create policy-capable BigLake tables!<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">The immediate implication is that admins can now assign column, row, and table policies to the BigLake tables auto-created by Dataplex because, with BigLake, the infrastructure layer (GCS) is separate from the analysis layer (BigQuery). Dataplex will handle the creation of a BigQuery connection and a BigQuery publishing dataset and ensure the BigQuery service account has the correct permissions on the bucket.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-10858 size-full\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2023\/02\/Screenshot_66.png\" alt=\"Automate administration, expand data structures with Google Dataplex integration - BigLake 1\" width=\"600\" height=\"409\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2023\/02\/Screenshot_66.png 600w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2023\/02\/Screenshot_66-300x205.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2023\/02\/Screenshot_66-474x324.png 474w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2023\/02\/Screenshot_66-18x12.png 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/>But wait, there\u2019s more. <\/span><span style=\"font-weight: 400;\">With this release of Dataplex, we are also introducing advanced logging called governance logs. 
Governance logs let you track the exact state of policy propagation to tables and columns, adding a level of detail that goes beyond the high-level \u201cstatus\u201d for the bucket and into fine-grained status and logs for tables and columns.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><b>What\u2019s next?<\/b><\/h2>\n<ul style=\"text-align: justify;\">\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">We have updated our documentation for <\/span><span style=\"font-weight: 400;\">managing buckets<\/span><span style=\"font-weight: 400;\"> and added detail on policy propagation and the upgrade process.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Stay tuned for an exciting roadmap ahead, with more automation around policy management.<\/span><\/li>\n<\/ul>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">For more information, please visit:<\/span><\/p>\n<ul style=\"text-align: justify;\">\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Google Cloud Dataplex<\/span><\/li>\n<\/ul>\n<p style=\"text-align: justify;\"><em>Contact Gimasys for advice on a transformation strategy that is right for your business situation and to experience the free Google Cloud Platform service:<\/em><\/p>\n<ul style=\"text-align: justify;\">\n<li><strong>Hotline: Hanoi: 0987 682 505 \u2013 Ho Chi Minh: 0974 417 099<\/strong><\/li>\n<li><strong>Email: gcp@gimasys.com<\/strong><\/li>\n<\/ul>\n<p style=\"text-align: right;\"><strong>Source: <\/strong>Gimasys<\/p>","protected":false},"excerpt":{"rendered":"<p>Harnessing the full potential of data requires breaking down the barriers between open source data formats and data warehouses. 
At the same time, it is important to enable the data governance team\u2026<\/p>","protected":false},"author":2,"featured_media":10857,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1,135],"tags":[],"class_list":["post-10793","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-kienthuc","category-google-cloud-platform","entry","has-media"],"_links":{"self":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/10793","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/comments?post=10793"}],"version-history":[{"count":0,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/10793\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media\/10857"}],"wp:attachment":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media?parent=10793"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/categories?post=10793"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/tags?post=10793"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}