{"id":8372,"date":"2021-12-22T15:57:13","date_gmt":"2021-12-22T08:57:13","guid":{"rendered":"https:\/\/gcloudvn.com\/?p=8372"},"modified":"2023-03-24T11:42:19","modified_gmt":"2023-03-24T04:42:19","slug":"toi-uu-bigquery-voi-nguon-du-lieu-trong-google-cloud-vmware-engine","status":"publish","type":"post","link":"https:\/\/gcloudvn.com\/en\/kienthuc\/toi-uu-bigquery-voi-nguon-du-lieu-trong-google-cloud-vmware-engine\/","title":{"rendered":"Optimizing BigQuery with data sources in Google Cloud VMware Engine"},"content":{"rendered":"<figure id=\"attachment_8377\" aria-describedby=\"caption-attachment-8377\" style=\"width: 600px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-8377\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/z3044313174053_dbbb503255bd4243d5bbe0e2e5f39ccd-300x130.jpg\" alt=\"Optimizing BigQuery with data sources in Google Cloud VMware Engine\" width=\"600\" height=\"261\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/z3044313174053_dbbb503255bd4243d5bbe0e2e5f39ccd-300x130.jpg 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/z3044313174053_dbbb503255bd4243d5bbe0e2e5f39ccd-1024x445.jpg 1024w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/z3044313174053_dbbb503255bd4243d5bbe0e2e5f39ccd-768x334.jpg 768w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/z3044313174053_dbbb503255bd4243d5bbe0e2e5f39ccd-1536x668.jpg 1536w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/z3044313174053_dbbb503255bd4243d5bbe0e2e5f39ccd-2048x891.jpg 2048w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/z3044313174053_dbbb503255bd4243d5bbe0e2e5f39ccd-18x8.jpg 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><figcaption id=\"caption-attachment-8377\" class=\"wp-caption-text\">Optimizing BigQuery with data sources in Google Cloud VMware Engine<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">For customers who 
have migrated data sources from on-premises systems to <\/span><span style=\"font-weight: 400;\">Google Cloud VMware<\/span> <span style=\"font-weight: 400;\">Engine<\/span> (<a href=\"https:\/\/cloud.google.com\/vmware-engine\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/cloud.google.com\/vmware-engine<\/a>) <span style=\"font-weight: 400;\">and want to use the data and analytics services provided by Google Cloud, a key goal is to leverage Google Cloud analytics with their data sets. If you are an IT decision maker or data architect who wants to quickly leverage the power of your data with Google analytics, this blog describes methods for accessing your data in <\/span><a href=\"https:\/\/gcloudvn.com\/en\/bigquery\/\"><span style=\"font-weight: 400;\">BigQuery<\/span><\/a><span style=\"font-weight: 400;\">, where you can perform advanced analytics and machine learning on your data sets.<\/span><\/p>\n<p><strong>&gt; Reference:<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/introducing-datastream-for-bigquery\/\">Introducing Datastream for Google BigQuery<\/a><\/li>\n<li><a href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/phan-tich-khoi-du-lieu-lon-voi-bigquery-va-google-sheets\/\">Analyze Big Data with BigQuery and Google Sheets<\/a><\/li>\n<li><a href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/cach-di-chuyen-kho-du-lieu-on-premises-sang-bigquery-tren-google-cloud\/\">How to migrate an on-premises data warehouse to BigQuery on Google Cloud<\/a><\/li>\n<\/ul>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Tai_sao\"><\/span><strong>Why?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Data consumption and analytics are at the forefront of technology. Customers today use and manage large amounts of data and resources. These challenges create an opportunity for Google Cloud to help manage and understand your existing databases without the need for costly re-architecture of your source data or data locations. This blog covers approaches to accessing Google Cloud data and analytics services using your existing data <\/span><b>without having to re-architect your database<\/b><span style=\"font-weight: 400;\">. Once your data sources are in Google Cloud VMware Engine, Google&#039;s highly available and fault-tolerant infrastructure can be leveraged to enhance the performance of your data pipelines. These solutions aim to reduce the time it takes to extract value from your data sets with cloud-native analytics available through BigQuery.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">This Google Cloud VMware Engine migration solution brings advantages to all parts of data operations. 
Database administrators (DBAs) and virtual infrastructure\/cloud administrators can work in the cloud in familiar environments similar to those on-premises. On-premises infrastructure teams can enable data science, AI, and machine learning (ML) teams to use familiar toolsets. These teams now have access to Google Cloud AI\/ML\/data analytics capabilities for their on-premises data.<\/span><\/p>\n<figure id=\"attachment_8373\" aria-describedby=\"caption-attachment-8373\" style=\"width: 600px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-8373\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/1_bq_vmware.max-2800x2800-1-300x143.png\" alt=\"This Google Cloud VMware Engine migration solution provides advantages for parts of data operations\" width=\"600\" height=\"285\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/1_bq_vmware.max-2800x2800-1-300x143.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/1_bq_vmware.max-2800x2800-1-1024x487.png 1024w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/1_bq_vmware.max-2800x2800-1-768x365.png 768w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/1_bq_vmware.max-2800x2800-1-1536x730.png 1536w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/1_bq_vmware.max-2800x2800-1-2048x974.png 2048w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/1_bq_vmware.max-2800x2800-1-18x9.png 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><figcaption id=\"caption-attachment-8373\" class=\"wp-caption-text\">This Google Cloud VMware Engine migration solution provides advantages for parts of data operations<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">For example, if you want to explore cross-selling opportunities in your products, the first step is to ensure that the payment and product usage data sets across your products are connected for analysis. 
The DBA team will identify these data sets and the infrastructure team will enable access to these sources. The application team then copies this data to BigQuery and uses approaches such as <\/span><span style=\"font-weight: 400;\">BigQuery ML Recommendations<\/span><span style=\"font-weight: 400;\"> (<a href=\"https:\/\/cloud.google.com\/bigquery-ml\/docs\/bigqueryml-mf-explicit-tutorial\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/cloud.google.com\/bigquery-ml\/docs\/bigqueryml-mf-explicit-tutorial<\/a>) to explore cross-selling opportunities. Another example use case is forecasting usage growth for operations and growth planning. Once your sales data is replicated in BigQuery, advanced <\/span><a href=\"https:\/\/cloud.google.com\/bigquery-ml\/docs\/arima-multiple-time-series-forecasting-tutorial\" rel=\"nofollow noopener\" target=\"_blank\"><span style=\"font-weight: 400;\">time series forecasting<\/span><\/a><span style=\"font-weight: 400;\"> methods become available for your dataset.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Dieu_nay_bao_gom_nhung_gi\"><\/span><strong>What does this include?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">This post covers methods for replicating your relational datasets in BigQuery in a private and secure way using <\/span><span style=\"font-weight: 400;\">Google Cloud Data Fusion (<a href=\"https:\/\/cloud.google.com\/data-fusion\/\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/cloud.google.com\/data-fusion\/<\/a>)<\/span><span style=\"font-weight: 400;\"> or <\/span><span style=\"font-weight: 400;\">Google Cloud Datastream (<a href=\"https:\/\/cloud.google.com\/datastream\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/cloud.google.com\/datastream<\/a>)<\/span><span style=\"font-weight: 400;\">. Cloud Data Fusion is an ETL tool that supports many different types of data pipelines. 
Datastream is a change data capture and replication service. With either service, data stays within your projects in Google Cloud, and internal IPs are used to access the data. We&#039;ll focus on real-time replication so you can continuously access data from operational data stores, such as SQL Server, MySQL, and Oracle, in BigQuery.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Moving data from your data sources to the cloud and maintaining the data pipeline to your data warehouse through Extract, Transform, Load (ETL) is a time-consuming activity. An alternative approach is ELT (Extract, Load, Transform). ELT methods load data into the target system (e.g. BigQuery) before transforming the data. The ELT process is often preferred over the traditional ETL process because it is simpler to implement and loads data faster.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">With your data sets now in Google Cloud, data teams can use Cloud Data Fusion and Datastream over the high-speed, low-latency Google Cloud network to replicate or move data from your VMware infrastructure to different destinations in <a href=\"https:\/\/gcloudvn.com\/en\/google-cloud-platform\/\">Google Cloud Platform<\/a>, such as Cloud Storage buckets or BigQuery.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">For simplicity, we assume that all services are used in the same project. 
We&#039;ll also discuss some pricing implications of moving data from Google Cloud VMware Engine to on-premises or another virtual private cloud (VPC).<\/span><\/p>\n<figure id=\"attachment_8374\" aria-describedby=\"caption-attachment-8374\" style=\"width: 600px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-8374\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/2_bq_vmware.1000064920000658.max-2800x2800-1-300x99.png\" alt=\"Replicate the dataset in BigQuery\" width=\"600\" height=\"197\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/2_bq_vmware.1000064920000658.max-2800x2800-1-300x99.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/2_bq_vmware.1000064920000658.max-2800x2800-1-1024x337.png 1024w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/2_bq_vmware.1000064920000658.max-2800x2800-1-768x253.png 768w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/2_bq_vmware.1000064920000658.max-2800x2800-1-1536x505.png 1536w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/2_bq_vmware.1000064920000658.max-2800x2800-1-2048x674.png 2048w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/2_bq_vmware.1000064920000658.max-2800x2800-1-18x6.png 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><figcaption id=\"caption-attachment-8374\" class=\"wp-caption-text\">Replicate the dataset in BigQuery<\/figcaption><\/figure>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Cloud_Data_Fusion\"><\/span><strong>Cloud Data Fusion:\u00a0<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Cloud Data Fusion provides an intuitive point-and-click interface that enables ETL\/ELT data pipeline deployment without code. 
Cloud Data Fusion also provides a replication accelerator that allows you to replicate your tables into BigQuery.\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Cloud Data Fusion internally sets up a tenant project with its own VPCs to manage Cloud Data Fusion resources. To access data sources in Google Cloud VMware Engine using Cloud Data Fusion, we use a reverse proxy on the primary VPC. This is depicted in the image below.<\/span><\/p>\n<p style=\"text-align: justify;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-8375\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/3_bq_vmware.max-2800x2800-1-300x115.png\" alt=\"Optimize BigQuery with data sources in Google Cloud VMware Engine 4\" width=\"600\" height=\"229\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/3_bq_vmware.max-2800x2800-1-300x115.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/3_bq_vmware.max-2800x2800-1-1024x391.png 1024w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/3_bq_vmware.max-2800x2800-1-768x293.png 768w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/3_bq_vmware.max-2800x2800-1-1536x587.png 1536w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/3_bq_vmware.max-2800x2800-1-2048x783.png 2048w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/3_bq_vmware.max-2800x2800-1-18x7.png 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">In this case, the data workloads run on the Google Cloud VMware Engine instance in the project. The Google Cloud VMware Engine environment is accessed through VPC peering between the project-level VPC and Google Cloud VMware Engine. 
A <\/span><span style=\"font-weight: 400;\">Google <a href=\"https:\/\/gcloudvn.com\/en\/compute-engine\/\">Compute Engine<\/a><\/span> <span style=\"font-weight: 400;\">instance in the project-level VPC exposes a reverse proxy to the Google Cloud VMware Engine database for services that cannot directly access the Google Cloud VMware Engine instance. Cloud Data Fusion instances are enabled with private IP access and peering to the primary VPC, and data can be accessed through the reverse proxy instance. The procedure for setting up internal IP access and VPC peering on Cloud Data Fusion is described in <\/span><a href=\"https:\/\/cloud.google.com\/data-fusion\/docs\/how-to\/create-private-ip\" rel=\"nofollow noopener\" target=\"_blank\"><span style=\"font-weight: 400;\">this document<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Once this peering process is complete, use the Java Database Connectivity (JDBC) connector in Cloud Data Fusion to access the database for replication or for advanced ETL operations. To enable change data capture, we need to configure the database in Google Cloud VMware Engine to track and capture changes. 
This entire setup and scaling process is described in the documentation for <\/span><span style=\"font-weight: 400;\">MySQL (<a href=\"https:\/\/cloud.google.com\/data-fusion\/docs\/tutorials\/replicating-data\/mysql-to-bigquery\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/cloud.google.com\/data-fusion\/docs\/tutorials\/replicating-data\/mysql-to-bigquery<\/a>)<\/span> <span style=\"font-weight: 400;\">and <\/span><span style=\"font-weight: 400;\">SQL Server<\/span><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Google_Cloud_Datastream\"><\/span><strong>Google Cloud Datastream:<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Datastream<\/span> <span style=\"font-weight: 400;\">is a serverless change data capture and replication service. You can access streaming, low-latency data from Oracle and MySQL databases on Google Cloud VMware Engine. This approach provides more flexibility in managing data flow pipelines. This solution is generally available, but only in certain regions.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">This option also requires a reverse proxy to be configured on a Google Compute Engine instance. This reverse proxy is used to access data sources in Google Cloud VMware Engine. 
This option is described in<\/span> <a href=\"https:\/\/cloud.google.com\/datastream\/docs\/private-connectivity#set-up-reverse-proxy\" rel=\"nofollow noopener\" target=\"_blank\"><span style=\"font-weight: 400;\">this document<\/span><\/a>.<\/p>\n<figure id=\"attachment_8376\" aria-describedby=\"caption-attachment-8376\" style=\"width: 600px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-8376\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/4_bq_vmware.max-2800x2800-1-300x115.png\" alt=\"Datastream replicates and captures change data without the need for a server\" width=\"600\" height=\"230\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/4_bq_vmware.max-2800x2800-1-300x115.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/4_bq_vmware.max-2800x2800-1-1024x393.png 1024w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/4_bq_vmware.max-2800x2800-1-768x295.png 768w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/4_bq_vmware.max-2800x2800-1-1536x590.png 1536w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/4_bq_vmware.max-2800x2800-1-2048x786.png 2048w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2021\/12\/4_bq_vmware.max-2800x2800-1-18x7.png 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><figcaption id=\"caption-attachment-8376\" class=\"wp-caption-text\">Datastream replicates and captures change data without the need for a server<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">You can find the complete setup steps for Datastream in <\/span><a href=\"https:\/\/cloud.google.com\/datastream\/docs\/how-to\" rel=\"nofollow noopener\" target=\"_blank\"><span style=\"font-weight: 400;\">the how-to guides<\/span><\/a><span style=\"font-weight: 400;\">. 
To enable replication, we need a stream configured on Datastream, which reads data from the database and transfers it to Cloud Storage. The reverse proxy used for data access needs to be exposed on the customer&#039;s VPC. To pass data to BigQuery, use the pre-configured <\/span><a href=\"https:\/\/cloud.google.com\/dataflow\/docs\/guides\/templates\/provided-streaming#datastream-to-bigquery\" rel=\"nofollow noopener\" target=\"_blank\"><span style=\"font-weight: 400;\">Datastream to BigQuery template<\/span><\/a><span style=\"font-weight: 400;\"> in Dataflow.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Lam_the_nao_de_bat_dau\"><\/span><strong>How to get started?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">The first step is to <\/span><a href=\"https:\/\/cloud.google.com\/vmware-engine\/docs\/workloads\/howto-migrate-vms-using-hcx\" rel=\"nofollow noopener\" target=\"_blank\"><span style=\"font-weight: 400;\">migrate workloads to Google Cloud VMware Engine<\/span><\/a><span style=\"font-weight: 400;\">. <\/span><span style=\"font-weight: 400;\">Your cloud administrator\/architect will typically drive this. If not already identified during the migration phase, the next step is to identify the databases residing on virtual machines hosted in Google Cloud VMware Engine and recreate the existing reports using BigQuery. In most organizations, many individuals will be involved in this process. For example, a data architect may be the best source of information about data sources, a solutions architect will have insights into cost and performance, and further infrastructure input will be needed for the network interfaces. 
The steps below outline one possible approach to enable this movement.<\/span><\/p>\n<ol style=\"text-align: justify;\">\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Identify the data sets used for reports that reside on virtual machines migrated to Google Cloud VMware Engine.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Choose the right pipeline (Datastream vs Data Fusion) based on database type and pipeline requirements (price\/performance trade-off and ease of use).<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Based on the data pipeline, select the appropriate region. There are no data export fees within the same region.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Set up a reverse proxy for the Google Cloud VMware Engine dataset.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Set up the replication service with performance parameters based on the required replication performance.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Enable analytics and visualization based on business requirements across the data set.<\/span><\/li>\n<\/ol>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Ket_luan\"><\/span><strong>Conclusion:<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">The Google Cloud VMware Engine service is a quick and easy way to enable data visualization and analytics using your existing data sets. Now you can leverage your existing infrastructure operations posture on VMware to enable cloud analytics without spending time refactoring your database. 
These approaches enable you to take advantage of the performance benefits of dedicated hardware on Google Cloud, connected to the most advanced data capabilities in the world.<\/span><\/p>\n<p style=\"text-align: right;\"><strong>Source: <a href=\"https:\/\/gcloudvn.com\/en\/\">gcloudvn.com<\/a><\/strong><\/p>","protected":false},"excerpt":{"rendered":"<p>For customers who have migrated data sources from on-premise systems to Google Cloud VMware Engine (https:\/\/cloud.google.com\/vmware-engine) and want to use the analytics and data services provided by Google Cloud. One&hellip;<\/p>","protected":false},"author":2,"featured_media":8377,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1,135],"tags":[],"class_list":["post-8372","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-kienthuc","category-google-cloud-platform","entry","has-media"],"_links":{"self":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/8372","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/comments?post=8372"}],"version-history":[{"count":0,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/8372\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media\/8377"}],"wp:attachment":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media?parent=8372"}],"wp:term":[{"taxonomy":"category","embeddable":true,
"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/categories?post=8372"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/tags?post=8372"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}