{"id":8797,"date":"2022-04-04T15:43:36","date_gmt":"2022-04-04T08:43:36","guid":{"rendered":"https:\/\/gcloudvn.com\/?p=8797"},"modified":"2023-03-23T14:53:43","modified_gmt":"2023-03-23T07:53:43","slug":"cach-tyson-foods-hinh-dung-lai-nen-tang-du-lieu-cua-ho","status":"publish","type":"post","link":"https:\/\/gcloudvn.com\/en\/kienthuc\/cach-tyson-foods-hinh-dung-lai-nen-tang-du-lieu-cua-ho\/","title":{"rendered":"Ingestion as a Service: How Tyson Foods reimagined their Data Platform"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">As data environments become more complex, companies are turning to streaming analytics solutions that analyze data as it\u2019s ingested and deliver immediate, high-value insights into what is happening now. These insights enable decision makers to act in real time to take advantage of opportunities or respond to issues as they occur.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">While understanding what is happening now has great business value, forward-thinking companies are taking things a step further, using real-time analytics integrated with artificial intelligence (AI) and business intelligence (BI) to answer the question, \u201cwhat might happen in the future?\u201d Arkansas-based Tyson Foods has embraced AI\/BI analytics to enable predictive insights that unlock new opportunities and drive future growth.<\/span><\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_80 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span 
class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/cach-tyson-foods-hinh-dung-lai-nen-tang-du-lieu-cua-ho\/#Tao_ra_mot_cap_song_sinh_ky_thuat_so_cho_toan_cong_ty_tri_tue_duoc_ket_noi\" >Creating a digital twin for connected intelligence company wide<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/cach-tyson-foods-hinh-dung-lai-nen-tang-du-lieu-cua-ho\/#Giai_quyet_van_de_nhap_de_co_thoi_gian_thong_tin_chi_tiet_nhanh_hon\" >Solving the ingestion problem for faster time to insights<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/cach-tyson-foods-hinh-dung-lai-nen-tang-du-lieu-cua-ho\/#Cac_dich_vu_cua_Google_Cloud_giup_tao_nen_tang_nhap_DICE_Tao_nen_tang_nhap_DICE_voi_cac_dich_vu_cua_Google_Cloud\" >Creating DICE ingestion platform with Google Cloud services<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a 
class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/gcloudvn.com\/en\/kienthuc\/cach-tyson-foods-hinh-dung-lai-nen-tang-du-lieu-cua-ho\/#Trien_khai_DICE_cho_hang_nghin_cong_viec_nhap_moi_ngay\" >Rolling DICE for thousands of ingestion jobs each day<\/a><\/li><\/ul><\/nav><\/div>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Tao_ra_mot_cap_song_sinh_ky_thuat_so_cho_toan_cong_ty_tri_tue_duoc_ket_noi\"><\/span><strong>Creating a digital twin for connected intelligence company wide<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Before using AI\/BI, Tyson\u2019s analytics capabilities consisted of traditional BI solutions focused on KPIs and simplifying data so that humans could understand it. Tyson wanted to leverage its data to uncover ways to improve current processes and grow its business. But with BI alone, Tyson struggled to use data to run the simulations and scenarios essential to make educated decisions. To keep growing, it had to embrace the complexity of its data, building ways to analyze it and use it to inform decision making.\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Tyson\u2019s on-premises analytics solutions limited its ability to be aggressive and make intelligent, timely, prescriptive decisions. The solution was to create a digital twin to scale optimizations within business processes, moving from local optimizations to system-wide connected optimizations. 
Doing so meant shifting entirely to cloud computing, with an initial focus on building the ingestion component of the digital twin platform.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Investing in a digital twin enabled Tyson to accelerate new capabilities such as supply chain simulation \u201cwhat-if\u201d scenarios, prescriptive price elasticity recommendations, and improved customer intimacy.<\/span><\/p>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Giai_quyet_van_de_nhap_de_co_thoi_gian_thong_tin_chi_tiet_nhanh_hon\"><\/span><strong>Solving the ingestion problem for faster time to insights<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Before moving to <a href=\"https:\/\/gcloudvn.com\/en\/google-cloud-platform\/\">Google Cloud<\/a>, Tyson\u2019s analytics projects often began with uncertainty about how to get the data. This problem is very common and can stretch project timelines by weeks or even months, because each project has to write and support its own one-off ingestion process up front. It also prevents IT teams from delivering analytics solutions fast enough for the business to take full advantage of them.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">To solve this problem, the team created the Data Ingestion Compute Engine (DICE). DICE is a Google Cloud-hosted, open-source, cloud-native ingestion platform developed to provide configuration-based, no-ops, code-free ingestion from disparate enterprise data systems, both internal and external. 
It is centered on three high-level goals:<\/span><\/p>\n<ol style=\"text-align: justify;\">\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Accelerate the speed of delivery of IT analytics solutions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Enable growth of IT capabilities to produce meaningful insight<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Reduce long-term total cost of ownership for ingestion solutions<\/span><\/li>\n<\/ol>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Cac_dich_vu_cua_Google_Cloud_giup_tao_nen_tang_nhap_DICE_Tao_nen_tang_nhap_DICE_voi_cac_dich_vu_cua_Google_Cloud\"><\/span><strong>Creating DICE ingestion platform with Google Cloud services<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Teams use DICE to set up secure data ingestion jobs in minutes without having to manage complex connections or write, deploy, and support their own code. DICE supports unbounded scale, highly parallel processing, DevSecOps practices, open-source tooling, and the implementation of a Lambda data architecture.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">A DICE job is the logical unit of work in the DICE platform, consisting of immutable and mutable configurations persisted as JSON documents stored in Firestore. <\/span><span style=\"font-weight: 400;\">The job exists as an instruction set for the DICE data engine, which is Apache Beam running on Dataflow. It tells the engine which data to pull, how to pull it, how often to pull it, how to process it, when it changes, and where to direct it. 
<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">DICE\u2019s two primary layers are the metadata engine and the data engine. The metadata engine is responsible for the creation and management of DICE job configuration and orchestration. It is made up of many microservices, including the job configuration creation API, the job build configuration helper API, and the job execution scheduler API, that interact with multiple Google Cloud services.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">The data engine is responsible for the physical ingestion of data, the change detection processing of that data, and the delivery of that data to specified targets. The data engine is Java code that uses the Apache Beam unified programming model and runs in Dataflow. It comprises streaming jobs and Dataflow Flex Template batch jobs. Logically, the data engine is segmented across three layers: the inbound processing layer, the DICE file system layer, and the target processing layer, which takes the data from the DICE file system and moves it to targets. 
<\/span><\/p>\n<h3 style=\"text-align: justify;\"><strong><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-8800\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-5.png\" alt=\"Ingestion as a Service: How Tyson Foods reimagined their Data Platform\" width=\"600\" height=\"509\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-5.png 512w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-5-300x254.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-5-14x12.png 14w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/strong><\/h3>\n<h2 style=\"text-align: justify;\"><span class=\"ez-toc-section\" id=\"Trien_khai_DICE_cho_hang_nghin_cong_viec_nhap_moi_ngay\"><\/span><strong>Rolling DICE for thousands of ingestion jobs each day<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">DICE was first deployed to a production environment in November 2019, and just two years later, it has more than 3,000 data ingestion jobs from more than a hundred disparate data systems, both internal and external to Tyson Foods. Most of these jobs run multiple times a day. 
On a daily basis, the DICE environment runs more than 25,000 Dataflow jobs and ingests an average of 3.25 terabytes of new data.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-8799\" src=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-6.png\" alt=\"How Tyson Foods Reimagined Their Data Platform 2\" width=\"600\" height=\"224\" srcset=\"https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-6.png 512w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-6-300x112.png 300w, https:\/\/gcloudvn.com\/wp-content\/uploads\/2022\/04\/unnamed-6-18x7.png 18w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">DICE supports ingestion from many different types of technologies, including <\/span><a href=\"https:\/\/gcloudvn.com\/en\/bigquery\/\"><span style=\"font-weight: 400;\">BigQuery<\/span><\/a><span style=\"font-weight: 400;\">, SQL Server, SAP HANA, <\/span><span style=\"font-weight: 400;\">Postgres<\/span><span style=\"font-weight: 400;\">, Oracle, <\/span><span style=\"font-weight: 400;\">MySQL<\/span><span style=\"font-weight: 400;\">, Db2, various types of file systems, and FTP servers. As targets for ingestion jobs, DICE supports multiple JDBC targets, multiple file system targets, BigQuery, and queue-based store-and-forward technologies.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">The platform continues to see linear growth of DICE jobs, all while keeping platform costs relatively flat. 
With increasing demand for the platform, Tyson\u2019s IT team is constantly enhancing DICE to support new sources and targets.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">This intelligent platform keeps adding new value and makes it simple for Tyson to take advantage of its data. Such innovation is a necessity in the fast-changing world of digital business, in which companies must transform a high volume of complex data into actionable insight.<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>As data environments become more complex, companies are turning to streaming analytics solutions to analyze data as it is ingested and deliver insights&hellip;<\/p>","protected":false},"author":2,"featured_media":11544,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-8797","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-kienthuc","entry","has-media"],"_links":{"self":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/8797","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/comments?post=8797"}],"version-history":[{"count":0,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/posts\/8797\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media\/11544"}],"
wp:attachment":[{"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/media?parent=8797"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/categories?post=8797"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gcloudvn.com\/en\/wp-json\/wp\/v2\/tags?post=8797"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}