C2C’s first event for developers took place on April 26th, 2023 in Sunnyvale, CA. The event focused on data analytics and how organizations can optimize their data. Below are definitions of some data buzzwords, an overview of Dataplex (a product demonstrated at the event), and a summary of the key topics discussed.
Data warehouse: A system used for reporting and data analysis. A data warehouse is a large store of data accumulated from a range of sources that helps businesses with decision-making.
Data lake: A centralized infrastructure designed to store and process large amounts of data. A data lake can store data in its original form and process data of any variety. Data lakes are scalable platforms that allow organizations to ingest data from any source at multiple speeds.
Data lakehouse: A modern data platform that combines a data warehouse and a data lake.
BigQuery: A serverless data warehouse that works across clouds and scales with your data. BigQuery lets users pick the right feature set for their workload demands and can match those needs in real time. It can also analyze data across multiple clouds and securely exchange data sets internally or across businesses, which makes it a platform for scalable analytics.
BigLake: A storage engine that unifies data warehouses and data lakes, giving access to the data in both through BigQuery.
Dataplex is a lake administration and data governance tool. It enables organizations to discover, manage, and evaluate their data across data lakes and data warehouses. Dataplex also offers a variety of features that organizations can choose from to manage data more easily. For example, the tag management feature helps ensure that specific users have access to the right data by applying policy templates and tags to different sets of data. Dataplex also has automated data quality management features. For example, if a report quotes incorrect numbers, the data can be corrected with automated data tools rather than manually.
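To make the idea of tag-based access concrete, here is a minimal sketch in plain Python. It does not use the Dataplex API; the `Dataset` class, the `POLICY` table, the role names, and the `can_read` function are all hypothetical names chosen for illustration. It only shows the general pattern of attaching tags to data and checking a policy against them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """A hypothetical dataset record carrying governance tags."""
    name: str
    tags: frozenset = frozenset()


# Hypothetical policy table: for each tag, the roles allowed to read
# data that carries it. These tags and roles are illustrative only.
POLICY = {
    "pii": {"compliance"},
    "finance": {"finance_analyst", "compliance"},
}


def can_read(role: str, dataset: Dataset) -> bool:
    """Allow access only if every tag on the dataset permits the role.

    A tag that is missing from POLICY denies everyone. A dataset with
    no tags at all is readable by any role in this sketch.
    """
    return all(role in POLICY.get(tag, set()) for tag in dataset.tags)


ledger = Dataset("quarterly_ledger", frozenset({"finance"}))
customers = Dataset("customers", frozenset({"finance", "pii"}))

print(can_read("finance_analyst", ledger))
print(can_read("finance_analyst", customers))
```

In this sketch, adding a stricter tag to a dataset narrows who can read it without touching any per-user settings, which is the appeal of managing access through tags and policy templates.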
Data and Real-Time Analytics
A major point raised at the developer event was that data is rooted in an event-driven architecture. For instance, customers who work in finance become highly interested in real-time data during specific periods. This interest is event-based, as it usually arises when the industry reaches a quarter close. Moving data around can be a difficult task; however, certain cloud products, such as Dataplex, can help address this. The main concerns surrounding organizing data are access control and governance. Customers want to know that steps have been taken to ensure that unauthorized users do not gain access to private data. Visibility and transparency are also core tenets when discussing access to data and its governance tools.