
Reduce Time to Value by Rapidly Onboarding SAP Data on GCP (full recording)

You have options if you want to reduce the time to value for SAP deployments on GCP. Google Cloud solutions such as BigQuery, Cloud SQL, AutoML, and Spanner, among others, can be onboarded to accelerate insights on SAP data. Mike Eacrett, a senior product manager at Google Cloud, and Chai Pydimukkala, Google Cloud Head of Product Management, recently joined C2C for a technical session for SAP architects, data integrators, and data engineers covering key options for SAP deployments on GCP. The session provided an overview of available solutions, technical requirements, and customer use cases. Watch the video below to see the live presentations, and use the following timestamps to navigate to the segments most relevant to you:

(1:50) Mike Eacrett Introduction and Reference Architecture
(3:20) BigQuery Connector for SAP: SAP Data Integration
(4:25) BigQuery Connector for SAP: Highlights & Value
(7:30) BigQuery Connector for SAP: Solution Overview
(10:30) BigQuery Connector for SAP: How Does It Work?
(14:35) Data Type Mapping Overview
(17:40) Supported Software Requirements
(19:55) Chai Pydimukkala Introduction and Cloud Data Fusion
(23:15) Cloud Data Fusion Key Capabilities and Personas
(31:25) SAP Table Batch Source
(34:50) SAP SLT Replication Plugin
(36:45) SAP ODP Plugin
(38:45) SAP OData Plugin

Categories: Storage and Data Transfer, SAP

Let's Not Talk About Repatriation: Rich Hoyer of SADA on FinOps and Workload Balancing

In early 2021, Rich Hoyer, Director of Customer FinOps for SADA, published an opinion piece in VentureBeat that refuted the findings of an earlier published article about the cost of hosting workloads in the cloud. In his rebuttal, Hoyer called the article (which was written by representatives of Andreessen Horowitz Capital Management) “dead wrong” with regard to its findings about cloud repatriation and costs.

Hoyer’s expertise and his views on doing business in the cloud make him an ideal participant for a C2C Global panel discussion taking place on January 20, at which he will appear alongside representatives of Twitter and Etsy to talk about whether or not enterprises should consider moving workloads off the cloud and into data centers. Hoyer predicts the panel conversation will lean away from the concept of repatriation and more toward the concept of balancing workloads.

“I don’t think repatriation is the right term,” Hoyer says. “To me, it’s much more a decision of what workloads should be where, so I would phrase it as rebalancing—as more optimally balancing. Repatriation implies that there’s this lifecycle. That’s just not the way it works. How many startups have workloads that are architected from the ground up and not cloud native? You don’t see that. If you’re cloud native, you start using the stuff as cloud native.”

The panel discussion will focus on hybrid workloads, he says, with a specific eye toward what works from a cost standpoint for each individual customer. “We want cloud consumers to be successful, and if they have stuff in the cloud that ought not to be there, they’re going to be unhappy with those workloads,” Hoyer says. “That’s not good for us, it’s not good for Google, it’s not good for anybody. We want only things in the cloud that are going to be successful because customers know they’re getting value from it, because that’s what’s going to cause them to expand and grow in the cloud.”

From his FinOps viewpoint, Hoyer says he will be advocating for the process and disciplines of making decisions around managing spend in the public cloud. “The whole process of trying to get control of this begins with the idea of visibility into what the spend is, and that means you have to have an understanding of how to report against it, how to apply the tooling to do things like anomaly alerting,” he says. “I expect the discussion to be less about whether there should be repatriation, and the more constructive discussion to be about the ways to think about how to keep the balance right.”

The overall goal of the panel is to present a process for analyzing workloads. And according to Hoyer, that’s not a one-time process—it’s iterative. “I’ll encourage anyone who has hybrid scenarios—some in the data center and some in the cloud—to be doing iterated looks at that to see what workloads should still be in the cloud,” Hoyer says. “There should be an iteration: Here’s what’s in the cloud today, here’s what’s in the data center today, and in broad terms, are these the right workloads? And then also, when stuff is in the cloud, are we operating it efficiently? And that’s a constant process, because you’ll have workloads that grow from the size they were in the cloud.
And we’ll hear that same evaluation from the technology standpoint—are we using the best products in the cloud, and are there things in the data center that ought not to be there?”

Be sure to join C2C Global, SADA, Twitter, and Etsy for this important conversation and arm your business with the tools needed to make intelligent and informed decisions about running your workloads and scaling your business. Click the link below to register.

Categories: Google Cloud Strategy, Hybrid and Multicloud, Cloud Migration, Storage and Data Transfer, Interview

Ingest, Store, Query, and More: What BigQuery Can Do for You

If you’re a web developer, a software engineer, or anyone else working with small batches of data, you know how to use a spreadsheet. The problem arises when you have massive amounts of data that need to be stored, ingested, analyzed, and visualized rapidly. More often than not, the product you need to solve this problem is Google Cloud’s serverless, fully managed service, BigQuery. BigQuery deals with megabytes, terabytes, and petabytes of information, helping you store, ingest, stream, and analyze those massive troves of information in seconds.

Small stores can use Excel to classify, analyze, and visualize their data. But what if your organization is a busy multinational corporation with branches across cities and regions? You need a magical warehouse database you can use to store, sort, and analyze streams of incoming information. That’s where BigQuery comes in.

What is BigQuery?

BigQuery is Google Cloud’s enterprise cloud data warehouse, built to process read-only data. It’s fully managed, which means you don’t need to set up or install anything, nor do you need a database administrator. All you need to do is import and analyze your data.

To communicate with BigQuery, you need to know SQL (Structured Query Language), the standard language for relational databases, used for tasks such as updating, editing, or retrieving data from a database.

BigQuery in Action

BigQuery executes three primary actions:

Ingestion: uploading data by ingesting it from Cloud Storage or by streaming it live from Google Cloud services such as Bigtable, Cloud SQL, and Google Drive, enabling real-time insights
Storage: storing data in structured tables, using SQL for easy query and data analysis
Querying: answering questions about data in BigQuery with SQL

Getting BigQuery up and running is fairly simple. Just follow these steps:

Find BigQuery on the left-side menu of the Google Cloud Platform Console, under “Resources.”
Choose one or more of these three options:
- Load your own data into BigQuery to analyze (and convert that data batch into a common format such as CSV, Parquet, ORC, Avro, or JSON).
- Use any of the free public datasets hosted by Google Cloud (e.g., the Coronavirus Data in the European Union Open Data Portal).
- Import your data from an external data source.

BigQuery ML

You can also use BigQuery for your machine learning models. You can train and execute your models directly on BigQuery data, without needing to move that data around. To get started using BigQuery ML, see Getting started with BigQuery ML using the Cloud Console.

Where can you find BigQuery (and BigQuery ML)? Both BigQuery and BigQuery ML are accessible via:

The Google Cloud Console
The BigQuery command-line tool
The BigQuery REST API
An external tool such as a Jupyter notebook or a business intelligence platform
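To make that concrete, here is a minimal sketch of running a query from an external tool such as a Jupyter notebook, assuming the google-cloud-bigquery Python client library is installed and application default credentials are configured. It queries one of the free public datasets mentioned above:

```python
from google.cloud import bigquery

# Authenticates via application default credentials.
client = bigquery.Client()

# Query a free public dataset: the most common names in Texas, 1910-2013.
query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""

# client.query() submits the job; result() blocks until it completes.
for row in client.query(query).result():
    print(f"{row.name}: {row.total}")
```

The same SQL runs unchanged in the Cloud Console or the bq command-line tool; the client library is just one of the access paths listed above.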
BigQuery Data Visualization

When the time comes to visualize your data, BigQuery can integrate with several business intelligence tools, such as Looker, Tableau, and Data Studio, to help you turn complex data into compelling stories.

BigQuery in Practice

Depending on your company’s needs, you will want to take advantage of different capabilities of BigQuery for different purposes. Use cases for BigQuery include the following:

Real-time fraud detection: BigQuery ingests and analyzes massive amounts of data in real time to identify or prevent unauthorized financial activity.
Real-time analytics: BigQuery is immensely useful for businesses or organizations that need to analyze their latest business data as they compile it.
Log analysis: BigQuery reviews, interprets, and understands computer-generated log files.
Complex data pipeline processing: BigQuery manages and interprets the steps of one or multiple complex data pipelines generated by source systems or applications.

Best BigQuery Features

BigQuery has a lot to offer. Here are some of the tools BigQuery’s platform includes:

Real-time analytics that analyzes data on the spot.
Logical data warehouses wherein you can process data from external sources, either in BigQuery itself or in Google Drive.
Data transfer services that let you import data into BigQuery from external sources, including Google Marketing Platform, Google Ads, YouTube, partner SaaS applications, Teradata, and Amazon S3.
Storage-compute separation, an option that allows you to choose the storage and processing solution that’s best for your project.
Automatic backup and easy restore, so you don’t lose your information. BigQuery also keeps a seven-day history of changes.

BigQuery Pros

It’s fast: BigQuery processes billions of data rows in seconds.
It’s easy to set up and simple to use; all you need to do is load your data. BigQuery also integrates easily with other data management solutions like Data Studio and Google Analytics.
BigQuery handles amounts of data that few other warehouses can.
BigQuery gives you real-time feedback that can thwart potential business problems.
With BigQuery, you can avoid the data silo complications that arise when individual teams within your company maintain their own data marts.

BigQuery Cons

It falls short when used for constantly changing information.
It only works on Google Cloud.
It can become costly as data storage and query costs accumulate. PCMag suggests you go for flat-rate pricing to reduce costs.
You need to know SQL and its particular technical habits to use BigQuery.
BigQuery ML can only be used in the US, Asia, and Europe.

When should you use BigQuery?

BigQuery is best used for ad-hoc analysis of massive amounts of data, for queries that run longer than about five seconds, and for data you want analyzed in real time. The more complex the query, the more you’ll benefit from BigQuery. At the same time, don’t expect to use the tool as a regular relational database or for CRUD operations, i.e., to Create, Read, Update, and Delete data.

BigQuery Costs

Multiple costs come with using BigQuery. Here is a breakdown of what you will pay for when you use it:

Storage (based on how much data you store): There are two storage rates: active storage ($0.020 per GB) and long-term storage ($0.010 per GB). With both, the first 10 GB are free each month.
Processing queries: Query costs are either on-demand (i.e., by the amount of data processed per query) or flat-rate.

BigQuery also charges for certain other operations, such as streaming inserts and use of the BigQuery Storage API. Loading and exporting data is free. For details, see Data ingestion pricing. This Coupler Guide to BigQuery Cost is also extremely helpful.

TL;DR: With BigQuery, you can assign read or write permissions to specific users, groups, or projects, collaborating across teams, and it is thoroughly secure, since it automatically encrypts data both at rest and in transit.

If you’re a data scientist or web developer running ML or data mining operations, BigQuery may be your best solution for those spiky, massive workloads. It is also useful for anyone handling bloated data batches, within reason. Be wary of those costs.

Have you ever used BigQuery? How do you use it? Reach out and tell us about your experience!
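One last practical note on those costs: since on-demand pricing is billed by bytes scanned, you can dry-run a query to see its footprint before running it for real. Here is a minimal sketch, assuming the same google-cloud-bigquery Python client and reusing the public dataset from the earlier example:

```python
from google.cloud import bigquery

client = bigquery.Client()

# dry_run=True asks BigQuery to plan the query without executing it;
# disabling the cache ensures the estimate reflects a full scan.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

job = client.query(
    """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    """,
    job_config=job_config,
)

# No bytes are billed for a dry run; this is the would-be scan size.
print(f"This query would process {job.total_bytes_processed / 1e9:.3f} GB.")
```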

Categories: Data Analytics, Storage and Data Transfer, Google Cloud Product Updates

Get to Know the Google Cloud Data Engineer Certification

Personal development and professional development are among the hottest topics within our community. At C2C, we’re passionate about helping Google Cloud users grow in their careers. This article is part of a larger collection of Google Cloud certification path resources.

The Google Cloud Professional Data Engineer certification covers highly technical knowledge concerning how to build scalable, reliable data pipelines and applications. Anyone who intends to take this exam should also be comfortable selecting, monitoring, and troubleshooting machine learning models.

In 2021, the Professional Data Engineer rose to number one on the list of top-paying cloud certifications, surpassing the Professional Cloud Architect, which had held that spot for the two years prior. According to the Dice 2020 Tech Job Report, data engineering is one of the fastest-growing IT professions, and even with an influx of people chasing the role, the supply can’t meet the demand. More than ever, businesses are driven to take advantage of advanced analytics; data engineers design and operationalize the infrastructure to make that possible.

Before you sit at a test facility for the real deal, we highly recommend that you practice with the example questions (provided by Google Cloud) with Google Cloud’s documentation handy. All the questions are scenario-based and incredibly nuanced, so lean in to honing your reading comprehension skills and verifying your options using the documentation.

We’ve linked out to plenty of external resources for when you decide to commit and study, but let’s start just below with questions like:

What experience should I have before taking this exam?
What roles and job titles does the Google Cloud Professional Data Engineer certification best prepare me for?
Which topics do I need to brush up on before taking the exam?
Where can I find resources and study guides for the Google Cloud Professional Data Engineer certification?
Where can I connect with fellow community members to get my questions answered?

View image as a full-scale PDF here.

Looking for information about a different Google Cloud certification? Check out the directory in the Google Cloud Certifications Overview.

Extra Credit

Google Cloud’s certification page: Professional Data Engineer
Example questions
Exam guide
Coursera: Preparing for Google Cloud Certification: Cloud Data Engineer Professional Certification
Pluralsight: Preparing for the Google Cloud Professional Data Engineer Exam
AwesomeGCP Associate Cloud Engineer Playlist
Global Knowledge IT Skills and Salary Report 2020
Global Knowledge 2021 Top-Paying IT Certifications

Have more questions? We’re sure you do! Career growth is a hot topic within our community, and we have quite a few members who meet regularly in our C2C Connect: Certifications chat. Sign up below to stay in the loop.

https://community.c2cglobal.com/events/c2c-connect-google-cloud-certifications-72

Categories: Data Analytics, Careers in Cloud, Storage and Data Transfer, Google Cloud Certifications, Databases, Infographic

C2C Talks: Using Google Cloud’s BigQuery to Move from a 48-Hour Cycle Time to a Mere 7 Minutes

Author’s Note: C2C Talks are an opportunity for C2C members to engage through shared experiences and lessons learned. Often there is a short presentation followed by an open discussion to determine best practices and key takeaways.

Juan Carlos Escalante (JC) is a pioneering member of C2C and a vital part of the CTO office at Ipsos. Escalante details how he and his team handled data migration powered by Google Cloud and shares his current challenges, which may not be unlike those you’re facing. As a global leader in market research, Ipsos has offices in 90 countries and conducts research in more than 150 countries. So, to say its data architecture is challenging barely covers the complexity JC manages each day.

“Our data architecture and data pipeline challenges get complex very quickly, especially for workloads dealing with multiple data sources, and what I describe as hyper-fragmented data delivery requirements,” he said in a recent C2C Talks: Data Migration and Modernization on December 10, 2020.

So, how do they manage a seamless data flow? And how does JC’s data infrastructure landscape look? Hear below.

What was the primary challenge?

Even though the design JC described is popular and widely used in the space, it isn’t without its own set of challenges, and siloed data infrastructure rises to the top.

“The resilience of siloed data infrastructure platforms that we see scattered across the company translates to longer cycle times and more friction to pivot and react to changing business requirements,” he said.

Hear JC explain the full challenge below. What resonates with you? Share it with us!

How did you use Google Cloud as a solution?

By leveraging Google Cloud, JC and his team have unlocked new opportunities to simplify how different groups come into a data infrastructure platform and serve or solve their specific needs.

“We all have different products and services that we have available within Google Cloud Platform,” he said. “Very quickly, we’ve been able to test and deploy proofs of concept that have moved rapidly towards production.”

Some examples of the benefits JC and his team have found by using the Google Cloud Platform product BigQuery include:

Reduced cycle or processing time from 48 hours to seven minutes
Data harmony across teams

Hear JC explain how BigQuery helped reach these successful milestones.

Since it’s going so well, what’s next?

The goal is to think bigger and determine how JC and his team can transform end-to-end data platform architecture.

“The next step we want to take in our data architecture journey is to bring design patterns that are common and are used widely in software development into our data engineering practices,” he said. On that list is version control for data pipelines—hear JC explain why.

Also, JC is working with his team to plan for the future of data architecture and analytics on a global scale, which he says will be a multi-cloud environment. Hear him explain why below.

Questions from the C2C Community

1. Are the business analysts running their daily job through the BigQuery interface? Or do they use a different application that’s pulling from BigQuery?

For JC’s organization, some teams got up to speed very quickly, while others need a little more coaching, so they’ll be putting together some custom development with Tableau. Hear JC’s full answer below, including how they use Google Sheets to manage the data exported from BigQuery.
2. I have the feeling that my databases are way simpler than yours, because my database is not dealing with those things. It’s just a handful of tables, so it’s easier for us to monitor. But how do you monitor triggers?

This question led to a more in-depth discussion, so JC offered to set up a time to talk further separately, which is just one of the beautiful benefits of being a part of the C2C community. Check out what JC said to attack the question with some clarity below. We’ll update you with their progress as it becomes available!

3. What data visualization tools do JC and his team use?

“Basically, the answer is we’re using everything under the sun. We do have some Spotfire footprint, we have Tableau, we have Looker, and we have Qlik Sense. We also have custom visualization developments,” he said.

“My team is gravitating more towards Tableau, but we have to be mindful that whatever data architecture design we come up with, it has to be decoupled, flexible, and it has to be data engine and data visualization agnostic, because we do get requests to support the next visualization,” he warned.

Hear about how JC manages the overlap with Looker and Tableau and why he likes them both.

Extra Credit

JC and his team used the two articles from Thoughtworks, linked below, to inform their decision-making and as a guide for modernizing their data architecture. He recommends checking them out.

How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh by Zhamak Dehghani, Thoughtworks, May 2019
Data Mesh Principles and Logical Architecture by Zhamak Dehghani, Thoughtworks, December 2020

We want to hear from you! There is so much more to discuss, so connect with us and share your Google Cloud story. You might get featured in the next installment! Get in touch with Content Manager Sabina Bhasin at sabina.bhasin@c2cglobal.com if you’re interested.

Rather chat with your peers? Join our C2C Connect chat rooms! Reach out to Director of Community Danny Pancratz at danny.pancratz@c2cglobal.com.

Categories: Data Analytics, C2C Community Spotlight, Hybrid and Multicloud, Cloud Migration, Storage and Data Transfer, Databases, Session Recording

5 Takeaways from C2C's Conversation with Andi Gutmans: Reducing Complexities in Your Data Ecosystem

This article was originally published on November 20, 2020.

Hailed as one of the “Founding Fathers” of the internet for co-creating PHP, Andi Gutmans is just getting started. To discuss his new role at Google and the future of data, Gutmans joins C2C for the sixth installment of our thought leadership series, where we don’t hold back on either the fun or the challenging questions. A four-citizenship-holding engineering powerhouse, Gutmans brings a global perspective to both tech and coffee creation.

“I love making espresso and improving my latte art,” he mused. “I always say, if tech doesn’t work out for me, that’s where you’re going to find me.”

But when he isn’t daydreaming about turning it all in to own a coffee shop and become a barista, he leads the operational database group as the GM and VP of engineering and databases at Google.

“Our goal is building a strategy and vision that is very closely aligned with what our customers need,” he said. “Then, my organization works with customers to define what that road map looks like, deliver that, and then operate the most scalable, reliable, and secure service in the cloud.”

It’s an enormous responsibility, but Gutmans and his team met the challenge by breaking it into three steps: migration, modernization, and transformation. They accomplished this even though they’ve never met in person—Gutmans started working at Google during the COVID-19 pandemic.

Driven to support customers through their data journeys as they move to the cloud and transform their businesses, he digs into the how, the why, and more during the conversation in the video above, but these are the five points you should know:

Lift, Shift, Transform

The pandemic has changed the way everyone is doing business. For some, the change comes with accelerating the shift to the cloud, but Gutmans said most customers are taking a three-step journey into the cloud.

“We’re seeing customers embrace this journey into the cloud,” he said. “They’re taking a three-step journey into the cloud: migration, which is trying to lift and shift as quickly as possible, getting out of their data center; then modernizing their workloads, taking more advantage of some of the cloud capabilities; and then completely transforming their business.”

Migrating to the cloud allows customers to spend less time managing infrastructure and more time innovating on business problems. To keep the journey frictionless for customers, he and his team are working on Cloud SQL, a managed service for MySQL, PostgreSQL, and SQL Server. They also handle any regulatory requirements customers have in various geographies.

“By handling the heavy lifting for customers, they have more bandwidth for innovation,” he said. “So the focus for us is making sure we’re building the most reliable service, the most secure service, and the most scalable service.”

Gutmans described how Autotrader lifted and shifted into Google’s Cloud SQL service and was able to increase deployment velocity by 140% year over year. “So, there is an instant gratification aspect of moving into the cloud.”

Another benefit of the cloud is auto-remediation, backups, and restoration. Still, the challenge is determining what stays at the edge and what goes into the cloud, and, of course, security. Gutmans said he wants to work with customers to better understand their pain points and thought processes.

Modernizing sometimes requires moving customers off proprietary vendors and onto open-source-based databases, but the Gutmans team has a plan for that.
By investing in partners, they can provide customers with assessments of their databases, more flexibility, and a cost reduction.

Finally, when it comes to transformation, the pandemic has redefined the scope. A virtual-focused world is reshaping how customers do business, so that’s where a lot of Google’s cloud-native database investments have come in, such as Cloud Spanner, BigQuery, and Firestore.

“It’s really exciting to see our customers make that journey,” he said. “Those kinds of transformative examples where we innovate, making scalability seamless, making systems that are reliable, making them globally accessible, we get to help customers, you know, build for [their] future. And seeing those events be completely uneventful from an operational perspective is probably the most gratifying piece of innovating.”

Gutmans adds that transformation isn’t limited to customers with legacy data systems. Cloud-native companies may also need to re-architect, and Google can support those transformations, too.

AI Is Maturing

Gartner stated that by 2022, 75% of all databases would be in the cloud, and that isn’t just because of the pandemic accelerating transformation. AI is maturing, and it is allowing companies to make intelligent, data-driven decisions.

“It has always been an exciting space, but I think today is more exciting than ever,” Gutmans said. “In every industry right now, we’re seeing leaders emerge that have taken a digital-first approach, so it’s caused the rest of the industries to rethink their businesses.”

Data Is Only Trustworthy if It’s Secure

Data is quickly becoming the most valuable asset organizations have. It can help you make better business decisions and better understand your customers and what’s happening in your supply chain. Analyzing your data and leveraging historical data can also improve forecasting and help you better target specific audiences.

But with all the tools improving data accessibility and portability, security is always a huge concern, and Gutmans’ team is dedicated to keeping security at the fore.

“We put a lot of emphasis on security—we make sure our customers’ data is always encrypted by default,” he said.

Not only is the data encrypted, but there are tools available to decrypt it with ease.

“We want to make sure that not only can the data come up, [but] we also want to make it easy for customers to take the data wherever they need it,” Gutmans said.

Even with the support Gutmans’ team works to provide through these tools, the customer is central, and they have all the control.

“We do everything we can to ensure that only customers can govern their data in the best possible way; we also make sure to give customers tight control,” he said.

As security measures increase, new data applications are emerging, including fraud detection and the convergence of operational and analytical systems. This intersection creates powerful marketing applications, leading to improved customer experiences.

“There are a lot of ways you can use data to create new capabilities in your business that can help drive opportunity and reduce risk,” Gutmans said.

Leverage APIs Without Adding Complexity

There are two kinds of APIs, as Gutmans sees it: administration APIs and APIs for building applications.

On the provisioning side, customers can leverage the DevOps culture and automate their test, staging, and production environments. On the application side, Gutmans suggests following the DevOps trend of automating infrastructure as code. He points to resources available here and here to provide background on how to do this.
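Infrastructure as code usually means a declarative tool such as Terraform or Deployment Manager; as a minimal stand-in sketch of the idea, here is what scripted, repeatable provisioning of a Cloud SQL instance can look like in Python via the Cloud SQL Admin API. The instance name and machine tier are hypothetical, and this illustrates the general pattern rather than anything recommended in the interview:

```python
import google.auth
from googleapiclient import discovery

# Application default credentials supply both identity and project.
credentials, project_id = google.auth.default()
service = discovery.build("sqladmin", "v1beta4", credentials=credentials)

# Declaring the desired instance as data, then applying it from a script
# or CI pipeline, is the core idea behind infrastructure as code.
instance_body = {
    "name": "staging-db",              # hypothetical instance name
    "databaseVersion": "POSTGRES_13",
    "region": "us-central1",
    "settings": {"tier": "db-custom-2-7680"},
}

operation = service.instances().insert(
    project=project_id, body=instance_body
).execute()
print(f"Provisioning started: {operation['name']}")
```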
But when it comes to applications, his answer is more concise: “If the API doesn’t reduce complexity, then don’t use them.”

“I don’t subscribe to the philosophy where, like, everything has to be an API, and if not...you’re making a mistake,” he added.

He recommends focusing on where you can gain the most significant agility benefit to help your business get the job done.

Final Words of Wisdom

Gutmans paused, returned to the importance of teamwork and collaboration, and offered this piece of advice:

“Don’t treat people the way you want to be treated; treat people the way they want to be treated.”

He also added that the journey is different for each customer. Just remember to “get your data strategy right.”

Categories: AI and Machine Learning, Data Analytics, Google Cloud Strategy, Careers in Cloud, API Management, Cloud Migration, Identity and Security, Storage and Data Transfer