Logging, Monitoring, and Observability in Google Cloud

Learn to monitor, troubleshoot, and improve the performance of your infrastructure and applications

GCP200LMO v1.1
3 days (21 hours)

Course overview

This three-day instructor-led course teaches participants techniques for monitoring, troubleshooting, and improving infrastructure and application performance in Google Cloud. Guided by the principles of Site Reliability Engineering (SRE), and using a combination of presentations, demos, hands-on labs, and real-world case studies, attendees gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.

Learning outcomes

This course teaches participants the following skills:

  • Plan and implement a well-architected logging and monitoring infrastructure
  • Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • Create effective monitoring dashboards and alerts
  • Monitor, troubleshoot, and improve Google Cloud infrastructure
  • Analyze and export Google Cloud audit logs
  • Find production code defects, identify bottlenecks, and improve performance
  • Optimize monitoring costs

Prerequisites

To get the most out of this course, participants should have:

  • Completion of Google Cloud Platform Fundamentals: Core Infrastructure, or equivalent experience
  • Basic scripting or coding familiarity
  • Proficiency with command-line tools and Linux operating system environments

Target audience

This course is intended for the following participants:

  • Cloud architects, administrators, and SysOps personnel
  • Cloud developers and DevOps personnel

Course outline

Module 1: Introduction to Google Cloud Monitoring Tools

  • Understand the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Service Monitoring
  • Understand the purpose and capabilities of Google Cloud components focused on application performance management: Debugger, Trace, and Profiler

Module 2: Avoiding Customer Pain

  • Build a monitoring foundation on the four golden signals: latency, traffic, errors, and saturation
  • Measure customer pain with SLIs
  • Define critical performance measures
  • Create and use SLOs and SLAs
  • Achieve harmony between developers and operations with error budgets

Module 3: Monitoring Critical Systems

  • Choose best-practice monitoring project architectures
  • Differentiate Cloud IAM roles for monitoring
  • Use the default dashboards appropriately
  • Build custom dashboards to show resource consumption and application load
  • Define uptime checks to track aliveness and latency

Module 4: Alerting Policies

  • Develop alerting strategies
  • Define alerting policies
  • Add notification channels
  • Identify types of alerts and common uses for each
  • Construct and alert on resource groups
  • Manage alerting policies programmatically

Module 5: Advanced Logging and Analysis

  • Identify and choose among resource tagging approaches
  • Define log sinks (inclusion filters) and exclusion filters
  • Create metrics based on logs
  • Define custom metrics
  • Link application errors to Logging using Error Reporting
  • Export logs to BigQuery

Module 6: Working with Audit Logs

  • Audit Logs
  • Data Access Logging
  • Audit Logs Entry Format
  • Best Practices

Module 7: Configuring Google Cloud Services for Observability

  • Integrate logging and monitoring agents into Compute Engine VMs and images
  • Enable and utilize Kubernetes Monitoring
  • Extend and clarify Kubernetes monitoring with Prometheus
  • Expose custom metrics through code, and with the help of OpenCensus

Module 8: Monitoring Google Cloud VPC

  • Collect and analyze VPC Flow logs and Firewall Rules logs
  • Enable and monitor Packet Mirroring
  • Explain the capabilities of Network Intelligence Center
  • Use Admin Activity audit logs to track changes to the configuration or metadata of resources
  • Use Data Access audit logs to track accesses or changes to user-provided resource data
  • Use System Event audit logs to track GCP administrative actions

Module 9: Managing Incidents

  • Define incident management roles and communication channels
  • Mitigate incident impact
  • Troubleshoot root causes
  • Resolve incidents
  • Document incidents in a post-mortem process

Module 10: Investigating Application Performance Issues

  • Debug production code to correct code defects
  • Trace latency through layers of service interaction to eliminate performance bottlenecks
  • Profile and identify resource-intensive functions in an application

Module 11: Optimizing the Costs of Monitoring

  • Analyze resource utilization costs for monitoring-related components within Google Cloud
  • Implement best practices for controlling the cost of monitoring within Google Cloud

€2100 ex. VAT

Suggested courses

GCP300ANT
Architecting Hybrid Cloud Infrastructure with Anthos
This two-day instructor-led course prepares students to modernize, manage, and observe their applications using Kubernetes, whether the application is deployed on-premises or on Google Cloud Platform (GCP). Through presentations and hands-on labs, participants explore and deploy using Kubernetes Engine (GKE), GKE Connect, Istio service mesh, and Anthos Config Management capabilities that enable operators to work with modern applications even when split among multiple clusters hosted by multiple providers or on-premises.
GCP300A
Architecting with Google Cloud Platform: Design and Process
This two-day instructor-led course teaches students to build highly reliable and efficient solutions on Google Cloud Platform, using proven design patterns and the principles of Google Site Reliability Engineering (SRE). It is a continuation of the Architecting with Google Cloud Platform: Infrastructure course and assumes hands-on experience with the technologies covered in that course. Through a combination of presentations, demos, and hands-on labs, participants learn to design highly reliable and secure GCP deployments, and to operate them in a highly available and cost-effective manner.
GCP200AGCE
Architecting with Google Compute Engine
This three-day instructor-led class introduces participants to the comprehensive and flexible infrastructure and platform services provided by Google Cloud, with a focus on Compute Engine. Through a combination of presentations, demos, and hands-on labs, participants explore and deploy solution elements, including infrastructure components such as networks, systems, and application services. This course also covers deploying practical solutions including securely interconnecting networks, customer-supplied encryption keys, security and access management, quotas and billing, and resource monitoring.
GCP200AGKE
Architecting with Google Kubernetes Engine
Learn how to deploy and manage containerized applications on Google Kubernetes Engine (GKE). Learn how to use other tools on Google Cloud that interact with GKE deployments. This course features a combination of lectures, demos, and hands-on labs to help you explore and deploy solution elements—including infrastructure components like pods, containers, deployments, and services—along with networks and application services. You'll also learn how to deploy practical solutions, including security and access management, resource management, and resource monitoring.
GCP200DEV
Developing Applications with Google Cloud Platform
In this course, application developers learn how to design, develop, and deploy applications that seamlessly integrate components from the Google Cloud ecosystem. Through a combination of presentations, demos, and hands-on labs, participants learn how to use GCP services and pre-trained machine learning APIs to build secure, scalable, and intelligent cloud-native applications.
GCP100A
Google Cloud Platform Fundamentals: Core Infrastructure
This one-day instructor-led class provides an overview of Google Cloud Platform products and services. Through a combination of presentations, demos, and hands-on labs, participants learn the value of Google Cloud Platform and how to incorporate cloud-based solutions into business strategies.
