Notes taken by Horeb S.

This week dives deep into the principles and applications of data warehousing with a hands-on focus on Google BigQuery. The main topics and learning objectives include:

Key resources

🔗 Slides

🔗 Some basic SQL Big Query SQL

🔗 My Github Repo

Contents

OLAP vs OLTP

OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) are two different types of database systems. OLTP is designed for handling real-time business operations and transactions, focusing on quick updates and maintaining data consistency. OLAP, on the other hand, is optimized for complex queries and data analysis, typically used in data warehouses to process large volumes of historical data for business intelligence. Here is a table that summaries the main differences between both concepts :

OLTP OLAP
Purpose Control and run essential business operations in real time Plan, solve problems, support decisions, discover hidden insights
Data updates Short, fast updates initiated by user Data periodically refreshed with scheduled, long-running batch jobs
Database design Normalized databases for efficiency Denormalized databases for analysis
Space requirements Generally small if historical data is archived Generally large due to aggregating large datasets
Backup and recovery Regular backups required to ensure business continuity and meet legal and governance requirements Lost data can be reloaded from OLTP database as needed in lieu of regular backups
Productivity Increases productivity of end users Increases productivity of business managers, data analysts, and executives
Data view Lists day-to-day business transactions Multi-dimensional view of enterprise data
User examples Customer-facing personnel, clerks, online shoppers Knowledge workers such as data analysts, business analysts, and executives

What is data warehouse

A data warehouse (DW) is a centralized repository that stores large amounts of structured data from various sources for the purpose of query and analysis. It serves as the core component of business intelligence infrastructure, enabling organizations to consolidate data from multiple operational systems into a single, consistent data store. Unlike traditional OLTP databases, data warehouses are specifically designed for OLAP operations, allowing for complex analytical queries across historical data to support data-driven decision making.

image.png