Notes taken by Horeb S.
This week dives deep into the principles and applications of data warehousing with a hands-on focus on Google BigQuery. The main topics and learning objectives include:
Key resources
🔗 Slides
🔗 Some basic SQL Big Query SQL
🔗 My Github Repo
Contents
OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) are two different types of database systems. OLTP is designed for handling real-time business operations and transactions, focusing on quick updates and maintaining data consistency. OLAP, on the other hand, is optimized for complex queries and data analysis, typically used in data warehouses to process large volumes of historical data for business intelligence. Here is a table that summaries the main differences between both concepts :
OLTP | OLAP | |
---|---|---|
Purpose | Control and run essential business operations in real time | Plan, solve problems, support decisions, discover hidden insights |
Data updates | Short, fast updates initiated by user | Data periodically refreshed with scheduled, long-running batch jobs |
Database design | Normalized databases for efficiency | Denormalized databases for analysis |
Space requirements | Generally small if historical data is archived | Generally large due to aggregating large datasets |
Backup and recovery | Regular backups required to ensure business continuity and meet legal and governance requirements | Lost data can be reloaded from OLTP database as needed in lieu of regular backups |
Productivity | Increases productivity of end users | Increases productivity of business managers, data analysts, and executives |
Data view | Lists day-to-day business transactions | Multi-dimensional view of enterprise data |
User examples | Customer-facing personnel, clerks, online shoppers | Knowledge workers such as data analysts, business analysts, and executives |
A data warehouse (DW) is a centralized repository that stores large amounts of structured data from various sources for the purpose of query and analysis. It serves as the core component of business intelligence infrastructure, enabling organizations to consolidate data from multiple operational systems into a single, consistent data store. Unlike traditional OLTP databases, data warehouses are specifically designed for OLAP operations, allowing for complex analytical queries across historical data to support data-driven decision making.