Fact Table
A fact table is a database table in a star or snowflake schema that stores measures (quantitative data) and foreign keys to dimensions, representing events or transactions in a business process.
Fact tables are the central structure in dimensional modeling, recording the occurrence of business events. A sales fact table has one row per transaction: product sold, customer, store, date, revenue, quantity. A web analytics fact table has one row per page view: user, page, time, duration. Each row is an atomic, measurable occurrence.
Fact tables contain two types of columns: measures (numeric values to aggregate) and foreign keys (references to dimensions). Measures answer "how much" questions (revenue, units sold, page duration). Foreign keys enable "drill-down" analysis (by product, region, time). Good fact table design separates what was measured (facts) from how to analyze (dimensions).
Fact tables are typically very large, containing billions of rows for years of operational data. Indexing is critical: foreign keys need indexes for joins, and date indexes enable time-based partitioning. Columnar storage excels for fact tables, reading only relevant measures and dimensions from disk. Modern data warehouses use fact tables as the foundation for analytics, with aggregated versions (hourly, daily) for performance.
Key Characteristics
- ▶Store measures (numeric, aggregatable values) for business events
- ▶Include foreign keys to dimension tables
- ▶One row per atomic business event or transaction
- ▶Often very large (billions to trillions of rows)
- ▶Indexed on foreign keys for efficient joins
- ▶Partitioned by date or other time dimension
Why It Matters
- ▶Enable aggregation of business metrics across any dimension
- ▶Provide foundation for analytical queries and reporting
- ▶Support efficient analysis of large event streams
- ▶Enable tracking of business process metrics over time
- ▶Facilitate drill-down analysis from summary to detail
- ▶Support real-time and historical analytics on event data
Example
` -- Fact table: Sales transactions CREATE TABLE fct_sales ( -- Surrogate key (optional but common) sale_id BIGINT PRIMARY KEY, -- Foreign keys to dimensions date_id INT NOT NULL, product_id INT NOT NULL, customer_id INT NOT NULL, store_id INT NOT NULL, -- Measures (numeric, aggregatable values) revenue DECIMAL(10, 2), quantity INT, discount_amount DECIMAL(10, 2), tax_amount DECIMAL(10, 2), -- Optional: Slowly changing dimension keys for history product_key INT, -- Current product reference product_history_key INT -- Historical product version -- Indexes for joins FOREIGN KEY (date_id) REFERENCES dim_date, FOREIGN KEY (product_id) REFERENCES dim_product, FOREIGN KEY (customer_id) REFERENCES dim_customer, FOREIGN KEY (store_id) REFERENCES dim_store ); CREATE INDEX idx_fct_sales_date ON fct_sales(date_id); CREATE INDEX idx_fct_sales_product ON fct_sales(product_id); CREATE INDEX idx_fct_sales_customer ON fct_sales(customer_id); CREATE INDEX idx_fct_sales_store ON fct_sales(store_id); -- Analytical queries aggregate fact measures by dimension attributes -- Query 1: Total revenue by product SELECT p.category, p.product_name, SUM(f.revenue) as total_revenue, SUM(f.quantity) as units_sold FROM fct_sales f JOIN dim_product p ON f.product_id = p.product_id GROUP BY p.category, p.product_name ORDER BY total_revenue DESC; -- Query 2: Revenue trend over time SELECT d.year, d.month, SUM(f.revenue) as monthly_revenue FROM fct_sales f JOIN dim_date d ON f.date_id = d.date_id GROUP BY d.year, d.month ORDER BY d.year, d.month; -- Query 3: Multi-dimensional analysis SELECT d.year, c.region, p.category, SUM(f.revenue) as region_category_revenue, COUNT(*) as transaction_count FROM fct_sales f JOIN dim_date d ON f.date_id = d.date_id JOIN dim_customer c ON f.customer_id = c.customer_id JOIN dim_product p ON f.product_id = p.product_id WHERE d.year = 2024 GROUP BY d.year, c.region, p.category; `
Coginiti Perspective
Coginiti models fact tables as SMDL entities with measures that specify aggregation types such as sum, count_distinct, avg, or custom expressions. The MEASURE() function in Semantic SQL handles correct aggregation at any grain, so queries against fact tables produce accurate results even when analysts group by different dimension combinations. Incremental publication strategies (append, merge, merge_conditionally) allow CoginitiScript pipelines to update fact tables efficiently without full reloads, which matters given the volume of rows fact tables typically accumulate.
Related Concepts
More in OLTP, OLAP & Workload Types
Analytical Workload
An analytical workload is a class of database queries that examine, aggregate, and analyze large volumes of historical data to extract business insights and support decision-making.
Conformed Dimensions
Conformed dimensions are dimensions that mean the same thing—and carry the same keys, attributes, and values—across multiple fact tables or data marts, letting you combine and compare metrics from different business processes with confidence.
Dimension Table
A dimension table is a database table in a star or snowflake schema that stores descriptive attributes used to filter, group, and drill-down in analytical queries.
Fan and Chasm Traps
Fan traps and chasm traps are two classic join hazards in dimensional and relational models where the structure of the joins causes measures to be over-counted or under-counted, producing numbers that look valid but are wrong.
Grain
Grain is the level of detail represented by a single row in a table—the precise answer to the question "what does one record mean?"—and it is the foundational decision in dimensional modeling because every dimension and measure in a table must be consistent with it.
HTAP (Hybrid Transactional/Analytical Processing)
HTAP is a database architecture that supports both transactional workloads and analytical workloads on the same data system, enabling real-time analytics without separate data warehouses.
See Semantic Intelligence in Action
Coginiti operationalizes business meaning across your entire data estate.