
Database tuning in HANA

Database tuning in SAP HANA is a multifaceted discipline focused on maximizing the performance, efficiency, and stability of the in-memory database. Unlike in traditional disk-based databases, the primary performance drivers in HANA are memory utilization, CPU parallelization, and optimized data access patterns. Disk I/O is critical for persistence and recovery, but not for core transactional or analytical processing.

Database Tuning in SAP HANA

Database tuning in SAP HANA involves optimizing various layers, from hardware and OS to SQL queries and data models.

I. Hardware and Operating System Tuning

While not directly "database" tuning, ensuring the underlying infrastructure is optimized is foundational for HANA's performance. A query for cross-checking some of these host facts from inside HANA follows the list below.

  1. Memory:

    • Adequate Sizing: The most crucial aspect. HANA needs enough RAM to hold its active data. Undersizing leads to excessive data unloading/reloading or even out-of-memory (OOM) errors.
    • NUMA Alignment: Ensure HANA processes and data are aligned with Non-Uniform Memory Access (NUMA) nodes to minimize cross-socket memory access latency.
    • Huge Pages (Linux): Configure Linux Huge Pages to improve memory management performance and reduce TLB (Translation Lookaside Buffer) miss rates.
    • Swap Space: Keep swap space minimal, as swapping (paging to disk) is a performance killer for an in-memory database. Typically, set vm.swappiness=1 (or even 0 if fully confident in RAM sizing) and vm.overcommit_memory=2.
  2. CPU:

    • High Clock Speed: HANA benefits significantly from high CPU clock speeds.
    • Core Count: Sufficient cores are needed for parallel processing, but ensure logical CPUs (hyper-threading) are utilized efficiently.
    • CPU Throttling: Ensure OS power management settings do not throttle CPU frequency. Set to "Performance" mode.
  3. Disk I/O (for Persistence Layer):

    • Log Volumes: Require extremely high-performance, low-latency storage (preferably NVMe or high-end SSDs) due to continuous synchronous writes.
    • Data Volumes: Require good performance SSDs for savepoints, startup, and recovery.
    • Separate Volumes: Keep data, log, and backup volumes on separate, dedicated file systems/LUNs to avoid I/O contention.
    • Mount Options: Use appropriate mount options (e.g., noatime) to reduce unnecessary metadata updates.
    • File System Choice: XFS or EXT4 (Linux) are commonly used.
  4. Network:

    • High Bandwidth/Low Latency: Crucial for scale-out systems (inter-node communication), system replication, and client-database communication. Use 10GbE or faster.
    • Jumbo Frames: Consider enabling jumbo frames for large data transfers if supported by the network infrastructure.
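
Example (hedged): several of the host facts above can be cross-checked from inside HANA via the monitoring view M_HOST_INFORMATION, which exposes key/value pairs per host. The key names used below are typical examples and can vary by revision, so check the view content on your own system first.

SQL
-- Read selected host facts that HANA exposes as key/value pairs.
-- Key names ('os_name', 'cpu_threads', 'mem_phys', 'mem_swap') are examples
-- and may differ by revision.
SELECT HOST, "KEY", "VALUE"
  FROM M_HOST_INFORMATION
 WHERE "KEY" IN ('os_name', 'cpu_threads', 'mem_phys', 'mem_swap');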

II. SAP HANA Database Configuration Tuning

These are parameters within HANA's INI files (e.g., global.ini, indexserver.ini, statisticsserver.ini). A configuration sketch in SQL follows the list below.

  1. Memory Management:

    • global.ini -> [memorymanager] -> global_allocation_limit:
      • Purpose: Sets the maximum memory HANA can allocate from the OS.
      • Tuning: Typically 85-90% of physical RAM for a dedicated HANA server. Adjust carefully to balance performance and OS stability. Too low leads to OOMs, too high can lead to OS paging.
    • indexserver.ini -> [memorymanager] -> table_load_threshold:
      • Purpose: Controls when tables are pre-loaded into memory at startup.
      • Tuning: Adjust for critical tables to be loaded quickly, or for less critical ones to load on demand.
  2. Workload Management (HANA 2.0 SPS01+):

    • Workload Classes: Define and assign workload classes to users/applications/statements to prioritize and limit resource consumption (CPU, memory, threads, statement duration).
    • Admission Control: Configure thresholds (e.g., queue_cpu_threshold, reject_memory_threshold) to queue or reject new requests during peak load, preventing system overload.
  3. Persistence (Data & Log):

    • global.ini -> [persistence] -> log_mode:
      • Purpose: Controls transaction log behavior.
      • Tuning: normal for production (requires log backups). overwrite only for non-production.
    • global.ini -> [persistence] -> savepoint_interval_s:
      • Purpose: Frequency of automatic savepoints.
      • Tuning: Default (300s/5min) is usually good. Too frequent increases I/O, too infrequent increases recovery time.
    • indexserver.ini -> [persistence] -> logbuffer_size:
      • Purpose: Size of the in-memory log buffer.
      • Tuning: Increase for very high transaction throughput (e.g., to 64MB or 128MB from default 32MB).
  4. Tracing & Diagnostics:

    • global.ini -> [trace] -> tracefile_maxsize / tracefile_maxbytes_disk:
      • Purpose: Controls size and retention of trace files.
      • Tuning: Prevent trace files from consuming excessive disk space. Set appropriate limits. Avoid overly verbose tracing in production unless actively troubleshooting.
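
Example (hedged sketch): INI parameters can be changed from SQL with ALTER SYSTEM ALTER CONFIGURATION, and workload classes are created with plain DDL. The values and names below (a 1 TB host, the ANALYTICS_LIMITED class, the REPORTING_USER user) are illustrative assumptions, not recommendations; size them against your own system and the relevant SAP notes, and verify the workload class property names against your revision's SQL reference.

SQL
-- Cap HANA's memory at roughly 90% of a hypothetical 1 TB host (value in MB).
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('memorymanager', 'global_allocation_limit') = '943718' WITH RECONFIGURE;

-- Keep the default savepoint interval of 300 seconds explicit.
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('persistence', 'savepoint_interval_s') = '300' WITH RECONFIGURE;

-- Workload class limiting threads and statement memory (GB) for reporting sessions.
CREATE WORKLOAD CLASS "ANALYTICS_LIMITED"
  SET 'STATEMENT THREAD LIMIT' = '20', 'STATEMENT MEMORY LIMIT' = '50';

-- Map a (hypothetical) database user to that class.
CREATE WORKLOAD MAPPING "ANALYTICS_LIMITED_MAPPING"
  WORKLOAD CLASS "ANALYTICS_LIMITED"
  SET 'USER NAME' = 'REPORTING_USER';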

III. Database Object and Data Model Tuning

Optimizing how data is stored and organized within HANA. Example statements for partitioning, delta merge, and compression follow the list below.

  1. Data Modeling:

    • Columnar Design: Design tables predominantly as column store. Use row store only for very small, highly transactional tables (e.g., configuration tables).
    • Denormalization (Controlled): Judicious denormalization can reduce joins for analytical queries, but avoid excessive redundancy.
    • Data Types: Use the smallest appropriate data types (e.g., TINYINT instead of BIGINT if the range allows) to save memory.
    • Primary Keys: Always define primary keys; HANA uses them internally for efficient access.
  2. Table Compression:

    • Automatic Compression: HANA's column store automatically applies various compression algorithms (dictionary encoding, run-length encoding, etc.).
    • Delta Merge: Ensure delta merges are running efficiently to move new/updated data from the uncompressed Delta Store to the compressed Main Store. Monitor M_DELTA_MERGE_STATISTICS.
    • OPTIMIZE COMPRESSION: Manually trigger for specific tables if needed, especially after large data loads or for tables with suboptimal compression.
  3. Partitioning:

    • Purpose: Distributes large tables across multiple hosts (scale-out) or segments within a single host to improve parallelism and memory management.
    • Strategies: Range, Hash, Round-robin, Multi-level (e.g., Hash-Range).
    • Tuning: Choose partition key(s) that align with query filters to enable "partition pruning" (HANA only scans relevant partitions). Avoid too many small partitions (overhead) or too few very large ones (lack of parallelism).
  4. Data Aging/Tiering:

    • Purpose: Move less frequently accessed "cold" data from hot in-memory storage to colder, cheaper storage (e.g., disk via Native Storage Extension, or SAP HANA Dynamic Tiering).
    • Benefits: Frees up valuable main memory, reduces backup times, potentially improves hot data query performance.
    • Tuning: Define aging policies based on data access patterns (e.g., by date, status).
  5. Indexes (External):

    • Use Sparingly: Secondary B-tree indexes are rarely needed on column-store tables and should only be created after detailed analysis (SQL Trace, Explain Plan) shows a specific bottleneck that can't be resolved otherwise.
    • Unique Constraints: Primary keys and unique constraints automatically create internal indexes for uniqueness checks, so no additional secondary index is needed for them.
    • Full-Text & Spatial Indexes: Create only when their specific functionality is required, and be aware of their memory footprint and update mechanisms (synchronous vs. asynchronous).
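
Example (hedged sketch): the statements below illustrate range partitioning, a forced delta merge, and a compression re-optimization. The table "SALES"."FACT_SALES", the CALENDAR_YEAR column, and the year boundaries are hypothetical, and repartitioning a large table is expensive, so treat this as a syntax sketch rather than something to run as-is.

SQL
-- Range-partition a fact table on an integer year column so that filters on
-- CALENDAR_YEAR can prune partitions.
ALTER TABLE "SALES"."FACT_SALES"
  PARTITION BY RANGE (CALENDAR_YEAR)
  (PARTITION 2023 <= VALUES < 2024,
   PARTITION 2024 <= VALUES < 2025,
   PARTITION OTHERS);

-- Force a delta merge when the delta store of a table has grown too large.
MERGE DELTA OF "SALES"."FACT_SALES";

-- Re-evaluate compression after a large load; FORCE re-optimizes even when
-- HANA's heuristics would not trigger it on their own.
UPDATE "SALES"."FACT_SALES" WITH PARAMETERS ('OPTIMIZE_COMPRESSION' = 'FORCE');

-- Review recent merge activity for the table.
SELECT START_TIME, TYPE, MOTIVATION, SUCCESS, LAST_ERROR
  FROM M_DELTA_MERGE_STATISTICS
 WHERE TABLE_NAME = 'FACT_SALES'
 ORDER BY START_TIME DESC;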

IV. SQL Query and Application Tuning

Optimizing the actual workload executed on HANA. A short sketch of these techniques follows the list below.

  1. SQL Query Optimization:

    • Analyze Expensive Statements: Use HANA Cockpit's "Workload Analysis" or M_EXPENSIVE_STATEMENTS view to identify slow queries.
    • Explain Plan/Plan Visualizer: Understand the execution plan of problematic queries. Look for full table scans, inefficient joins, or high processing times in specific operators.
    • Code Pushdown: Maximize operations that can be processed directly within HANA's SQL/CE (Calculation Engine) rather than at the application layer.
    • Avoid Implicit Type Conversions: Ensure data types match in join conditions and WHERE clauses.
    • Use Prepared Statements: For frequently executed queries.
    • Minimize Data Transfer: Only select necessary columns and rows.
    • Proper Join Cardinality: For Calculation Views, ensure join cardinalities are correctly set to enable pruning.
    • Result Cache: Consider using the static or dynamic result cache for frequently executed, unchanging queries.
  2. Stored Procedures and Calculation Views:

    • SQLScript: Write logic in SQLScript to leverage HANA's in-memory, parallel processing capabilities.
    • Table Variables vs. Temporary Tables: Prefer table variables for better optimization.
    • Scalar UDFs: Avoid excessively complex scalar User Defined Functions as they can limit pushdown.
    • Avoid Cursors: Use set-based operations instead of row-by-row processing.
    • Calculation View Optimization: Optimize joins, projections, aggregations, and enable pruning.
  3. Application Layer:

    • Minimize Round Trips: Bundle SQL statements into single calls.
    • Batch Processing: Use array inserts/updates for bulk data operations.
    • Error Handling: Implement robust error handling to prevent issues from cascading.
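
Example (hedged sketch): the first query lists the most expensive recorded statements (the expensive statements trace must be enabled for the view to contain data); the procedure illustrates the set-based, cursor-free SQLScript style described above. The schema "DEMO", the table "SALES_ORDERS", and its columns are hypothetical.

SQL
-- Top recorded statements by duration (requires the expensive statements trace).
SELECT TOP 10 STATEMENT_STRING, DURATION_MICROSEC, CPU_TIME, MEMORY_SIZE
  FROM M_EXPENSIVE_STATEMENTS
 ORDER BY DURATION_MICROSEC DESC;

-- Set-based SQLScript: table variables and one aggregate, no cursor loop.
CREATE OR REPLACE PROCEDURE "DEMO"."AGGREGATE_SALES" (
  OUT result TABLE (REGION NVARCHAR(10), TOTAL DECIMAL(15,2)))
LANGUAGE SQLSCRIPT READS SQL DATA AS
BEGIN
  -- Table variable holding only the columns that are actually needed.
  sales = SELECT region, amount FROM "DEMO"."SALES_ORDERS";
  -- One set-based aggregation instead of row-by-row processing.
  result = SELECT region, CAST(SUM(amount) AS DECIMAL(15,2)) AS total
             FROM :sales
            GROUP BY region;
END;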

V. Monitoring Tools

Consistent monitoring is key to identifying and resolving performance issues; two example queries follow the list below.

  • SAP HANA Cockpit: Primary tool for overall system health, memory, CPU, disk, workload, alerts.
  • SQL Console / hdbsql: Execute system views (M_... views) for detailed insights.
  • SAP Solution Manager: Centralized monitoring for complex landscapes.
  • OS-level tools: top, free, df -h, iostat, netstat for deeper OS analysis.
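
Example (hedged): two typical monitoring queries against the standard M_ views; adjust the filters and TOP limits for your own system.

SQL
-- Memory breakdown per HANA service.
SELECT HOST, SERVICE_NAME, TOTAL_MEMORY_USED_SIZE, EFFECTIVE_ALLOCATION_LIMIT
  FROM M_SERVICE_MEMORY;

-- Largest column-store tables and how much of each currently sits in the delta store.
SELECT TOP 20 SCHEMA_NAME, TABLE_NAME,
       MEMORY_SIZE_IN_TOTAL, MEMORY_SIZE_IN_DELTA, RECORD_COUNT
  FROM M_CS_TABLES
 ORDER BY MEMORY_SIZE_IN_TOTAL DESC;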

30 Interview Questions and Answers (One-Liner) for Database Tuning in SAP HANA

  1. Q: What is the most critical resource for HANA performance?

    • A: Main memory (RAM).
  2. Q: What is global_allocation_limit in HANA?

    • A: The maximum amount of memory HANA can allocate from the operating system.
  3. Q: What happens if global_allocation_limit is set too low?

    • A: Out Of Memory (OOM) errors can occur.
  4. Q: Why is minimizing swap space crucial for HANA?

    • A: Swapping data to disk is very slow for an in-memory database.
  5. Q: Which type of disk storage is highly recommended for HANA log volumes?

    • A: NVMe or high-performance SSDs due to continuous synchronous writes.
  6. Q: What is the primary purpose of partitioning large tables in HANA?

    • A: To enable parallelism and efficient memory management, especially in scale-out.
  7. Q: What is "partition pruning" in HANA?

    • A: HANA only scans relevant partitions based on query filters.
  8. Q: What is "Delta Merge" and why is it important for tuning?

    • A: It moves data from Delta Store to Main Store, improving compression and read performance.
  9. Q: How does data aging contribute to HANA performance tuning?

    • A: Moves less-accessed data to cheaper storage, freeing up main memory.
  10. Q: What is the primary tool for analyzing expensive SQL statements in HANA?

    • A: SAP HANA Cockpit's Workload Analysis or M_EXPENSIVE_STATEMENTS view.
  11. Q: What is the EXPLAIN PLAN used for?

    • A: To visualize and analyze the execution plan of an SQL query (see the example after this list).
  12. Q: Why should you avoid implicit type conversions in SQL queries?

    • A: They can prevent efficient pushdown and indexing.
  13. Q: What is the benefit of defining Primary Keys on HANA tables?

    • A: HANA automatically uses them for efficient internal indexing and uniqueness checks.
  14. Q: What is "code pushdown" in HANA?

    • A: Executing logic within HANA's database engine rather than the application server.
  15. Q: Which parameter controls the frequency of automatic savepoints?

    • A: savepoint_interval_s.
  16. Q: What is a "Workload Class" in HANA?

    • A: A mechanism to prioritize and limit resource consumption for specific workloads.
  17. Q: What is "Admission Control" in HANA?

    • A: A feature to queue or reject new requests during high system load to prevent overload.
  18. Q: Should you create many secondary B-tree indexes on column-store tables?

    • A: No, only sparingly and after careful analysis.
  19. Q: What is the impact of excessive trace files on disk performance?

    • A: They can fill up disk space and impact I/O, particularly if tracefile_maxbytes_disk is not set.
  20. Q: What is the purpose of logbuffer_size?

    • A: It defines the size of the in-memory buffer for transaction logs before writing to disk.
  21. Q: How does hardware NUMA alignment affect HANA performance?

    • A: It minimizes memory access latency across CPU sockets.
  22. Q: Why are VARCHAR columns sometimes problematic for performance in HANA?

    • A: Can lead to less efficient compression and sometimes require explicit indexing for specific filters.
  23. Q: What is the OPTIMIZE COMPRESSION SQL command used for?

    • A: To manually trigger the re-optimization of table compression.
  24. Q: What are the two main types of table storage in HANA?

    • A: Row store and Column store.
  25. Q: Which type of storage is preferred for analytical data in HANA?

    • A: Column store.
  26. Q: Why is the vm.overcommit_memory=2 setting important on Linux for HANA?

    • A: Prevents the OS from over-committing memory, reducing the risk of OOMs.
  27. Q: What is the benefit of using the Result Cache for queries?

    • A: Stores query results in memory to serve subsequent identical queries faster.
  28. Q: What is the impact of a very long-running uncommitted transaction on log volumes?

    • A: It can prevent log segments from being freed, causing log volume to fill up.
  29. Q: How can M_SERVICE_MEMORY help in database tuning?

    • A: It shows memory breakdown by HANA services, helping to identify memory-hungry components.
  30. Q: What is a key performance guideline for SQLScript procedures regarding cursors?

    • A: Avoid using cursors; prefer set-based operations.
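
Example (hedged) for Q11 above: generating and reading an EXPLAIN PLAN from the SQL console. The statement name and the "DEMO"."SALES_ORDERS" query are placeholders.

SQL
-- Store the plan under a statement name...
EXPLAIN PLAN SET STATEMENT_NAME = 'sales_by_region' FOR
  SELECT region, SUM(amount)
    FROM "DEMO"."SALES_ORDERS"
   WHERE calendar_year = 2024
   GROUP BY region;

-- ...then read it back from EXPLAIN_PLAN_TABLE.
SELECT OPERATOR_NAME, OPERATOR_DETAILS, TABLE_NAME, OUTPUT_SIZE
  FROM EXPLAIN_PLAN_TABLE
 WHERE STATEMENT_NAME = 'sales_by_region'
 ORDER BY OPERATOR_ID;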

5 Scenario-Based Hard Questions and Answers for Database Tuning in SAP HANA

  1. Scenario: Your SAP HANA production system (single node, 1TB RAM, global_allocation_limit=85%) experiences intermittent, severe performance degradation, with applications becoming unresponsive for 1-2 minutes at a time. HANA Cockpit shows CPU spikes to 100% during these periods, but memory consumption remains stable and below the global_allocation_limit. M_EXPENSIVE_STATEMENTS reveals several complex analytical queries (Calculation Views) frequently running during the performance dips. OS top command during a dip shows hdbindexserver as the main CPU consumer.

    • Q: What is the most likely root cause of the performance degradation, and what specific tuning steps would you take to address it?
    • A:
      • Most Likely Root Cause: The scenario points to a CPU bottleneck caused by highly CPU-intensive analytical queries. Even with sufficient memory, complex calculations, large joins, or aggregations within Calculation Views can saturate available CPU cores, leading to overall system unresponsiveness. The intermittent nature suggests these queries are triggered periodically, possibly by users or batch jobs.
      • Tuning Steps:
        1. Workload Management (Immediate & Long-term):
          • Identify the specific problematic queries: Use M_EXPENSIVE_STATEMENTS to get the SQL of the queries and identify the users/applications running them.
          • Implement Workload Classes: Define a workload class for these analytical queries and assign them a STATEMENT_MAX_CPU_LIMIT and/or STATEMENT_MAX_THREAD_LIMIT. This will throttle their CPU consumption and prevent them from monopolizing resources.
          • Implement Admission Control: Configure queue_cpu_threshold and reject_cpu_threshold to queue or reject new requests when CPU hits critical levels, ensuring critical OLTP processes (if any) are not starved.
        2. Query Optimization (Long-term):
          • Analyze Explain Plans: Use EXPLAIN PLAN and Plan Visualizer on the problematic Calculation Views. Look for inefficient operators (e.g., full scans, Cartesian products, very large intermediate results).
          • Optimize Calculation Views:
            • Ensure proper join cardinalities are set to enable join pruning.
            • Push down filters and aggregations as early as possible in the view.
            • Review for any unneeded joins or calculated columns.
            • Consider using Result Cache for these analytical queries if their data changes infrequently.
          • Redesign if necessary: If a view is inherently too complex, consider breaking it down or redesigning its underlying logic.
        3. Hardware Sizing Review (Long-term): If the workload is consistently high and cannot be optimized further, a review of CPU sizing might be needed (e.g., adding more cores or a server with higher clock speed, considering scale-out).
  2. Scenario: You have a critical SAP S/4HANA system running on HANA. Users complain about slow data loads from external systems into a transactional table SALES_ORDERS (column store, ~1 billion rows, frequently updated) during daily batch windows. The load jobs often fail with "Transaction Rollback" errors due to "Log Volume Full" issues. You check M_VOLUMES and confirm log_volumes hit 100% during these loads. Log backups are running every 5 minutes to a reliable backup destination.

    • Q: What is the most likely reason for the log volume filling up despite frequent log backups, and what specific tuning actions would you propose?
    • A:
      • Most Likely Reason: The most probable cause is a very large, long-running uncommitted transaction during the data load process. Even if log backups are successful, HANA cannot free the log segments associated with a transaction that remains open for an extended period, which leads to log volume exhaustion. The SALES_ORDERS table being transactional and heavily updated during loads supports this. Another possibility, less likely given the scenario, is extremely high transaction throughput exceeding the log volume's I/O capacity or the log buffer size, but an uncommitted transaction is the more common cause with batch loads.
      • Tuning Actions:
        1. Identify Long-Running Transactions (Immediate):
          • Use SELECT * FROM M_TRANSACTIONS; to identify transactions that have been open for an unusually long time, especially those related to the data load. Look at STATEMENT_ID, APPLICATION_USER, LAST_ACTIVE_TIME. (A helper query for checking log segment states follows the scenarios.)
          • If a long-running DML (Data Manipulation Language) transaction (e.g., a large INSERT or UPDATE statement) is identified, investigate its source in the external system or data load job.
        2. Optimize Data Load Process (Long-term):
          • Break Down Large Transactions: Instead of one massive transaction, break the data load into smaller, more manageable transactions. Commit frequently. This allows log segments to be freed more often.
          • Array Inserts/Batch Updates: Ensure the data loading tool uses client-side array binding or batched prepared statements for inserts and updates to minimize network round trips, but still commit periodically.
          • Optimize SQL for Load: Ensure the INSERT/UPDATE statements themselves are efficient (e.g., avoiding subqueries that re-execute).
        3. Log Volume Sizing & Buffer (Review):
          • Increase logbuffer_size: If transaction throughput is genuinely very high, increasing indexserver.ini -> [persistence] -> logbuffer_size (e.g., from 32MB to 64MB or 128MB) can help by allowing more log entries to accumulate in memory before being flushed to disk.
          • Increase Log Volume Size: As a last resort, and only after optimizing the transactions, consider increasing the physical size of the log volume to provide more buffer. This only delays the problem if transactions are the root cause.
        4. Monitor Delta Merge (Related): Ensure delta merges are healthy for SALES_ORDERS to prevent the Delta Store from growing excessively, which can also contribute to I/O pressure during loads.
  3. Scenario: You observe that MEMORY_SIZE_IN_DELTA for several large column-store tables in your BW on HANA system (e.g., F-fact tables) is continuously growing, reaching 30-40% of the table's total memory. Although delta merges are configured to run automatically, M_DELTA_MERGE_STATISTICS shows many entries with STATUS = 'SUCCESS' but REASON = 'NOT_MERGED' and MERGE_TYPE = 'OPTIMIZE_COMPRESSION'. This leads to higher overall memory consumption and slower query performance on these tables.

    • Q: Why are the OPTIMIZE_COMPRESSION merges not happening, causing Delta Store growth, and what specific steps would you take to fix this?
    • A:
      • Reason for NOT_MERGED for OPTIMIZE_COMPRESSION: This typically indicates that HANA's automatic delta merge process is not triggering the compression optimization because its internal heuristics (based on factors like delta store size, compression ratio gain, and number of updates) determine that an OPTIMIZE_COMPRESSION merge is not yet beneficial or is being delayed. While STATUS = 'SUCCESS' means the check ran, REASON = 'NOT_MERGED' with that type means the condition for that specific merge type wasn't met. The key problem is the continuous growth, meaning the automatic triggers are insufficient.
      • Tuning Steps:
        1. Manually Trigger OPTIMIZE COMPRESSION:
          • Immediate Action: For the problematic tables, manually force a delta merge (MERGE DELTA OF <schema_name>.<table_name>;) and then a compression re-optimization (UPDATE <schema_name>.<table_name> WITH PARAMETERS ('OPTIMIZE_COMPRESSION' = 'FORCE');). This merges the delta store into the main store and re-evaluates main store compression.
        2. Review and Adjust Delta Merge Parameters:
          • indexserver.ini -> [columnstore] -> delta_merge_auto_interval: This defines how often the auto-merge checks run. If it's too high, consider reducing it.
          • indexserver.ini -> [columnstore] -> merge_compression_ratio_min_threshold: This is crucial. If the current compression ratio of the main store is already very good, HANA might decide that further OPTIMIZE_COMPRESSION merges won't yield significant benefits, even if the delta store is large. You might need to temporarily lower this threshold to encourage more frequent optimization merges, or set it to 0 for specific tables if consistent optimization is paramount.
          • indexserver.ini -> [columnstore] -> optimize_compression = yes: Ensure this is enabled (it should be by default).
        3. Analyze Data Insertion Patterns:
          • High Update/Insert Frequency: If data is constantly inserted/updated, the delta store might never get "stable" enough for the automatic merge to kick in. Consider whether data loading can be batched differently or if an application design change is needed.
          • Data Distribution: If the data being inserted is highly diverse (high cardinality), the compression benefits might be less than anticipated, influencing HANA's merge decision.
        4. Refresh Data Statistics (If Defined): Outdated data statistics objects can influence optimizer decisions; if statistics objects (created via CREATE STATISTICS) exist on the affected tables, refresh or recreate them.
  4. Scenario: Your SAP HANA scale-out system (4 nodes) is experiencing slow query performance, particularly for analytical reports that involve large joins between two tables (TABLE_A and TABLE_B), both of which are partitioned using HASH partitioning on PRIMARY_KEY across all four nodes. EXPLAIN PLAN for these queries shows "Column Search" operations taking a long time, and you notice SQL: "join_redistribution" operations in the plan, indicating data movement between nodes during the join.

    • Q: What is the core issue causing the slow join performance in this scale-out scenario, and what partitioning strategy change would you recommend to optimize it?
    • A:
      • Core Issue: The slow join performance is primarily due to data redistribution during the join operation. When two tables are joined in a scale-out environment, if their partitioning keys are different or if they are partitioned on the same key but distributed differently across nodes, HANA has to move data between nodes to perform the join efficiently. join_redistribution is a strong indicator of this expensive cross-node data movement. HASH partitioning on PRIMARY_KEY does not guarantee the same distribution for TABLE_A and TABLE_B relative to each other, even if PRIMARY_KEY is a join column.
      • Partitioning Strategy Change:
        • Co-partitioning (HASH-HASH Partitioning):
          • Recommendation: Change the partitioning strategy for both TABLE_A and TABLE_B to HASH partition on the common join column(s). If the join is on TABLE_A.COL1 = TABLE_B.COL2, then both tables should be hash-partitioned on COL1 and COL2 respectively, using the same number of partitions and the same hash function to ensure that matching join values reside on the same physical node. This is often achieved by defining a HASH partition key on the join column(s) for both tables.
          • Example (Conceptual):
            SQL
            -- For TABLE_A (assuming COL1 is the join key)
            ALTER TABLE <schema>.TABLE_A PARTITION BY HASH (COL1) PARTITIONS 4;
            
            -- For TABLE_B (assuming COL2 is the join key, and COL1 and COL2 map to same values)
            ALTER TABLE <schema>.TABLE_B PARTITION BY HASH (COL2) PARTITIONS 4;
            
          • Rationale: Co-partitioning (or often termed "collocation" of data) ensures that rows that need to be joined are physically located on the same HANA node. This eliminates the need for expensive data redistribution (inter-node data transfer) during the join, significantly speeding up query execution by allowing the join to be processed locally on each node in parallel.
          • Caution: Repartitioning large tables is a resource-intensive operation and requires downtime or careful online repartitioning. Plan this carefully. Ensure the chosen join column for partitioning is frequently used in join conditions and has good cardinality. (A query to verify the resulting partition specification follows the scenarios.)
  5. Scenario: A newly implemented reporting application is experiencing very slow response times when querying a HANA Calculation View. The view aggregates data from a large fact table (FACT_TABLE) and joins it with several dimension tables. You've analyzed the EXPLAIN PLAN and see that filters applied in the reporting application (e.g., WHERE CALENDAR_YEAR = '2024' AND REGION = 'EMEA') are not being pushed down to the FACT_TABLE early in the execution plan, leading to full scans of the FACT_TABLE before filtering. The FACT_TABLE uses RANGE partitioning on CALENDAR_YEAR.

    • Q: What are the two primary reasons why the filters might not be pushing down in this Calculation View scenario, and how would you address them?
    • A:
      • Primary Reasons the Filters Are Not Pushed Down:
        1. Incorrect Join Cardinality in Calculation View: If the join cardinality between the FACT_TABLE and its dimension tables within the Calculation View is incorrectly set (e.g., N:M when it should be N:1 or 1:1), the optimizer might be prevented from pushing down filters across the join boundaries, as it cannot guarantee the uniqueness of values after the join.
        2. Calculated Columns or Complex Logic in Projection: If the CALENDAR_YEAR or REGION columns are not direct projections from the underlying table but are derived using complex calculations, functions, or subqueries within the Calculation View's projection node before the join or filter, it can block pushdown. HANA's optimizer might not be able to evaluate these complex expressions early enough to apply filters.
        3. Attribute View/Analytic View Usage (Legacy, additional possibility): If the Calculation View is built on top of older Attribute or Analytic Views, or if complex logic in those underlying views prevents pushdown, that can also block early filtering. (For a modern Calculation View built directly on tables, this does not apply.)
      • Actions to Address:
        1. Correct Join Cardinality (Most Common Fix):
          • Action: In the Calculation View editor, review the join properties between FACT_TABLE and its related dimension tables. For each join, ensure the correct cardinality is selected (e.g., if a fact record has one CALENDAR_YEAR, it's typically N:1 from fact to dimension).
          • Rationale: Correct cardinality hints the optimizer about the uniqueness of the join, allowing it to safely push down filters and aggregations across joins.
        2. Simplify Projections / Push Down Calculations:
          • Action: Examine how CALENDAR_YEAR and REGION are handled within the Calculation View. If they are calculated columns, try to simplify them or push the calculation logic further down to the source table level if possible (e.g., use direct column references or simpler expressions).
          • Rationale: Simpler projections or pushing calculations closer to the source tables allows the optimizer more flexibility to apply filters early, leveraging the table's partitioning and intrinsic indexing.
        3. Review Pruning Settings / Explicit Filter Pushdown:
          • Action: Ensure the Calculation View is designed to utilize pruning on join columns if applicable. In some complex scenarios, you might use hints or specific modeling techniques to force filter pushdown.
          • Rationale: Explicitly guiding the optimizer can ensure it leverages the table's inherent partitioning structure.
        4. Verify Partitioning Key Usage: Double-check that CALENDAR_YEAR is indeed the direct column used for RANGE partitioning on FACT_TABLE and that the incoming filter CALENDAR_YEAR = '2024' directly matches the partitioning scheme. (Though the scenario implies the partitioning exists, the non-pushdown is the issue).
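
Helper queries (hedged) for Scenarios 2 and 4. The table names are the ones used in the scenarios; the column sets of monitoring and catalog views can vary slightly by revision, so verify them on your system.

SQL
-- Scenario 2: log segment states per host. Many segments stuck in non-'Free'
-- states while log backups succeed usually points to an open transaction.
SELECT HOST, STATE, COUNT(*) AS SEGMENTS
  FROM M_LOG_SEGMENTS
 GROUP BY HOST, STATE;

-- Scenario 4: confirm both join tables are hash-partitioned on the join columns
-- with the same number of partitions (a precondition for node-local joins).
SELECT SCHEMA_NAME, TABLE_NAME, PARTITION_SPEC
  FROM TABLES
 WHERE TABLE_NAME IN ('TABLE_A', 'TABLE_B');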
