A solid backup and recovery strategy is the cornerstone of any robust SAP system. It ensures business continuity and protects against data loss due to hardware failures, corruption, natural disasters, or human error. This section will detail backup and recovery for both traditional SAP Basis databases (like Oracle, SQL Server, MaxDB via DB13) and SAP HANA.
Detailed Notes for Backup and Recovery in SAP Basis (DB13)
For traditional databases managed by SAP Basis (primarily Oracle, but also SQL Server, MaxDB, DB2), the primary tool for scheduling and monitoring database tasks, including backups, is the DBA Planning Calendar (T-code DB13).
Core Concepts
- BR*Tools: SAP provides a set of command-line tools called BR*Tools (BRBACKUP, BRARCHIVE, BRRESTORE, BRCONNECT, BRSPACE, BRRECOVER) to manage Oracle databases in an SAP environment. DB13 acts as a GUI wrapper for these tools.
- Backup Types:
  - Online/Hot Backup: The database stays up and users can continue working. Requires the database to be in ARCHIVELOG mode.
  - Offline/Cold Backup: The database is shut down. Ensures data consistency but involves downtime.
  - Log Backups (Archive Logs): Essential for point-in-time recovery and for making an online backup consistent. The redo logs record every change made to the database.
- Control File Backup: The control file is critical as it contains metadata about the database's physical structure (data files, log files, timestamps). It must be backed up regularly.
- Backup Medium: Can be disk, tape, or external backup solutions (e.g., via Backint interface).
DB13 Functionality
DB13 provides a calendar-based interface to schedule and execute:
- Database Backups: Full, incremental, differential (depending on DB type).
- Offline Redo Log Backups (Archive Log Backups): Crucial for online backups.
- Check Database: Performs consistency checks.
- Update Optimizer Statistics: Ensures database optimizer has up-to-date information for query execution.
- Cleanup Logs/Traces: Housekeeping tasks.
- Reorganize Tables/Indexes: Performance optimization.
Key Process for Backup (using DB13)
- Access DB13: Log into SAP GUI, enter T-code DB13.
- DBA Planning Calendar: You'll see a calendar. Green indicates successful tasks, red for failed, yellow for warnings.
- Schedule Action:
- Double-click on a specific date/time slot.
- Select "Schedule an action."
- Choose "Database backup" or "Archive log backup."
- Select the backup type (e.g., `full_online`, `full_offline`).
- Specify the backup device type (e.g., `disk`, `tape`, `tape_auto`).
- Specify the backup class (relevant for external backup tools).
- Choose "Start immediately" or specify a "Start time."
- Confirm.
- Monitoring:
- The scheduled job appears in the calendar.
- You can double-click the entry to view the job log (`SM37` background job log) and the detailed BR*Tools log (`BRBACKUP` or `BRARCHIVE` log).
- Logs are typically found at OS level under `/oracle/<SID>/sapbackup` or similar directories.
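When triaging a failed run, it can help to pull only the warnings and errors out of a BR*Tools log. The sketch below assumes the message ID shape seen in the examples in these notes (`BR`, four digits, then `I`, `W`, or `E` as a severity letter). The function name and the sample text are illustrative, not part of BR*Tools.

```python
import re
from typing import List, Tuple

# Assumed message shape: "BR" + 4 digits + severity letter (I, W, or E).
BR_MESSAGE = re.compile(r"^(BR\d{4})([IWE])\s+(.*)$")


def extract_problems(log_text: str) -> List[Tuple[str, str, str]]:
    """Return (message_id, severity, text) for every warning or error line."""
    problems = []
    for line in log_text.splitlines():
        match = BR_MESSAGE.match(line.strip())
        if match and match.group(2) in ("W", "E"):
            problems.append((match.group(1), match.group(2), match.group(3)))
    return problems


# Illustrative excerpt built from the messages quoted in these notes.
sample_log = """\
BR0280I BRBACKUP time stamp...
BR0252E Function fopen() failed for '/oracle/SID/sapbackup/xyz.log' at location main-9
BR0253E errno 13: Permission denied
"""

for message_id, severity, text in extract_problems(sample_log):
    print(message_id, severity, text)
```

Informational lines are skipped, so the output points straight at the failing call and the OS error behind it.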
Recovery Process (General Principles)
Recovery typically involves:
- Restore Data Files: Restore the last full data backup from the backup medium.
- Apply Delta/Incremental Backups: If applicable, apply subsequent delta/incremental backups.
- Apply Redo/Archive Logs: Apply all redo/archive logs generated since the last data backup until the desired point in time or the time of failure.
- Open Database: Open the database for normal operation.
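- Tools: `BRRESTORE` (for data file restore), `BRRECOVER` (for full database recovery, combining restore and log application), and native database tools.
- Recovery Scenarios:
  - Complete Recovery: Recover to the last possible consistent state (all committed transactions applied).
  - Point-in-Time Recovery: Recover to a specific timestamp. Data committed after this point is lost. Requires `ARCHIVELOG` mode.
  - Incomplete Recovery: Recovery that stops before the most recent state because not all redo is applied. Point-in-time recovery is the usual example; recovery that ends at the last available archive log is another. Restoring only part of the database (e.g., a single lost data file) is a partial restore and can still end in a complete recovery.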
Important Configurations (init<DBSID>.sap)
BR*Tools read their configuration from the `init<DBSID>.sap` profile (e.g., `initPRD.sap` for a database with the SID PRD). This file is located in `$ORACLE_HOME/dbs` (Unix/Linux) or `%ORACLE_HOME%\database` (Windows).
Key parameters in `init<DBSID>.sap` for backup:
- `backup_type`: Specifies the default backup type (e.g., `online`, `offline`).
- `backup_mode`: Specifies the backup scope (e.g., `all` for all data files, `full` for a complete backup, or a list of tablespaces/data files for a partial backup).
- `backup_dev_type`: Defines the backup medium (e.g., `disk`, `tape`, `tape_auto`, `rman_disk`, `rman_tape`, `util_file`, `util_file_online`). `util_file` and `util_file_online` are for Backint.
- `backup_root_dir`: The root directory for disk backups.
- `compress`: Enables/disables backup compression.
- `expir_period`: Tape retention period.
- `tape_use_count`: Maximum number of times a tape can be written to.
- `volume_backup`: List of tape volumes for data backups.
- `volume_archive`: List of tape volumes for archive log backups.
- `parallel_degree`: Degree of parallelism for backup.
- `util_par_file`: Path to the parameter file for the Backint interface (commonly named `init<DBSID>.utl`).
- `rman_channels`: Number of RMAN channels for parallel backup.
Note: Parameters specified in DB13's "Schedule Action" dialog (or on the BR*Tools command line) override values in `init<DBSID>.sap`.
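The override rule in the note above can be illustrated with a small sketch. It assumes the profile consists of simple `name = value` lines with `#` comments; multi-line values in parentheses are not handled. The function names and the sample excerpt are hypothetical, not part of BR*Tools.

```python
from typing import Dict, Optional


def parse_profile(text: str) -> Dict[str, str]:
    """Parse simple 'name = value' lines; '#' starts a comment."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def effective_settings(profile: Dict[str, str],
                       overrides: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Explicit overrides win; None means 'not specified, use the profile'."""
    merged = dict(profile)
    merged.update({k: v for k, v in overrides.items() if v is not None})
    return merged


# Illustrative excerpt, not a complete profile.
sample_profile = """
# defaults used when nothing else is specified
backup_type = online
backup_dev_type = disk
backup_root_dir = /oracle/SID/sapbackup
"""

profile = parse_profile(sample_profile)
settings = effective_settings(
    profile, {"backup_type": "offline", "backup_dev_type": None}
)
print(settings)
```

Here only `backup_type` was set explicitly for the run, so it replaces the profile default while the device type and root directory still come from the profile.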
Common Issues in DB13 Backups
- Permissions Issues: The `ora<sid>` or `<sid>adm` user lacks read/write permissions on backup directories or BR*Tools executables (error: `errno 13: Permission denied`).
- Incorrect `init<DBSID>.sap` Parameters: Typo in `backup_dev_type`, wrong paths, etc.
- Lack of Disk Space: For disk backups, the target directory runs out of space.
- Tape Device Issues: Tape drive not accessible, tape full, wrong tape loaded, tape device not configured correctly.
- BR*Tools Not Updated: Mismatch between kernel version and BR*Tools patch level.
- Database Not in ARCHIVELOG Mode: Trying to perform an online backup when the database is not in the required mode.
- `SAP_COLLECTOR_FOR_PERFMONITOR` job (report `RSCOLL00`): This job also updates DB13 status. If it is failing, DB13 may show an incorrect status.
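Two of the issues above, permissions and disk space, can be checked before a disk backup starts. The following is a minimal sketch using only the Python standard library. It assumes it runs as the same OS user as the backup, and `preflight` and its arguments are hypothetical names.

```python
import os
import shutil
import tempfile
from typing import List


def preflight(backup_dir: str, required_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    if not os.path.isdir(backup_dir):
        return [f"{backup_dir} does not exist or is not a directory"]
    problems = []
    # Creating files in a directory needs write and execute permission on it.
    if not os.access(backup_dir, os.W_OK | os.X_OK):
        problems.append(f"current user cannot write to {backup_dir}")
    free = shutil.disk_usage(backup_dir).free
    if free < required_bytes:
        problems.append(f"only {free} bytes free, {required_bytes} required")
    return problems


# Demonstration against a scratch directory instead of a real backup path.
with tempfile.TemporaryDirectory() as scratch:
    print(preflight(scratch, required_bytes=0))
    print(preflight(os.path.join(scratch, "missing"), required_bytes=0))
```

In practice the required size would come from the size of the previous backup, and the check would run against the directory configured in `backup_root_dir`.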
Detailed Notes for Backup and Recovery in SAP HANA
SAP HANA's in-memory nature and unique architecture necessitate a specific backup and recovery strategy.
Core Concepts
- Persistence Layer: HANA's persistence layer ensures data durability by regularly saving in-memory data to disk (savepoints) and continuously writing all transactions to redo logs.
- Data Backups: Snapshots of the data volumes (including row and column store data) at a specific point in time.
- Log Backups: Backups of the redo logs, capturing all changes since the last log backup. Crucial for point-in-time recovery.
- Backup Catalog: A metadata repository that stores information about all data and log backups (location, timestamp, type, size). Essential for recovery.
- Multi-Tenant Database Containers (MDC): Each tenant database has its own isolated backup and recovery process, managed by the SystemDB. SystemDB also requires its own backups.
- Backint Interface: SAP HANA provides a certified Backint API that allows seamless integration with third-party enterprise backup solutions (e.g., Commvault, NetBackup, EMC Data Domain, IBM Spectrum Protect). This is the recommended approach for production environments.
- File System Backups: Backups written directly to a file system (local disk, NFS share). Suitable for smaller systems or testing.
Types of Backups in SAP HANA
- Data Backups:
  - Full Data Backup: A complete copy of all data in the database. Forms the basis for any recovery. Should be performed regularly (e.g., daily).
  - Delta Backups: Capture changes since a previous data backup.
    - Incremental Backup: Contains all changes since the last data backup of any type (full, differential, or incremental). Smallest delta backup size.
    - Differential Backup: Contains all changes since the last successful full data backup. Size increases with time from the last full.
  - Purpose: For recovering the database to a specific state.
- Log Backups:
  - Contain the redo log entries that were written since the last log backup.
  - Created automatically from the log volumes, either when a log segment is full or when the log backup timeout elapses.
  - Crucial for point-in-time recovery and for bringing the database to its most recent state after a data backup restore.
  - The `log_mode` parameter in `global.ini` determines how logs are handled (e.g., `normal` for continuous log backups, `overwrite` for development, where log segments are overwritten without being backed up). For production, `normal` is mandatory.
- Catalog Backup:
  - A backup of the backup catalog itself.
  - In an MDC setup, the SystemDB's catalog backup is vital for recovering the entire system, including tenant databases.
  - Written automatically after each data or log backup, either to the file system or through Backint.
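The size difference between the two delta types can be shown with a small calculation. The sketch assumes one delta backup per day after a full backup and that each day's changes touch different data, so the differential figure is an upper bound. The daily change volumes are made-up illustration values.

```python
from itertools import accumulate
from typing import List, Tuple


def delta_sizes(daily_change_gb: List[int]) -> List[Tuple[int, int]]:
    """Per day after the full backup: (incremental_gb, differential_gb).

    An incremental backup holds only that day's changes; a differential
    backup holds everything changed since the full backup.
    """
    incremental = list(daily_change_gb)
    differential = list(accumulate(daily_change_gb))
    return list(zip(incremental, differential))


# Hypothetical change volumes in GB for three days after a full backup.
for day, (inc, diff) in enumerate(delta_sizes([5, 3, 8]), start=1):
    print(f"day {day}: incremental {inc} GB, differential {diff} GB")
```

Incrementals stay small but a recovery needs every one of them since the last full or differential backup, whereas a recovery needs only the latest differential on top of the full backup.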
Backup Methods in SAP HANA
- SAP HANA Cockpit:
  - Provides a web-based GUI for scheduling and monitoring backups.
  - Recommended for modern HANA administration.
  - Offers granular control over backup types, destinations, and schedules for SystemDB and tenant databases.
- SAP HANA Studio (Legacy):
  - Deprecated for most administrative tasks, but still provides backup capabilities for older HANA versions.
- SQL Commands:
  - `BACKUP DATA USING FILE ('<path>/<filename>')`: For file system data backups. The file name acts as a prefix for the backup files that HANA creates.
  - `BACKUP DATA USING BACKINT ('<external_backup_id>')`: For Backint data backups.
  - `BACKUP LOG ALL USING FILE '<path>/<filename>'`: For manual log backups to file system.
  - `BACKUP LOG ALL USING BACKINT ('<external_backup_id>')`: For manual log backups to Backint.
- `hdbsql` Command-Line Tool: Can execute SQL backup commands.
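Scripts that trigger file-based data backups usually generate a unique prefix per run. The sketch below only builds the statement text, using the parenthesized form `USING FILE ('<prefix>')` and the `INCREMENTAL` and `DIFFERENTIAL` keywords for delta backups. The function name, directory, and naming scheme are assumptions for illustration; the resulting string would still have to be executed through `hdbsql` or a database client.

```python
from datetime import datetime
from typing import Optional

# Keyword inserted after BACKUP DATA for each backup kind.
DELTA_KEYWORD = {
    "full": "",
    "incremental": " INCREMENTAL",
    "differential": " DIFFERENTIAL",
}


def backup_data_sql(directory: str, kind: str = "full",
                    now: Optional[datetime] = None) -> str:
    """Build a file-based BACKUP DATA statement with a timestamped prefix."""
    if kind not in DELTA_KEYWORD:
        raise ValueError(f"unknown backup kind: {kind}")
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    prefix = f"{directory.rstrip('/')}/{stamp}_{kind.upper()}"
    if "'" in prefix:
        raise ValueError("prefix must not contain a single quote")
    return f"BACKUP DATA{DELTA_KEYWORD[kind]} USING FILE ('{prefix}')"


# Hypothetical backup directory.
print(backup_data_sql("/hana/backup/data"))
print(backup_data_sql("/hana/backup/data", kind="incremental"))
```

Rejecting quotes in the prefix keeps the generated statement well-formed; a timestamped prefix avoids overwriting an earlier backup with the same name.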
Recovery Process in SAP HANA
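Recovery is initiated via SAP HANA Cockpit (recommended), SAP HANA Studio (legacy), or SQL `RECOVER` statements. The `hdbbackupdiag` and `hdbbackupcheck` command-line tools support the process by identifying the backups a recovery needs and by checking backup files for integrity; they do not perform the recovery themselves.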
- Choose Recovery Type:
  - Recover to most recent state: Recovers to the latest possible point, requiring the latest data backup and all subsequent log backups.
  - Recover to specific point in time: Recovers to a specified timestamp, requiring a data backup from before that time and all subsequent log backups up to that time.
  - Recover to specific data backup: Recovers to the state of a specific data backup, discarding all changes made after that backup.
  - Recover to specific log position: Advanced option for specific scenarios.
- Locate Backups: The system uses the backup catalog to find the necessary data and log backups. If the catalog is lost, it needs to be restored first.
- Start Recovery: The HANA system will automatically identify the required backups, restore them, and apply the redo logs to reach the target recovery point.
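How the required backups follow from the catalog can be sketched in a few lines. This is a simplified model, not HANA's actual selection algorithm: timestamps are plain integers, every backup is assumed to be intact, and the `Backup` class and `recovery_chain` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backup:
    kind: str  # "full", "differential", "incremental", or "log"
    time: int  # simplified timestamp, e.g. hours since some start


def recovery_chain(catalog: List[Backup], target: int) -> List[Backup]:
    """Pick the backups needed to reach `target`, in the order they are applied."""
    data = sorted((b for b in catalog if b.kind != "log" and b.time <= target),
                  key=lambda b: b.time)
    fulls = [b for b in data if b.kind == "full"]
    if not fulls:
        raise ValueError("no full data backup at or before the target time")
    chain = [fulls[-1]]  # latest full backup before the target
    after_full = [b for b in data if b.time > chain[0].time]
    diffs = [b for b in after_full if b.kind == "differential"]
    if diffs:
        chain.append(diffs[-1])  # the latest differential replaces earlier deltas
    base_time = chain[-1].time
    chain += [b for b in after_full
              if b.kind == "incremental" and b.time > base_time]
    last_data_time = chain[-1].time
    logs = sorted((b for b in catalog
                   if b.kind == "log" and b.time > last_data_time),
                  key=lambda b: b.time)
    for log in logs:
        chain.append(log)
        if log.time >= target:  # this log backup covers the target time
            break
    return chain


catalog = [Backup("full", 0), Backup("incremental", 6),
           Backup("differential", 12), Backup("incremental", 18),
           Backup("log", 19), Backup("log", 20), Backup("log", 21),
           Backup("full", 24)]
for step in recovery_chain(catalog, target=20):
    print(step.kind, step.time)
```

The incremental backup taken before the differential is skipped because the differential already contains its changes, and log replay stops at the first log backup that reaches the target time.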
Important Configuration Parameters (global.ini, etc.)
- `global.ini`, section `[persistence]`:
  - `basepath_datavolumes`: Location of data volumes (where HANA's active data resides).
  - `basepath_logvolumes`: Location of log volumes (where HANA's active redo logs are written).
  - `basepath_databackup`: Default path for file-based data backups.
  - `basepath_logbackup`: Default path for file-based log backups.
  - `basepath_catalogbackup`: Location for backup catalog backups (if not using Backint).
  - `log_mode`: `normal` (for production, continuous log backups) or `overwrite` (for non-prod/dev, logs overwritten).
  - `log_backup_timeout_s`: How long HANA waits, in seconds, before forcing a log backup even though the current log segment is not yet full.
  - `enable_auto_log_backup`: Enabled by default (and recommended) for continuous log backups.
- `global.ini`, section `[backup]`:
  - `log_backup_interval_s`: Frequency of automatic log backups (seconds).
  - `parallel_data_backup_backint_channels`: Number of parallel channels for Backint data backups.
  - `check_integrity`: Enables integrity checks during backup.
  - `data_backup_parameter_file`: Path to the Backint parameter file for data backups.
  - `log_backup_parameter_file`: Path to the Backint parameter file for log backups.
  - `catalog_backup_parameter_file`: Path to the Backint parameter file for catalog backups.
  - `file_backup_buffers`: Number of internal buffers for file-based backups.
  - `file_backup_buffer_size`: Size of each buffer for file-based backups.
- Backint Parameter File (e.g., `backint.conf`): This file is crucial for Backint integration. Its contents are defined by the third-party backup tool; typical entries include:
  - `EXECUTABLE`: Path to the Backint executable provided by the backup vendor.
  - `PARAMETER_FILE`: Path to the vendor-specific parameter file.
  - `LOG_FILE`: Path to the Backint log file.
  - `COMPRESSION_LEVEL`, `ENCRYPTION`, etc. (vendor-specific)
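A quick offline check of a `global.ini` copy can catch the two settings that break the log backup chain. The sketch below uses Python's `configparser` and assumes that a key missing from the file means the system default applies. It looks for each key in both `[persistence]` and `[backup]`, so it does not depend on which section holds it. The function name and sample content are illustrative.

```python
import configparser
from typing import List


def log_backup_findings(ini_text: str) -> List[str]:
    """Flag settings that would prevent continuous log backups."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)

    def lookup(key: str) -> str:
        for section in ("persistence", "backup"):
            value = parser.get(section, key, fallback=None)
            if value is not None:
                return value.strip().lower()
        return ""  # not set in this file: the default applies

    findings = []
    if lookup("log_mode") == "overwrite":
        findings.append("log_mode is overwrite: point-in-time recovery is impossible")
    if lookup("enable_auto_log_backup") in ("no", "false"):
        findings.append("automatic log backup is disabled")
    return findings


# Illustrative file content, not taken from a real system.
sample_ini = """
[persistence]
log_mode = overwrite
enable_auto_log_backup = no
"""

for finding in log_backup_findings(sample_ini):
    print(finding)
```

On a running system the effective values are better read from the database itself, since settings can be layered across several configuration levels.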
Common Issues in SAP HANA Backups
- Insufficient Disk Space: For file-based backups, the target backup directory runs out of space.
- Backint Configuration Issues: Incorrect paths in the Backint parameter file, incorrect permissions for the Backint executable, network connectivity issues to the backup server.
- Permissions Issues: `<sid>adm` user lacking read/write permissions to backup directories or Backint executable.
- Network Connectivity: Issues between HANA server and the backup target (NFS share, backup server).
- Log Mode (`log_mode=overwrite`): If set to overwrite in a production system, point-in-time recovery is impossible.
- Backup Catalog Corruption: Can prevent successful recovery. Regular catalog backups or ensuring Backint manages the catalog is critical.
- Tenant Database Backup Issues: In MDC, SystemDB backup might succeed, but a tenant backup fails (e.g., due to specific tenant configuration or user permissions).
- Missing Log Backups: If `enable_auto_log_backup` is `false` or log backups fail, recovery might be limited.
- Password Expiry: The database user used for backup (`SYSTEM` or a dedicated backup user) might have an expired password.
30 Interview Questions and Answers (One-Liner) for Backup and Recovery (DB13 & HANA)
- Q: What is the T-code for the DBA Planning Calendar in SAP Basis?
- A: DB13.
- Q: What is the primary purpose of DB13?
- A: To schedule and monitor database administration tasks, including backups.
- Q: Which set of tools does DB13 internally call for Oracle database operations?
- A: BR*Tools (BRBACKUP, BRARCHIVE, BRRESTORE, etc.).
- Q: What is the difference between an online and an offline database backup?
- A: Online backup is done while DB is running; offline requires DB shutdown.
- Q: What mode must an Oracle database be in for online backups?
- A: ARCHIVELOG mode.
- Q: What are archive logs essential for in Oracle recovery?
- A: Point-in-time recovery.
- Q: Which BR*Tool is used for restoring data files?
- A: BRRESTORE.
- Q: Where is the `init<DBSID>.sap` file typically located?
- A: `$ORACLE_HOME/dbs` (Unix/Linux) or `%ORACLE_HOME%\database` (Windows).
- Q: Which parameter in `init<DBSID>.sap` specifies the backup device type?
- A: `backup_dev_type`.
- Q: What does `util_file` or `util_file_online` for `backup_dev_type` signify?
- A: Integration with a third-party backup tool via Backint.
- Q: What is the main purpose of the SAP HANA persistence layer?
- A: To ensure data durability and atomicity.
- Q: What are savepoints in SAP HANA?
- A: Periodic consistent snapshots of in-memory data written to disk.
- Q: What are the two main types of backups in SAP HANA?
- A: Data Backups and Log Backups.
- Q: What do HANA log backups contain?
- A: All committed transaction entries (redo logs).
- Q: Which `global.ini` parameter is crucial for enabling continuous log backups in HANA?
- A: `log_mode = normal`.
- Q: What is the purpose of the SAP HANA backup catalog?
- A: Stores metadata about all data and log backups.
- Q: What is a "Full Data Backup" in SAP HANA?
- A: A complete copy of all data in the database at a specific point in time.
- Q: Name two types of "Delta Backups" in SAP HANA.
- A: Incremental and Differential.
- Q: Which SAP HANA component manages isolated backups for tenant databases?
- A: The System Database (SystemDB).
- Q: What is Backint in the context of SAP HANA backups?
- A: An API for integration with third-party enterprise backup solutions.
- Q: What is the recommended GUI tool for managing HANA backups in modern systems?
- A: SAP HANA Cockpit.
- Q: If a DB13 backup fails with "Permission denied," what is the first thing to check?
- A: OS-level file permissions for BR*Tools and backup directories.
- Q: What is the impact of `log_mode = overwrite` in a production HANA system?
- A: Point-in-time recovery is not possible.
- Q: What is the `global_allocation_limit` parameter in HANA used for?
- A: Limits the total memory consumption of all HANA processes.
- Q: What is a "Point-in-Time Recovery" (PITR)?
- A: Recovering a database to a specific timestamp.
- Q: Which SAP HANA service is responsible for managing services and distributed landscapes?
- A: Name Server.
- Q: What is the purpose of the `util_par_file` parameter in `init<DBSID>.sap`?
- A: Specifies the parameter file for the Backint interface.
- Q: What does a successful DB13 entry look like in the calendar?
- A: Green.
- Q: What happens if the HANA backup catalog is corrupted?
- A: Recovery might be difficult or impossible without a separate catalog backup.
- Q: Can you perform an online backup of a traditional SAP database if it's not in ARCHIVELOG mode?
- A: No.
15 Scenario-Based Hard Questions and Answers for Backup and Recovery (DB13 & HANA)
SAP Basis (DB13) Scenarios
- Scenario: Your daily Oracle online database backup (scheduled via DB13) has started failing consistently. The DB13 job log shows "BR0280I BRBACKUP time stamp..." followed by "BR0252E Function fopen() failed for '/oracle/SID/sapbackup/xyz.log' at location main-9. BR0253E errno 13: Permission denied."
  - Q: What is the most likely cause, and what immediate action would you take?
  - A:
    - Most Likely Cause: The OS user running the BRBACKUP process (typically `ora<sid>` or `<sid>adm`) lacks write permissions to the `/oracle/SID/sapbackup` directory, or the `xyz.log` file itself has incorrect permissions. This often happens after manual file operations or changes to user/group memberships.
    - Immediate Action:
      - Log in to the OS as `ora<sid>` or `<sid>adm`.
      - Navigate to `/oracle/SID/sapbackup`.
      - Check file/directory permissions using `ls -l` (Linux/Unix) or `dir /q` (Windows).
      - Adjust permissions using `chmod` (e.g., `chmod 775 /oracle/SID/sapbackup`) and `chown` (e.g., `chown ora<sid>:sapsys /oracle/SID/sapbackup`); changing ownership usually requires root. Ensure the directory is owned by `ora<sid>` and group `sapsys`.
      - Retest the backup.
- Scenario: You need to perform an urgent point-in-time recovery for an Oracle database on your SAP system because of accidental data deletion at 10:00 AM. Your last full online backup was at 00:00 (midnight) today, and log backups run every 15 minutes.
  - Q: What is the high-level recovery strategy, and what critical prerequisite must be met for this type of recovery?
  - A:
    - High-level Recovery Strategy:
      - Shut down the SAP system and the Oracle database.
      - Restore the full online data backup from 00:00.
      - Apply all subsequent archive (redo) logs from 00:00 until just before 10:00 AM (the time of the deletion).
      - Open the database in `RESETLOGS` mode.
      - Restart the SAP system.
    - Critical Prerequisite: The Oracle database must be running in ARCHIVELOG mode, and every archive log from the start of the backup up to the target time must be available. Without ARCHIVELOG mode, the database can only be restored to the state of the last offline backup; point-in-time recovery is not possible.
- Scenario: Your nightly BRBACKUP to tape fails with an error indicating "Tape device not found" or "no available tape volumes." You've confirmed the tape library is powered on.
  - Q: What are the two most likely causes from an SAP Basis perspective, and how would you investigate?
  - A:
    - Likely Causes:
      - Incorrect `init<DBSID>.sap` configuration: The `tape_address` or `backup_dev_type` parameters might be misconfigured, pointing to a non-existent or incorrect tape device path. Or, `volume_backup` might be empty or list incorrect tape names.
      - OS-level tape device issues: The operating system itself might not be recognizing the tape device, or the device files (`/dev/rmt/0m`, etc.) might be missing or have incorrect permissions.
    - Investigation:
      - Verify `init<DBSID>.sap`: Check the `tape_address` and `volume_backup` parameters for accuracy.
      - OS-level Tape Check:
        - Use OS commands like `mt -f /dev/rmt/0 status` (Linux/Unix) or check Device Manager (Windows) to see if the tape device is detected and operational.
        - Check permissions on the tape device file.
        - Run a native OS tape utility test (e.g., `tar cvf /dev/rmt/0 /tmp/testfile`). Use a scratch tape for this, because the test overwrites whatever is on the loaded tape.
        - If using a third-party backup software, check its logs and configurations for tape device recognition.
- Scenario: You have just completed an SAP system copy using the backup/restore method. After the restore, the SAP system starts, but users report frequent "ORA-01033: ORACLE initialization or shutdown in progress" errors, even though the database appears to be open.
  - Q: What is the most probable database state issue, and how would you resolve it?
  - A:
    - Most Probable Issue: The database was probably left in `MOUNT` state rather than fully `OPEN`, or it is still in a `RECOVER` state. This usually means the recovery process wasn't completed successfully, or the `ALTER DATABASE OPEN` command wasn't executed or did not complete.
    - Resolution:
      - Log in to the OS as the `ora<sid>` user.
      - Connect with SQL*Plus: `sqlplus / as sysdba`.
      - Check the database status: `SELECT STATUS FROM V$INSTANCE;` should return `OPEN`, and `SELECT OPEN_MODE FROM V$DATABASE;` should return `READ WRITE`.
      - If the database is not `OPEN`, try `RECOVER DATABASE UNTIL CANCEL;` (if logs still need to be applied) followed by `ALTER DATABASE OPEN RESETLOGS;` (once log application is finished). `RESETLOGS` is required after an incomplete recovery or a recovery with a backup control file.
      - Monitor `alert.log` for any errors during startup.
- Scenario: You notice that your DB13 "Update Optimizer Statistics" jobs are taking an unusually long time and sometimes fail. Your database is Oracle.
  - Q: What are potential causes for this performance degradation of statistics updates, and what actions might you consider?
  - A:
    - Potential Causes:
      - Excessive Data Growth: The amount of data in tables has grown significantly, making statistics collection a longer process.
      - Inadequate Resources: The database server itself might be experiencing resource contention (CPU, I/O) during the statistics collection, especially if other heavy processes are running concurrently.
      - Outdated BRCONNECT/Oracle patches: Bugs or inefficiencies in older versions of BRCONNECT or the Oracle optimizer can cause slowness.
    - Actions:
      - Schedule during off-peak hours: Ensure the statistics update job runs when system load is minimal.
      - Adjust `init<DBSID>.sap` parameters:
        - `stats_sample_size`: Reduce the sampling percentage if 100% is not strictly necessary for large tables.
        - `stats_parallel_degree`: Increase parallelism for statistics collection if resources allow.
      - Investigate OS/DB Resources (ST06/ST04): Monitor CPU, I/O, and memory during the job run to identify bottlenecks.
      - Consider Partial Statistics Update: Instead of updating all statistics, focus on critical, frequently changing tables.
      - Update BR*Tools and Oracle Patches: Ensure you are on the latest recommended patch levels.
SAP HANA Scenarios
- Scenario: Your SAP HANA system has crashed. When attempting to perform a recovery via HANA Cockpit, you cannot see any available data backups in the catalog, even though you know daily backups were running.
  - Q: What is the likely problem, and what is your approach to initiating recovery?
  - A:
    - Likely Problem: The SAP HANA backup catalog itself (which contains metadata about the backups) is corrupted or inaccessible. This often happens if the system database's data volumes were also affected, or if the catalog backup location was pointing to a path that is no longer available.
    - Approach to Initiating Recovery:
      - Use a Catalog Backup (if available): HANA writes a catalog backup after each data and log backup. If one still exists on the file system or in the Backint tool, point the recovery at that location so HANA can read it. `hdbbackupdiag` can help determine which backups a recovery needs, and `hdbbackupcheck` can verify that individual backup files are intact.
      - Recovery Without Catalog (Advanced): If no catalog is available, specify the full data backup explicitly, either in the recovery dialog or in a `RECOVER DATA` SQL statement that points to the backup location (file system or Backint). This returns the database to the state of that data backup. Applying log backups on top of it without a usable catalog is more complex and error-prone.
      - Backint Integration Check: If using Backint, ensure the Backint configuration parameters are correct and the external backup tool's server is reachable and has the backup metadata.
- Scenario: Daily full data backups of your HANA tenant database are taking excessively long, impacting system performance during the backup window. You are using Backint for backups.
  - Q: What are the key areas to investigate for optimizing HANA data backup performance using Backint?
  - A:
    - Areas to Investigate:
      - Parallelism:
        - HANA Side: Check the `parallel_data_backup_backint_channels` parameter in `global.ini` (`[backup]` section). Increase it if the backup target and network can handle more parallel streams.
        - Backint Side: Check the configuration of the third-party backup software for the number of streams/channels it can support. Ensure it's configured to match HANA's parallelism.
      - Network Bandwidth/Latency to Backup Target:
        - Verify network throughput and latency between the HANA server and the Backint media agent/backup server.
        - Check if network congestion is occurring during the backup window.
        - Consider increasing network bandwidth or optimizing network paths.
      - Backup Target Performance:
        - The performance of the backup storage (disk or tape system) itself is critical. Is it a fast SAN, local SSDs, or a slow NAS?
        - Check I/O performance on the backup target.
      - HANA Data Compression: Data compression within HANA (column store) can reduce backup size and time. Ensure optimal compression is configured.
- Scenario: You attempt to perform a point-in-time recovery on a SAP HANA tenant database, but the recovery fails, indicating "missing log backups." You check `global.ini` and confirm `log_mode=normal` and `enable_auto_log_backup=true`.
  - Q: What are common reasons for "missing log backups" in this scenario, and how would you troubleshoot?
  - A:
    - Common Reasons:
      - Log Backup Failures: Automated log backups might have been failing silently or intermittently (e.g., due to temporary network issues, insufficient space on the log backup destination, or Backint issues).
      - Log Backup Destination Inaccessibility: The configured log backup destination (file system or Backint) became unavailable or ran out of space after some log backups succeeded, leading to a gap.
      - Manual Intervention/Wrong `log_mode` History: While `log_mode=normal` is set now, it might have been temporarily changed to `overwrite`, or log backups were disabled at some point, causing a break in the log chain.
    - Troubleshooting:
      - Check Log Backup History: Use HANA Cockpit (or the `M_BACKUP_CATALOG` SQL view) to review the log backup history. Look for gaps or failed log backups.
      - Check Log Backup Destination: Verify the accessibility and free space of the file-based log backup path or the Backint target.
      - Check Custom Script Logs (if applicable): If custom scripts (e.g., `hdblogbackup.py`) are used, check their logs.
      - Backint Logs: If using Backint, check `backint.log` and the specific log files of the third-party backup software for errors related to log backups.
      - HANA Trace Files: Examine `backup.log` and the `indexserver` and `nameserver` trace files for any errors related to log archiving or backup.
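Looking for gaps in the log backup history can be automated once the first and last log position of each log backup have been exported from the catalog. The sketch below assumes that consecutive log backups of one service are contiguous, meaning each backup starts at the position where the previous one ended. The function name and the positions are made up for illustration.

```python
from typing import List, Tuple


def find_gaps(segments: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the (from, to) log position ranges not covered by any backup.

    Each segment is (first_position, last_position) of one log backup.
    """
    gaps = []
    covered_to = None
    for first, last in sorted(segments):
        if covered_to is not None and first > covered_to:
            gaps.append((covered_to, first))
        covered_to = last if covered_to is None else max(covered_to, last)
    return gaps


# Hypothetical log positions for one service; 300 to 450 was never backed up.
history = [(100, 200), (200, 300), (450, 500)]
print(find_gaps(history))
```

A non-empty result means recovery can only proceed up to the start of the first gap, which is why the check should run per service and ideally as part of regular monitoring rather than during an outage.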
- Scenario: Your SAP HANA system has been upgraded to a new SPS. Post-upgrade, you observe that the automatic log backups are no longer being written to the configured Backint destination, but are accumulating in the local `log_backup` directory.
  - Q: What is the most probable cause, and how would you rectify it?
  - A:
    - Most Probable Cause: The upgrade process might have reset or modified the `global.ini` parameters related to Backint integration, or the new HANA revision requires updated Backint executables or configurations. Specifically, the Backint parameter file settings (`log_backup_parameter_file`, `data_backup_parameter_file`) might be incorrect, or the log backup destination might have reverted from Backint to file (`log_backup_using_backint`).
    - Rectification:
      - Verify `global.ini` Parameters: Using HANA Cockpit or `ALTER SYSTEM ALTER CONFIGURATION`, check the `[backup]` section, especially:
        - `log_backup_using_backint`: Ensure it is set to `true`.
        - `log_backup_parameter_file`: Verify the correct path to your Backint parameter file.
        - `data_backup_parameter_file`: Also verify this for data backups.
      - Check Backint Executable/Permissions: Ensure the Backint executable provided by the third-party vendor is compatible with the new HANA SPS and has correct OS-level permissions (`<sid>adm`).
      - Restart Services: Backup parameters normally take effect once the configuration is reloaded. If log backups still go to the file system after the change, a restart of the index server or potentially the HANA instance might be required.
      - Consult SAP Notes/Vendor Documentation: Check SAP Notes related to the new SPS for any specific backup configuration changes or requirements, and consult your backup vendor's documentation for their certified Backint version with the new SPS.
- Scenario: You need to recover an SAP HANA database to a specific point in time, but the only full data backup available is from two weeks ago. All log backups since then are available.
  - Q: Is this recovery scenario feasible, and what are the implications regarding recovery time and resources?
  - A:
    - Feasibility: Yes, this scenario is feasible, assuming `log_mode=normal` was maintained throughout the two weeks and all log backups are available and intact.
    - Implications:
      - Increased Recovery Time: The primary implication is a significantly longer recovery time. HANA will first restore the two-week-old full data backup, and then it will have to apply all redo logs (two weeks' worth) to reach the desired point in time. This log application phase is CPU and I/O intensive.
      - Resource Consumption: This process will consume substantial CPU, memory, and I/O resources on the HANA server. The time taken will depend heavily on the transaction volume (amount of log data), the number of log backups, and the server's specifications.
      - Disk Space: Ensure sufficient disk space is available for the restored data volumes and for processing the large number of log backups.
      - Increased Risk: A longer recovery window inherently increases the risk of further issues or interruptions during the recovery process. This highlights the importance of frequent full or delta data backups to minimize the log application phase.
-
Scenario: A new SAP HANA MDC system has just been implemented. You are responsible for setting up the backup strategy.
- Q: What are the two distinct types of databases you need to consider for backup in an MDC setup, and why is backing up the SystemDB crucial?
- A:
- Two Distinct Database Types:
- System Database (SystemDB): Manages the entire MDC landscape, including tenant databases.
- Tenant Databases: The actual databases where application data resides.
- Why SystemDB Backup is Crucial:
- Central Control: The SystemDB contains the central backup catalog and metadata for all tenant databases.
- Recovery Dependency: Without a healthy SystemDB backup (especially its catalog), recovering individual tenant databases might be impossible, as the SystemDB holds the blueprint of the entire multi-tenant system and points to where tenant backups are located.
- Disaster Recovery: In a full system disaster recovery, the SystemDB must be recovered first to restore the MDC landscape before tenant databases can be recovered.
Scenario: Your HANA database is showing high log_fill_ratio and log_segment_count values, and you're getting alerts about the log area being full. However, log_mode=normal and automatic log backups are enabled.
- Q: What is the primary cause of this issue, and what immediate and long-term actions would you take?
- A:
- Primary Cause: Automatic log backups are not occurring or are failing, causing the log segments in the active log area to accumulate without being backed up and freed.
- Immediate Action:
- Check Log Backup Status: Use HANA Cockpit (or SQL) to check the status of recent log backups. Are they failing?
- Check Log Backup Destination: Verify accessibility and free space of the log backup path (basepath_logbackup) or the Backint target for logs.
- Manually Trigger Log Backup: If possible, execute a manual log backup (BACKUP LOG ALL) to free up log segments. This might resolve the immediate crisis.
- Check for Blocking Transactions: Extremely long-running or uncommitted transactions can also keep log segments occupied. Check for long-running transactions (e.g., using the M_TRANSACTIONS view).
- Long-Term Action:
- Root Cause Analysis for Failed Logs: Investigate the specific reason for log backup failures (e.g., Backint issues, network, permissions, target storage problems).
- Monitoring: Implement robust monitoring for log backup success/failure and log area fill ratio.
- Tuning log_backup_timeout_s: In log_mode=normal, log backups are triggered automatically when a log segment fills up or when this timeout expires. Ensuring the timeout is not excessively long for your transaction volume can help manage log area growth.
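The "check log backup status" and "monitoring" steps above can be sketched as a small classifier over backup-catalog-like rows. This is only an illustration under stated assumptions: the row keys (entry_type, state, start_time), the 30-minute threshold, and the sample data are hypothetical, and in a real system the rows would come from whatever query you run against the backup catalog.

```python
from datetime import datetime, timedelta


def diagnose_log_backups(entries, now, max_age=timedelta(minutes=30)):
    """Classify log backup health from catalog-like rows (hypothetical keys)."""
    log_rows = [e for e in entries if e["entry_type"] == "log backup"]
    ok_times = [e["start_time"] for e in log_rows if e["state"] == "successful"]
    if not ok_times:
        return "no successful log backup"
    last_ok = max(ok_times)
    # A failure newer than the last success means backups are failing right now.
    if any(e["state"] == "failed" and e["start_time"] > last_ok for e in log_rows):
        return "failing"
    # No failures, but nothing recent either: backups are not being triggered.
    if now - last_ok > max_age:
        return "stalled"
    return "healthy"


now = datetime(2024, 1, 1, 12, 0)
sample = [
    {"entry_type": "log backup", "state": "successful",
     "start_time": datetime(2024, 1, 1, 10, 0)},
    {"entry_type": "log backup", "state": "failed",
     "start_time": datetime(2024, 1, 1, 11, 0)},
    {"entry_type": "complete data backup", "state": "successful",
     "start_time": datetime(2024, 1, 1, 2, 0)},
]
print(diagnose_log_backups(sample, now))  # failing
```

Separating "failing" from "stalled" matters for the follow-up: the first points at the destination or Backint agent, the second at the trigger configuration.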
Scenario: You are trying to perform a full system recovery of a distributed SAP HANA system (scale-out). After restoring the first host's data, the recovery process hangs when trying to bring up the services.
- Q: What critical component needs to be recovered or started first in a distributed HANA recovery, and why?
- A:
- Critical Component: The Name Server on the master host.
- Why: In a distributed HANA system, the Name Server maintains the topology of the entire landscape, including which services run on which hosts and the location of data partitions. If the Name Server on the master host is not healthy or recovered first, the other services cannot properly start and integrate into the distributed system, leading to recovery failures or hangs. You often need to restore the Name Server's persistence (which is part of the overall data backup) and ensure it starts successfully before proceeding with other Index Servers.
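The ordering rule in this answer (master Name Server first, then everything else) can be expressed as a simple sort. The sketch below is purely illustrative: the host names, the service and role labels, and the dictionary layout are hypothetical and do not correspond to any HANA API.

```python
def recovery_start_order(services):
    """Order services so the master name server comes first."""
    def rank(svc):
        if svc["service"] == "nameserver":
            return 0 if svc["role"] == "master" else 1
        return 2  # index servers and any other services come last

    # sorted() is stable, so services with equal rank keep their input order.
    return sorted(services, key=rank)


# Hypothetical two-host scale-out landscape.
landscape = [
    {"host": "hana02", "service": "indexserver", "role": "worker"},
    {"host": "hana01", "service": "indexserver", "role": "master"},
    {"host": "hana02", "service": "nameserver", "role": "slave"},
    {"host": "hana01", "service": "nameserver", "role": "master"},
]
for svc in recovery_start_order(landscape):
    print(svc["host"], svc["service"], svc["role"])
```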
Scenario: A new compliance requirement dictates that all backup files must be encrypted. Your current SAP HANA backup solution does not include native encryption for file-based backups.
- Q: What are two methods to achieve encryption for your SAP HANA backups, and what are their pros and cons?
- A:
- Method 1: Operating System (OS) Level Encryption:
- Pros: Can be implemented immediately with existing file-based backups. Does not require changes to HANA configuration.
- Cons: Performance overhead on the OS level during backup. Key management is handled by the OS/system administrator, potentially outside SAP. Recovery to a different system (e.g., disaster recovery site) requires the decryption capabilities of the OS on the target system.
- Method 2: Third-Party Backup Solution with Backint Integration:
- Pros: Recommended and robust. Most enterprise backup solutions (integrated via Backint) offer native encryption capabilities (at the media agent or target storage level). This offloads encryption overhead from HANA and the OS. Centralized key management within the backup solution.
- Cons: Requires investment in a certified third-party backup solution. Implementation and configuration can be more complex than simple OS-level encryption.
- Method 3 (Less Common): SAP HANA Data Volume Encryption:
- Pros: Encrypts data at rest within HANA's data volumes.
- Cons: Protects data on disk, not the backup stream: backups are written unencrypted unless backup encryption is handled separately, so on its own it does not meet the requirement. Can have some performance implications on HANA itself. Requires careful key management within HANA (SSFS, external KMS).
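For Method 1, one possible OS-level approach is to encrypt each finished backup file with GnuPG in symmetric mode. The sketch below only builds the command line and does not run it. It assumes GnuPG 2.1 or later is installed on the host, and the backup file and passphrase file paths are hypothetical.

```python
def build_gpg_encrypt_cmd(backup_file, passphrase_file):
    """Build a gpg command that writes an AES256-encrypted copy of backup_file."""
    if not backup_file or not passphrase_file:
        raise ValueError("both paths are required")
    return [
        "gpg",
        "--batch",                        # no interactive prompts
        "--pinentry-mode", "loopback",    # allow the passphrase file in gpg 2.1+
        "--passphrase-file", passphrase_file,
        "--symmetric",
        "--cipher-algo", "AES256",
        "--output", backup_file + ".gpg",
        backup_file,
    ]


# Hypothetical paths for illustration only.
cmd = build_gpg_encrypt_cmd("/hana/backup/FULL_databackup_0_1", "/root/.backup_pass")
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run(cmd, check=True). The unencrypted original would still need to be removed securely afterwards, and the passphrase file becomes part of the key management burden mentioned under the cons.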
Scenario: You need to perform a full database recovery of your SAP Oracle database, but your last successful full backup was an offline backup from a week ago. All online log backups since then are available.
- Q: Can you still perform a point-in-time recovery to a specific timestamp within the last week, or are you limited to recovering to the state of the offline backup? Explain why.
- A:
- Yes, you can still perform a point-in-time recovery to a specific timestamp within the last week.
- Explanation: An offline backup, by its nature, is a consistent snapshot of the database at the time it was taken. When you restore an offline backup, you restore the data files to that specific consistent state. Since your database has been in ARCHIVELOG mode (implied by the existence of online log backups), all transactions after that offline backup are recorded in the archive logs. Therefore, you can apply these archive logs, up to your desired point in time, to roll forward the database from the state of the offline backup to any specific moment within the last week where archive logs are available. The recovery process would involve restoring the offline backup and then applying the archive logs to the target point in time, just as you would with an online backup.
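The roll-forward logic in this explanation depends on having an unbroken chain of archive logs from the offline backup to the target time. A minimal sketch of that check follows. The record layout (sequence, first_time, next_time) and the sample timestamps are hypothetical, and real values would come from your backup logs or the database's own log history.

```python
from datetime import datetime


def logs_needed(archive_logs, backup_time, target_time):
    """Return the log sequence numbers required to roll forward to target_time."""
    needed = sorted(
        (log for log in archive_logs
         if log["next_time"] > backup_time and log["first_time"] <= target_time),
        key=lambda log: log["sequence"],
    )
    if not needed or needed[-1]["next_time"] < target_time:
        raise ValueError("available logs do not reach the target time")
    # Every sequence number must follow its predecessor, or recovery stops at the gap.
    for prev, cur in zip(needed, needed[1:]):
        if cur["sequence"] != prev["sequence"] + 1:
            raise ValueError(f"gap after log sequence {prev['sequence']}")
    return [log["sequence"] for log in needed]


# Hypothetical archive log history after an offline backup at 2024-01-01 00:00.
logs = [
    {"sequence": 100, "first_time": datetime(2024, 1, 1, 0, 30), "next_time": datetime(2024, 1, 2)},
    {"sequence": 101, "first_time": datetime(2024, 1, 2), "next_time": datetime(2024, 1, 3)},
    {"sequence": 102, "first_time": datetime(2024, 1, 3), "next_time": datetime(2024, 1, 4)},
]
print(logs_needed(logs, datetime(2024, 1, 1), datetime(2024, 1, 2, 12)))  # [100, 101]
```

If any log in the chain is missing, the database can only be recovered up to the last log before the gap, which is why archive log backups deserve the same care as the data backup itself.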