Written by Eugene Bobkov
Solutions Architect/Technical Consultant at Blue Crystal Solutions
This tech article is for those seeking to learn more about Oracle backups in Azure, with a particular focus on Azure BlobFuse.
There are multiple options for backing up a database in Azure Cloud, such as:
1. Locally attached disks
Pro
- Fast; performance is limited by disk and instance sizes
- Easy to manage and mount
- Easy to use for backups and restore
Con
- Expensive: no access tiers, and you are charged for allocated space rather than used space
- Has a maximum size
- Must be detached before it can be expanded
2. Azure Files
Pro
- Charged for used space only
- Standard filesystem functionality; can be mounted on both Windows and Linux
- Auto-extends based on required capacity
Con
- Still more expensive than blob storage
- A file share has a maximum capacity of 5 TB unless the large file share feature is enabled, which raises the limit to 100 TB (100 TB is the default for the premium type)
3. Blob container
Pro
- Charged for used space only
- Very cheap for storing files, especially write-once files such as long-term backups kept in the archive access tier
- Virtually unlimited capacity (5 PB per storage account)
Con
- OS tools cannot access the storage directly
- Cannot be used directly for backup and restore without additional abstraction layers or tools (NFS v3 over blob storage, azcopy)
These options can be utilised directly by backup scripts or, if the relevant infrastructure exists, by backup software such as Veritas NetBackup or Veeam, although additional licensing costs may apply.
However, the most common use case is the backup of a small number of reasonably sized databases (up to 1-2 TB) to an inexpensive storage solution, where the backups are kept for a defined period, typically a month or more, and then deleted.
The following methodology walks you through using Azure Blob storage by mounting it on Linux with the BlobFuse driver, as this provides a filesystem-like access experience with all the cost and management benefits of Azure Blob storage.
Azure BlobFuse installation
As the name suggests, this driver utilizes FUSE, a Linux user-space filesystem framework. FUSE masks the actual storage medium from the OS kernel and provides an interface that mimics a general-purpose filesystem and its system calls. The limitation of this approach is that it can be used only on Linux; for Windows, different solutions must be used.
The BlobFuse driver can be installed from the GitHub repository, where it is developed by the Microsoft team. As a product it has been around for some time and can be considered quite stable and feature-rich.
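For example, on a RHEL 8 family distribution the installation typically looks like the following (the Microsoft package repository URL varies per distribution and version):

# register the Microsoft package repository
sudo rpm -Uvh https://packages.microsoft.com/config/rhel/8/packages-microsoft-prod.rpm
# install the driver together with the FUSE framework it depends on
sudo yum install blobfuse fuse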
To access the Azure Blob storage account from a VM, use a security key or a managed identity; either can be specified in various ways. We would recommend using a managed identity specified in the BlobFuse configuration file, as shown below.
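A minimal sketch of such a configuration file (/etc/blobfuse_orabackup.cfg); the account name, container name and identity client ID below are placeholders, and identityClientId can be omitted when the VM has a single system-assigned identity:

accountName mystorageaccount
containerName orabackup
authType MSI
identityClientId 00000000-0000-0000-0000-000000000000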
Blob objects can be created in any available access tier (hot/cool/archive) and with any redundancy option; however, GRS can incur significant costs for cross-regional transfer.
Please note that immutable blob storage is currently not supported by BlobFuse due to the design of its writing process.
Azure BlobFuse setup
A critical part of the setup is the identification of a suitable staging area. As BlobFuse only mimics a general-purpose filesystem, the actual writing and reading happens behind the scenes in the staging area. While the RMAN process waits for a backup file to be written or read, the BlobFuse driver copies the actual data between the Azure Blob object and the staging area, from where it is presented to the user's process through file descriptors. It is highly recommended to use a VM type with enough temporary storage, as the temporary drive is the preferred location for a BlobFuse staging area: IO against the temporary drive is not counted towards the IOPS and storage bandwidth thresholds of the VM instance. High read performance of the temporary drive is essential, as during some phases BlobFuse runs a checksum that reads the whole file. This operation can be quite resource intensive, as database backup files or dump files can be exceptionally large.
The staging filesystem should be large enough to accommodate the expected data volumes. The requirement can be reduced by running the backup in multiple streams of limited file size (for example by setting MAXSETSIZE = 32G or FILESPERSET = 1), since each file is deleted from the staging area as soon as its write operation completes.
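For example, assuming the VM's temporary drive is mounted under /mnt (the exact path depends on the image and agent configuration), the staging directory can be prepared as follows:

# create the staging directory on the temporary drive
sudo mkdir -p /mnt/blobfuse_orabackup
# hand it to the OS user that will mount the filesystem (placeholder user/group)
sudo chown oracle:oinstall /mnt/blobfuse_orabackup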
While discussing the mounting procedure, it should be mentioned that there is a security limitation inherited from the FUSE level: standard OS permissions are not supported as such. By default, the filesystem is accessible only by the root user, but by specifying the allow_other option during mount, BlobFuse defaults the file/directory permissions to 0777, so other users can access any file or directory.
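Note that if the mount is performed by a non-root user, FUSE itself must be configured to honour the allow_other option, for example:

# permit non-root users to pass -o allow_other (one-off change per host)
echo "user_allow_other" | sudo tee -a /etc/fuse.conf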
To mount the filesystem at OS start, the most convenient way is to use the @reboot directive of crontab, for example:
@reboot ${HOME}/bin/blobfuse_orabackup.sh >> ${HOME}/log/blobfuse_orabackup.out 2>&1
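A minimal sketch of such a mount script; the paths are the ones used throughout this article, and the full option set is shown in the next example:

#!/bin/bash
# blobfuse_orabackup.sh -- mount the backup filesystem at boot

MOUNT_POINT=/orabackup
STAGE_DIR=/mnt/blobfuse_orabackup

# the temporary drive is wiped on deallocation, so re-create the staging directory
mkdir -p "${STAGE_DIR}"

# nothing to do if the filesystem is already mounted
mountpoint -q "${MOUNT_POINT}" && exit 0

blobfuse "${MOUNT_POINT}" \
    --tmp-path="${STAGE_DIR}" \
    --config-file=/etc/blobfuse_orabackup.cfg \
    -o allow_other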
For RMAN backups, the following parameters and options were used in our environment:
blobfuse /orabackup \
    --tmp-path=/mnt/blobfuse_orabackup \
    --config-file=/etc/blobfuse_orabackup.cfg \
    --file-cache-timeout-in-seconds=10 \
    --cache-poll-timeout-msec=100 \
    --max-eviction=1 \
    -o attr_timeout=240 \
    -o entry_timeout=240 \
    -o negative_timeout=120 \
    -o allow_other

where:
/orabackup – the mount point
--tmp-path – the staging area
--config-file – the configuration file
--file-cache-timeout-in-seconds – seconds for which a file is kept in the staging area after read/write operations have completed
--cache-poll-timeout-msec – polling interval (in milliseconds) for the eviction process which deletes files from the staging area
--max-eviction – how many files are deleted from the staging area at any given time
-o attr_timeout – seconds for which a file's attributes are cached
-o entry_timeout – seconds for which the list of files is cached
-o negative_timeout – seconds; this option is recommended, but not described in the documentation
-o allow_other – allows other users access to the mounted filesystem
If the filesystem is intended to be used for Oracle Data Pump exports, it is beneficial to set --file-cache-timeout-in-seconds to a higher value. This avoids the situation where a table is spread across multiple dump files and the first file, in which its export started, has to be re-read from Azure Blob because some blocks in it must be updated once the export of the table completes.
Unmounting of the filesystem can be done using the fusermount command, for example:
# fusermount -zu /orabackup
Considerations for Oracle RMAN backup to BlobFuse filesystem
If it is expected that the size of a backup file, or of files created at the same time, will exceed the size of the staging area, it is recommended to reduce the size of each file using options like MAXSETSIZE = 32G or, in the case of bigfile tablespaces, SECTION SIZE = 32G.
If the CPU allocation is sufficient, then multiple streams (for Enterprise Edition) and compression are good options to implement, as sketched below.
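A sketch of an RMAN backup incorporating these recommendations; the channel count and the 32G section size are illustrative only:

rman target / <<'EOF'
run {
  # two streams (multiple channels require Enterprise Edition)
  allocate channel c1 device type disk format '/orabackup/%U';
  allocate channel c2 device type disk format '/orabackup/%U';
  # compression reduces transfer volume and storage cost; SECTION SIZE
  # keeps individual files small enough for the staging area
  backup as compressed backupset
    section size 32g
    database plus archivelog;
}
EOF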
It is beneficial to have a directory structure based on, for example, the database unique name and the date, since a large number of files in a single directory can reduce the performance of OS tools like the ls command. A proper structure will also help to implement Lifecycle Management actions like deletion or moving files to the archive access tier.
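A sketch of such a structure, built by the backup script before RMAN is invoked (DB_UNIQUE_NAME is assumed to be set in the environment):

# e.g. /orabackup/PRODDB/20240101
BACKUP_DIR=/orabackup/${DB_UNIQUE_NAME}/$(date +%Y%m%d)
mkdir -p "${BACKUP_DIR}"
# the RMAN channels can then be pointed at it:
#   allocate channel c1 device type disk format '<BACKUP_DIR>/%U';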
Database restoration from a BlobFuse filesystem is no different from restoration from a general-purpose filesystem. Note, however, that channels allocated during the restore operation can appear to make no progress while a file is being transferred from Blob storage to the staging filesystem.
The crosscheck operation is not recommended and must be avoided. While RMAN reads only the header of a backupset file, BlobFuse will transfer the whole content of the file from Blob storage to the staging filesystem. As a result, a crosscheck operation can take a very long time and may result in additional costs, especially when reading from the cool access tier. If Lifecycle Management is enabled for the storage account and backup files are deleted after a defined period of time, the crosscheck operation can be substituted with a script which checks whether a file exists on the BlobFuse filesystem and, if it does not, runs the 'uncatalog' command.
Backup file management using RMAN 'delete obsolete' should be avoided for the same reason as the crosscheck operation, as it causes unnecessary transfers from blob storage. If Azure Lifecycle Management cannot be implemented for some reason, it is beneficial to use scripts which delete the files that are no longer required and remove their records from the RMAN catalog using the 'uncatalog' command, as sketched below.
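A hypothetical sketch of such a script; it assumes that all backup pieces live under /orabackup and that their paths are recorded in the RMAN repository:

#!/bin/bash
# list all catalogued backups, then uncatalog the pieces whose files have
# been removed from the BlobFuse filesystem (e.g. by Lifecycle Management)
rman target / <<'EOF' > /tmp/rman_backups.out
list backup;
EOF

grep -o '/orabackup/[^ ]*' /tmp/rman_backups.out | sort -u | while read -r piece; do
    [ -f "${piece}" ] || echo "change backuppiece '${piece}' uncatalog;"
done | rman target /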
As stated previously, BlobFuse filesystems intended to be used for RMAN backups should not be created on top of a container with immutable storage enabled.
Considerations for Oracle EXPDP and IMPDP using BlobFuse filesystem
Similar to RMAN, if there are space limitations in the staging area, usage of the FILESIZE option should be considered. However, it should be noted that in this case the --file-cache-timeout-in-seconds parameter should be set to a higher value to allow time for updating the file in which the initial part of a table started to be written. Without this modification, the file will be re-read from Azure Blob, which takes time, and delays like this will impact the overall performance of the export process.
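A hypothetical example; the APP schema, the ORABACKUP_DIR directory object (assumed to point at a path on the BlobFuse filesystem) and the 32G limit are placeholders:

# split the export into 32 GB dump files on the BlobFuse filesystem
expdp system schemas=APP directory=ORABACKUP_DIR \
      dumpfile=app_%U.dmp logfile=app_exp.log filesize=32g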
For IMPDP, if the total size of the dump files exceeds the staging area, it may be necessary to copy all dump files to a local filesystem first. During import, IMPDP does not release completed dump files and leaves them open for reading; as a result, BlobFuse cannot delete those files from the staging area and eventually consumes the whole staging filesystem.
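A sketch of this workaround, assuming a local staging path /u01/stage exposed through a hypothetical directory object LOCAL_STAGE_DIR:

# stage the dump files on a local filesystem first, then import from there
cp /orabackup/exports/app_*.dmp /u01/stage/
impdp system schemas=APP directory=LOCAL_STAGE_DIR \
      dumpfile=app_%U.dmp logfile=app_imp.log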
Avoiding unnecessary costs
To avoid unnecessary costs which could result from list operations, it is recommended to exclude the blobfuse filesystem type from the mlocate (updatedb) re-indexing operation. This can be done by adding blobfuse to the PRUNEFS directive in the /etc/updatedb.conf configuration file, for example:
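One way to do this in place (the existing PRUNEFS value varies between distributions):

# prepend blobfuse to the list of filesystem types skipped by updatedb
sudo sed -i 's/^PRUNEFS = "/PRUNEFS = "blobfuse /' /etc/updatedb.conf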
If you need some guided help, or would like this set up within your organisation, just reach out.