ControlPoint disk layout
This brief knowledge document explains the disk constraints that apply to ControlPoint and proposes layout and distribution recommendations. It concerns the disk-intensive database files of the MetaStore and IDOL.
Here is a simplified schema of the components involved in the ingestion of files by ControlPoint.
General considerations on Disk Storage
It is important to understand that the disk layout has a strong impact on the performance of the overall system. Disk accesses go through multiple layers, and each of these layers has its own limits.
- Files and read/write streams: applications read and write to files, even for a database. The file system manager limits how much data can be written to a single file at once. Once this limit is reached, splitting the load across multiple files increases the overall read/write speed.
- File system manager: between the files and the virtual disk volumes exposed by the storage system, there is a file system manager middleware. This low-level software organizes data in blocks, caches data and sends it to the disk layer (storage adapter). NTFS is the one used by the Windows operating system. It has its own limits too: it can only read and write a certain number of files in parallel. When this limit is reached, splitting files across multiple volumes improves the performance of the system.
- Hardware disks: a disk has physical limits and can only read and write a certain amount of data per second. Parallelizing input and output across multiple disks helps increase performance. Storage systems can create virtual volumes made of multiple disks and distribute the blocks across all of them. This can be a RAID 5 or, even better for a database, a RAID 0+1.
- Storage controllers, networks and adapters: these sit between the file system manager and the physical disks. Controllers get the input/output requests from the server via the storage network and pass them to the disks. The network layout and storage box configuration should distribute connections across multiple controllers, network switches and server storage network adapters.
Operating System, executables and log files
On each server of the architecture, there is an operating system, Windows. It is commonly installed on the C: drive, which is accessed by all Windows applications and functions.
ControlPoint comes with its program binaries. The default installation path is on the C: drive, in Program Files\Micro Focus\ControlPoint. The path is specified during the installation of the components, for example when creating the IDOL installation package. The binaries can be installed on a separate disk, although the performance impact is minor. A separate disk is mainly useful to isolate the ControlPoint (CP) log files: if the log level is set to a high value, the writes can be intensive.
Note that each IDOL component has a parameter that defines the log directory in its configuration file (the .cfg file in the component's folder). File System Connector example:
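As a minimal sketch of how this setting could be checked, the Python snippet below reads a log directory from INI-style configuration text and reports whether it sits on the same drive as the binaries. The section name `Logging`, the key `LogDirectory`, the sample paths and both helper functions are assumptions made for illustration; verify the actual parameter name against the configuration reference of the component concerned.

```python
import configparser
import ntpath  # Windows path rules, usable on any platform


def read_log_directory(cfg_text, section="Logging", key="LogDirectory"):
    """Return the log directory found in INI-style config text, or None.

    The section and key names are assumptions, not confirmed by this document.
    """
    parser = configparser.ConfigParser(
        strict=False,                       # tolerate repeated keys
        interpolation=None,                 # treat '%' literally
        delimiters=("=",),                  # the ':' of a drive letter is not a separator
        comment_prefixes=("#", ";", "//"),  # full-line comments only
        allow_no_value=True,
    )
    parser.read_string(cfg_text)
    return parser.get(section, key, fallback=None)


def logs_share_install_drive(install_dir, log_dir):
    """True when log_dir resolves to the same drive as install_dir."""
    log_drive = ntpath.splitdrive(log_dir)[0]
    if not log_drive:
        # A path without a drive is assumed to resolve under the install folder.
        return True
    return log_drive.upper() == ntpath.splitdrive(install_dir)[0].upper()


# Hypothetical configuration fragment and install path, for illustration only.
SAMPLE_CFG = r"""
[Logging]
LogDirectory=L:\ControlPoint\Logs\FileSystemConnector
"""
INSTALL_DIR = r"C:\Program Files\Micro Focus\ControlPoint"

if __name__ == "__main__":
    log_dir = read_log_directory(SAMPLE_CFG)
    print("Log directory:", log_dir)
    print("Shares drive with binaries:", logs_share_install_drive(INSTALL_DIR, log_dir))
```

A relative log directory is treated as living under the install folder, which is the case where intensive logging competes with the operating system and binaries for the same disk.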
SQL Server database
SQL Server has files containing the database data and transaction logs.
The first recommendation is not to put these files on the same disk as the operating system (C:) or the SQL Server application binaries. They must be on separate volumes in order to get the full bandwidth of the disks.
The ControlPoint installation process requires five (5) databases to be created in SQL Server. They do not have equal requirements.
- Metastore Tags
One recommendation is to isolate the MetaStore database from the other ones. This database is heavily accessed during the ingestion process, especially during full scans. It can be split further, with the Data, Index and Text files on one side and the Logs on the other. Finally, always use the SQL Server 2016+ partition feature to split the MetaStore database into multiple files (the installation tool automatically creates one file per CPU core available on the SQL Server machine).
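To make the "one file per CPU core" idea concrete, here is a small Python sketch that plans where such files could go when several data volumes are available. The round-robin distribution, the database name `MetaStore`, the file naming pattern and the volume paths are all assumptions for illustration; the installation tool decides the real names and paths.

```python
import ntpath  # Windows path rules, usable on any platform


def plan_metastore_files(cpu_cores, data_volumes, log_volume, db_name="MetaStore"):
    """Return (data_files, log_file) for a hypothetical MetaStore layout.

    One data file is planned per CPU core, spread round-robin over the data
    volumes; the transaction log goes to a volume of its own.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if not data_volumes:
        raise ValueError("at least one data volume is required")

    log_drive = ntpath.splitdrive(log_volume)[0].upper()
    data_drives = {ntpath.splitdrive(v)[0].upper() for v in data_volumes}
    if log_drive in data_drives:
        raise ValueError("the log volume must not share a drive with data volumes")

    data_files = [
        ntpath.join(data_volumes[i % len(data_volumes)], f"{db_name}_{i + 1:02d}.ndf")
        for i in range(cpu_cores)
    ]
    log_file = ntpath.join(log_volume, f"{db_name}_log.ldf")
    return data_files, log_file


if __name__ == "__main__":
    # Hypothetical machine: 4 cores, two data volumes, one log volume.
    files, log = plan_metastore_files(4, ["E:\\SQLData", "F:\\SQLData"], "G:\\SQLLogs")
    for path in files + [log]:
        print(path)
```

With two data volumes and four cores, the files alternate between the volumes, so each volume serves half of the data files while the log writes stay on their own disk.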
Example of SQL Server disk layout:
File paths for databases are specified with the Database Configuration tool from CP install:
IDOL Content Database
IDOL Content is the IDOL index database that stores the index and makes it available.
The first recommendation is not to put the Content database files on the same disk as the operating system (C:) or the IDOL Content application binaries. They must be on separate volumes in order to get the full bandwidth of the disks.
Care should also be taken to separate the Content databases onto their own disks: each Content on a separate volume. This helps distribute the load across multiple file systems.
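The two recommendations above can be checked mechanically before deployment. The following Python sketch flags Content paths that share a drive with each other or with the operating system. The function name, the Content names and the paths are hypothetical, and a drive letter is used as a stand-in for a volume, which is an assumption that does not hold when volumes are attached as mount points.

```python
import ntpath  # Windows path rules, usable on any platform
from collections import defaultdict


def find_layout_problems(content_paths, reserved_drives=("C:",)):
    """Return a list of human-readable problems for a Content layout.

    content_paths maps a Content name to its database path.
    reserved_drives lists drives holding the OS or application binaries.
    """
    names_by_drive = defaultdict(list)
    for name, path in content_paths.items():
        drive = ntpath.splitdrive(path)[0].upper()
        names_by_drive[drive].append(name)

    reserved = {d.upper() for d in reserved_drives}
    problems = []
    for drive, names in sorted(names_by_drive.items()):
        joined = ", ".join(sorted(names))
        if drive in reserved:
            problems.append(f"{joined} on reserved drive {drive}")
        if len(names) > 1:
            problems.append(f"{joined} share drive {drive}")
    return problems


if __name__ == "__main__":
    # Hypothetical layout: two Content databases end up on the same drive.
    layout = {
        "Content1": r"E:\IDOL\Content1",
        "Content2": r"E:\IDOL\Content2",
        "Content3": r"F:\IDOL\Content3",
    }
    for problem in find_layout_problems(layout):
        print(problem)
```

An empty result means every Content database has a drive to itself and none of them sits on a reserved drive.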
Example of IDOL Content disk layout:
IDOL Content disk path is specified when creating the IDOL installation package with the IDOL Deploy Tool:
Click on the first “Config...” button...