Understanding Sentinel disk usage
Everyone knows that Sentinel is a great product, but with a great product comes great disk usage (even with secondary storage configured).
This article is intended only as a guide to how the disk is used, with some recommendations that may assist in managing that usage. The sizes below are based on our single production server that monitors AD and eDirectory, so this is not an HA or large-scale enterprise deployment.
First off, let's discuss the basic allocation:
/etc/opt/novell/sentinel/ stores basic configuration files about the system and doesn't tend to be very big at all. (~800K)
/opt/novell/sentinel/ stores the binaries that run the system and is an average size for its content. (750M)
/var/opt/novell/sentinel/ stores the actual data, both event and database, and is where the majority of the disk disappears to.
/var/opt/novell/sentinel/tmp/ stores an exploded version of esec and jetty while running, plus memory dumps. This can grow significantly, especially if you have large Maps or utilise IP2Location. (~20G+)
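To see where the space is actually going on your own server, here is a quick sketch using du. The paths are the defaults discussed above; adjust them to your install.

```shell
# sentinel_usage: print per-directory disk usage for the standard
# Sentinel locations, skipping any that don't exist on this host.
sentinel_usage() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            du -sh "$dir" 2>/dev/null   # total size of this tree
        else
            echo "$dir: not present"
        fi
    done
}

sentinel_usage /etc/opt/novell/sentinel \
               /opt/novell/sentinel \
               /var/opt/novell/sentinel \
               /var/opt/novell/sentinel/tmp
```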
When we built our appliance, we took complete control over the disk and forcefully partitioned it in the manner outlined below. LVM was preferred to give us control over partition sizes, allowing us to increase or decrease the disk allocation as required.
The system LVM was split as such:

We only had 600G allocated for sdb, so the sentinel LVM was split as such:
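As a sketch of why LVM matters here: growing one of these volumes later is a two-step job. The volume group and logical volume names below are illustrative, and this assumes an ext filesystem (growing ext3/ext4 can be done online; shrinking requires the filesystem to be unmounted first).

```shell
# Illustrative names: VG "sentinel", LV "data"; your layout will differ.
# 1. Grow the logical volume by 50G (requires free extents in the VG).
lvextend -L +50G /dev/sentinel/data
# 2. Grow the ext filesystem to fill the new LV size.
resize2fs /dev/sentinel/data
```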
Let me explain these partitions in a little more detail and why they were split in this way.
3rdparty: This contains the PostgreSQL database, MongoDB database, Jetty servlet engine, ActiveMQ, and Elasticsearch.
The MongoDB holds the alert data and analytics data and will preallocate 3GB before any data is written. How much this database grows depends on your Alert retention configuration, but it can reach tens of gigabytes pretty quickly. There is a MongoDB compact utility that can reclaim disk space if you choose to decrease the retention time and need the disk returned (it won't do this automatically).
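As an illustration of that compact step, a hedged sketch follows. The database name ("analytics") and collection name ("alerts") are assumptions, not confirmed Sentinel names; check your own instance with show collections first, and be aware that compact blocks operations on the collection while it runs.

```shell
# Hypothetical names: the alert data may live in a different database or
# collection on your install; verify before running anything like this.
cd /opt/novell/sentinel/3rdparty/mongodb/bin
./mongo analytics --eval 'printjson(db.runCommand({ compact: "alerts" }))'
```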
The PostgreSQL database holds basically everything else. Its size depends on which Support Packs you have installed, what Report Tables (RDD) are configured and how, and the retention time of event data. We utilise Identity Tracking and found the "Recent Activity" report table (which incidentally only holds 14 days of data) bloated to over 200GB pretty quickly (Bug 992954). While the sentinel process runs an autovacuum every now and then, sometimes a FULL vacuum is required to regain disk.
However, be warned that the free disk required to perform this is at least the size of the database/table you intend to VACUUM. Our evt_rpt_35003882 table at ~218GB required 250GB of free disk and took a significant time to run, but the result was that the table shrank to about 135GB.
novell@xxx:~> export ESEC_HOME=/opt/novell/sentinel
novell@xxx:~> export JAVA_HOME=$ESEC_HOME/jre
novell@xxx:~> APP_HOME="/opt/novell/sentinel"
novell@xxx:~> export PATH="$APP_HOME/bin:$APP_HOME/bin/actions:$JAVA_HOME/bin:$PATH"
novell@xxx:~> cd /opt/novell/sentinel/bin/
novell@xxx:/opt/novell/sentinel/bin> . ./setenv.sh
novell@xxx:/opt/novell/sentinel/bin> PG_INSTALL=$ESEC_HOME/3rdparty/postgresql
novell@xxx:/opt/novell/sentinel/bin> LOG_DIR=$ESEC_DATA_HOME/log
novell@xxx:/opt/novell/sentinel/bin> cd $PG_INSTALL/bin/
novell@xxx:/opt/novell/sentinel/3rdparty/postgresql/bin> ./vacuumdb -v --full --username=dbauser --dbname=SIEM --table=evt_rpt_35003882
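Before committing to a FULL vacuum, it can help to see which tables are actually the big ones. A sketch using the same psql binaries and credentials as above (this is a standard PostgreSQL catalog query, not a Sentinel-specific tool):

```shell
# From /opt/novell/sentinel/3rdparty/postgresql/bin, list the ten largest
# tables in the SIEM database, including their indexes and TOAST data.
./psql --username=dbauser --dbname=SIEM -c "
  SELECT relname,
         pg_size_pretty(pg_total_relation_size(oid)) AS total_size
  FROM pg_class
  WHERE relkind = 'r'
  ORDER BY pg_total_relation_size(oid) DESC
  LIMIT 10;"
```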
You should also be aware that when you upgrade Sentinel, the databases will likely be upgraded too. Part of this process creates a copy of the database data directory, so that disk usage suddenly doubles! It can easily be cleaned up once you have validated the upgrade, though.
data: This contains the actual event data. By having this as a separate volume/mount point, the Disk Space Usage allocation applies only to this volume, so the percentage limits actually mean something. If you were to mount only at /opt/novell/sentinel/, the usage figure would include the databases and logs, so Sentinel would continually try to offload to secondary storage because it sees itself as xx% full.
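A small sketch of the difference: df reports a Use% per mount point, so with the data directory on its own volume the figure Sentinel reacts to covers event data only. The helper below falls back to / on hosts where the Sentinel path doesn't exist.

```shell
# usage_pct: print the Use% of the filesystem that holds the given path
# (falls back to / if the path does not exist on this host).
usage_pct() {
    target=$1
    [ -e "$target" ] || target=/
    df -P "$target" | awk 'NR==2 { print $5 }'
}

usage_pct /var/opt/novell/sentinel/data
```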
log: Pretty self explanatory.
More articles on my Website.