tifoso Absent Member.
Absent Member.
878 views

Database Layout - how to do it right...

As I read in AS-Installguide 4.5:


- Raid 5 not recommended --> So use Raid 01 / 10

  1. Volume : System stuff
  2. Volume : Database Stuff
  3. Volume : Redo Volume - Redo and Redo Archive on a Separate Partition
  4. Volume : Archive Volume

Does anyone have experience with splitting the databse volume also into separate raid arrays for data and index files for ultimate performance?

Regards, tifoso

Labels (3)
0 Likes
Reply
6 Replies
tifoso Absent Member.
Absent Member.

Re: Database Layout - how to do it right...

to get a little bit in detail. I really would ask if someone has knowledge about the

  • physical splitting (have seperate raid arrays for data and index files)
  • logical partitioning (do it on the os level and not have it in "hardware)
  • a mix of the two above

any experience is welcome...

best wishes from switzerland

0 Likes
Reply
Highlighted
jbur Absent Member.
Absent Member.

Re: Database Layout - how to do it right...

Recently I split my database up the following way, and it runs great.

vol1 - index (RAID10 5+5)

vol2 - data  (RAID10 5+5)

vol3 - redo  (RAID10 2+2)

vol4 - undo, temp, systemdata, systemindex (RAID1)

vol5 - offline archives (RAID5)

Each RAID group is it's own LUN.  I was using LVM for awhile, but I think the software striping hurts performance a bit so I stopped using it.

Undo and temp seem to be more active than I originally anticipated, so I may move them to RAID10 as well in the future.

-Joe

0 Likes
Reply
jvalinotti Absent Member.
Absent Member.

Re: Database Layout - how to do it right...

Having the OS perform the RAID exhibited problems in performance when used for the Database partition.  Especially when the feeds increased.  Long report generation delays, and long query results in general. 

Stick with the hardware RAID 1+0, you will be happy later that you did.

0 Likes
Reply
konrad.kaczkows1
New Member.

Re: Database Layout - how to do it right...

vol5 - offline archives (RAID5)

I deosn't agree:

- after reactivate - partition file stays always on that raid - never import into DATA tablespace

Second thing is archiving and reactivating process:

Archiving:

1. create new tablepace on vol5

2. dump data from EVENT tablespace to new tablespace on vol5

3. unlink tablespace

4. depending on archive type (zip, tar.gz, tgz, z.) compressing

5. creating control sum and update database

Reactivating (in this scenario tar.gz file )

1. ungzip

2. untar

3. check control sum

4. import to database (import meand link unpacked file !!! - it does not import)

5. tar previously unpacked directory

6. gzip

7. update sum

So,

if you are going to put it to slow disks and compress  - you have up to 5 slow operations taht get a lot of I/Os.

K.

K.
0 Likes
Reply
joao.farias Absent Member.
Absent Member.

Re: Database Layout - how to do it right...

You should look at this: https://protect724.arcsight.com/docs/DOC-1466 . I think it may help you.

0 Likes
Reply
Outstanding Contributor.. douglas.baker@h1 Outstanding Contributor..
Outstanding Contributor..

Re: Database Layout - how to do it right...

Joe provides valuable insight.

The UNDO tablespace is coupled with the REDO log files. Oracle guarantees that a query will have a stable view of the database during the life of that query. In order to guarantee that the query is consistent blocks that satify the query that are also being modifed are copied to the UNDO tablespace which allows Oracle to ensure that a pristine immutable copy is being queried.

It is fairly safe to assume that queries to older data are no longer subject to modification, however don't worry Oracle has you covered -> UNDO. Well, safe is a relative term as you are being successful at have enterprise wide synchronization of your clocks and all of your feeds are very nearly real time aren't they? Again, not to worry because Oracle simply does the protection....

Now, change that around and consider a default Active Channel which is hammering away at the last 2h ($NOW - 2h : $NOW) of data and you understand that there is a lot of modification occuring due to the persistent inbound stream of new events. Your query, at least initially, spans all of those most recent blocks being filled with new events until the query is resolved and the needed blocks of events are accessed. This could result in higher usage of the UNDO as Oracle protects the query consistency.

The TEMP tablespace is used to resolve data ordering related to query operations, note that these query references are generally SQL 'select'. Your process global area potrion of the Oracle memory cache will be used for 'small' ordering, i.e. how you specify that the data is to be 'sorted', however when the query becomes large enough TEMP comes to the rescue. Let's go back to that default Active Channel on you enterpise reaching back 2h and you are doing 500EPS, about 3.5M events in your query window which might get pruned down to what you filter defines (note that the default filter == ALL), and TEMP could become involved. In general, and such terminology could be dangerous when taken as a hard rule instead of a guideline, TEMP is lightly used.

So, when considering what Oracle file objects to seperate you may wish to consider moving UNDO and TEMP onto a seperate and shared volume. The 'good' news is that both are relatively small in size relative to your overall events data/index storage and allocations sizes under 100GB each are common so a storage volume (SAN LUN) of about 200GB should do nicely.

But don't take this as gospil, find out for yourself. For those of you running Microsoft based database servers the sysinternals suite includes the filemon utility which can provide you with deep insight into just what you actual arcsight database instance is doing in your environment. Run filemon and set the filter to pass only control or ctl (for you instance control files), REDO (well, for your REDO logs), and dbf for your actual database tablespace files. What you will see is Oracle managing its file system servicing your ESM. You will notice such things are REDO hits very often (if not quite always) also including UNDO hits, control constantly hit, of course the event data and index being hit. Also, not included in any of the above notice just how hard system data and index are getting hit, which I did not discuss above because the focus was on '1st' connfiguration tuning effort to spread out files on volumes to increase i/o.

Anyway you will have some 'proof' of what is actually happening. The next obvious step is to do some controlled 'testing', of course not on your live database, in which you trigger a specific cause, i.e. insert new events or active channel query, and watch the i/o that results.

Subtle, but already out of the bag, is that I would consider seperating system data and index also.

I have always preferred to seperate data from index but this isn't always practical as those two tablespaces grow at different rates depending on your event mixture and storage allocation is always the challenge. Actually all of the standard guidelines give you an excellent foundation as the point from which you start growing your deployment. When you data storage options begin to allow you to seperate things, i.e. you get additional disks or LUN's (mount points) using your testing will help you devide what to seperate first. When your storage gets large enough in diversity, i.e. you have many disks or LUN's, then seperating data from index will become less beneficial because i/o is becoming more spread out due to the larger number of volumes being used.

On the topic of data .vs. index just know that in general, there is that word again, Oracle will access indexes before it accesses data blocks, well that IS what indexing is all about, so when combined with your actual i/o monitoring (filemon) you are also beginning to recognize that there is a clear rhythm to what Oracle is doing in its write and read operations.

Enjoy!

A quick comment of the partition archiving.

I would suggest that you stay with RAID5.

A little note on reactivation. While it is correct that the data tablespace is not imported into the existing online data tablespace during archiving indexes are not preserved and thus during reactivation indexes are generated for the reactivated tablespace partition. This index partition is created within your online index tablespace (event index) so there is a 'cost' to your online index capacity during the live of the reactivated partititon. Because the data partition is joined to the online tablespace there is little cost to reactivating that data partition, other than of course the referenced decompression.

A little note of deactivating a reactivated partition. It is not recompressed, unless of course something has changed in the latest ESM. Also the created index space in your online index tablespace is simply dropped, i.e. it is a partition, and releases that used space into the free tablespace for the index.

While you can readily 'measure' performance differences between concantenated volumes (not raid0), striped volumes (raid0), mirrored volumes (raid1), mirror stripe (raid10) and raid5 during write know that raid5 is a raid0 volume with embedded parity and has high performce characteristics during reading when not in a reduced ( failed member) state. Also since there is so much going on with a partition archive, i.e. there is a lot of reading of the online partition occuring, unless you have very distributed and high performance storage, i.e. not the default but 'evolved' as we have discussed above, then the raid5 archive volume will not introduce significant delay to the actual partition archiving task just because it is raid5.

So, you get the large storage capacity and parity protection from raid5 with little performance impact on the archiving partition storage.

The partition arhive storage volume isn't intended to be an infinite storage volume for all your events but rather a staging area between online first order storage and tertiary offline (tape vault?) storage. You have to implement your own archive backup/restore policies.

So, that is how I see things! I have offered insight on how you can evaluate your instance so that you are operating based on your event and usage profile.

Always looking to update knowledge so if some corrections are needed we will all benefit!

Best regards,

Doug

0 Likes
Reply
The opinions expressed above are the personal opinions of the authors, not of Micro Focus. By using this site, you accept the Terms of Use and Rules of Participation. Certain versions of content ("Material") accessible here may contain branding from Hewlett-Packard Company (now HP Inc.) and Hewlett Packard Enterprise Company. As of September 1, 2017, the Material is now offered by Micro Focus, a separately owned and operated company. Any reference to the HP and Hewlett Packard Enterprise/HPE marks is historical in nature, and the HP and Hewlett Packard Enterprise/HPE marks are the property of their respective owners.