Where did all of that free disk space go on my GroupWise server?


Background information on the problem:

We had an older NetWare server, with Traditional Volumes, that was running low on disk space. Its entire purpose in life is to be a GroupWise Post Office. So I created a new server and performed a NetWare hardware migration (NetWare 6.5 SP7 with NSS volumes). I did this work on 12/23/2008. Since then, I've seen the amount of free space plummet at an alarming rate:

12/23/08 - 300 Gig
12/28/08 - 188 Gig
12/29/08 - 103 Gig
01/05/09 - 47 Gig

At that rate, my free disk space wouldn’t last the week.
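For anyone who wants to check that estimate, here is a rough sketch of the arithmetic in Python. The readings are the four from the list above; the function name and the `window` parameter are my own inventions for illustration, and a straight-line projection is an assumption (the real burn rate was anything but steady).

```python
from datetime import date

# Free-space readings from the list above: (date, free space in Gig).
readings = [
    (date(2008, 12, 23), 300),
    (date(2008, 12, 28), 188),
    (date(2008, 12, 29), 103),
    (date(2009, 1, 5), 47),
]

def days_until_full(readings, window=2):
    """Project days of free space left using the last `window` readings.

    Returns (days_left, gig_burned_per_day). Assumes a constant burn rate,
    which is only a rough approximation.
    """
    start_date, start_free = readings[-window]
    end_date, end_free = readings[-1]
    elapsed_days = (end_date - start_date).days
    burn_per_day = (start_free - end_free) / elapsed_days
    return end_free / burn_per_day, burn_per_day

# Rate over the most recent interval only.
days_left, burn = days_until_full(readings)
print(f"Recent rate: {burn:.1f} Gig/day, about {days_left:.1f} days left")

# Rate averaged over the whole period since the migration.
days_left_all, burn_all = days_until_full(readings, window=len(readings))
print(f"Overall rate: {burn_all:.1f} Gig/day, about {days_left_all:.1f} days left")
```

Either way you slice it, the projection comes out to less than a week.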

There are no files to purge.

There is no antivirus product running on this server.

I have 370 Gig of data on this Post Office server, whereas it averages 30-40 Gig on other servers.

We have a 30-day retention policy and this Post Office is part of that policy. I reviewed the log from the latest GWCheck run that pruned it back to 30 days, and you can see it dumping emails older than 30 days.

We stage the backups to a different part of the server using DBCopy and therefore don’t need to worry about open files getting locked and don’t need to contend with the TSA modules for GroupWise.

Using TreeSize Professional, we were able to determine that there were no large (4 Gig) files anywhere. Everything looked just fine, but bloated.

Aha!

We did notice that one user's userxxx.db file was larger than the others. For comparison, the largest accounts hover around 27 MB. This user's userxxx.db file was at 127 MB. We also saw emails coming into this account from MAILER-DAEMON. Uh-oh.
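We found the oversized file by eyeballing TreeSize, but the same check is easy to script. The following is only a sketch of how one might do it: the function name is made up, the `user*.db` pattern is taken from the userxxx.db naming mentioned above, and the directory that holds those files is left as a parameter because it depends on your Post Office layout.

```python
from pathlib import Path

def largest_user_dbs(directory, pattern="user*.db", top=10):
    """Return the `top` largest files matching `pattern` as (name, size_mb),
    biggest first. Matching is case-sensitive on most non-Windows systems."""
    files = [p for p in Path(directory).glob(pattern) if p.is_file()]
    files.sort(key=lambda p: p.stat().st_size, reverse=True)
    return [(p.name, p.stat().st_size / (1024 * 1024)) for p in files[:top]]

def print_report(directory):
    """Print one line per file so outliers stand out at a glance."""
    for name, size_mb in largest_user_dbs(directory):
        print(f"{size_mb:10.1f} MB  {name}")
```

Call `print_report()` with the directory that holds the user databases. A file sitting at several times the size of the runner-up, like the 127 MB versus 27 MB case here, is worth a closer look.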

In viewing the log files for the day, this user had over 300 emails in one day. Further examination of the log files showed that the majority (99%) of these emails were from MAILER-DAEMON.

I broke into the user's account (we use LDAP for authentication, so I had to change her password in eDirectory before I could examine her account) to check things out. She had a rule to forward all email home (sound familiar?), except that the address was not accepted by the ISP, which rejected each email and returned the original message to the sender. Then the rule kicked in and auto-forwarded the returned message to her ISP as well, resulting in a permanent loop.

We deleted all of her email and have been struggling to get maintenance to occur long enough to clean up the Post Office without dramatically affecting performance. We've decided to schedule maintenance for Friday evening and just let it run through.

At the request of the engineer, we ran some stats on this user's account:

User stats: 41485 InBox, 43500 OutBox, 71 WasteBasket
Disk space management values: Size Limit - 0KB, Threshold - 0%
178484221 kbytes in use by user's mail

Put the commas in that last big number and you come up with 178 Gig (don't get caught up in the actual number...). Out of 250 Gig consumed, this is a huge chunk!
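If you do want to get caught up in the actual number, here is the conversion as a quick Python sketch. The two inputs come straight from the text above; whether "kbytes" in the stats means 1,000 or 1,024 bytes is an assumption either way, so both conversions are shown.

```python
KB_IN_USE = 178_484_221    # "kbytes in use by user's mail" from the stats above
TOTAL_CONSUMED_GIG = 250   # total consumed on the Post Office, as quoted above

# Decimal reading: 1 Gig = 1,000,000 KB (this is the "put the commas in" number).
decimal_gig = KB_IN_USE / 1_000_000

# Binary reading: 1 Gig = 1024 * 1024 KB.
binary_gig = KB_IN_USE / (1024 * 1024)

# Share of the consumed space taken by this one account (decimal reading).
share = decimal_gig / TOTAL_CONSUMED_GIG

print(f"Decimal: {decimal_gig:.1f} Gig")
print(f"Binary:  {binary_gig:.1f} Gig")
print(f"Share of consumed space: {share:.0%}")
```

Whichever unit you prefer, one account was holding roughly 70% of the space in use.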

So we're waiting for tonight's Expire/Reduce to clean up that Post Office. I will also be running the same stats on all users and may set a limit of 4 Gig for all users; this way, if an account does begin to run away, the system will lock it down at 4 Gig.
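Once the stats are collected for every user, checking them against the proposed 4 Gig limit could look something like the sketch below. It assumes each user's stats text contains a line in the same form as the "kbytes in use by user's mail" line quoted above; the user IDs and the second sample line are made up for illustration, and how you gather the text per user is up to you.

```python
import re

# Matches the usage line in the form quoted in the stats above.
USAGE_LINE = re.compile(r"(\d+)\s+kbytes in use by user's mail")

LIMIT_KB = 4 * 1024 * 1024  # 4 Gig expressed in KB (binary units assumed)

def kb_in_use(stats_text):
    """Return the KB figure from a user's stats text, or None if absent."""
    match = USAGE_LINE.search(stats_text)
    return int(match.group(1)) if match else None

def over_limit(stats_by_user, limit_kb=LIMIT_KB):
    """Return {user: kb} for every user whose usage exceeds limit_kb."""
    flagged = {}
    for user, text in stats_by_user.items():
        kb = kb_in_use(text)
        if kb is not None and kb > limit_kb:
            flagged[user] = kb
    return flagged

# Hypothetical user IDs. The first line is the one quoted above;
# the second is invented to represent a normal account.
sample = {
    "user_a": "178484221 kbytes in use by user's mail",
    "user_b": "27000 kbytes in use by user's mail",
}
print(over_limit(sample))
```

Run against the sample, only the runaway account is flagged.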

