Archiving speed

Hi,

I'm curious how many messages per second is normal on VMware (average hardware, plenty of RAM, CPU, SAN, etc.). I have never seen more than 1,2. This is with the latest Retain / GroupWise; the databases are PostgreSQL / MariaDB.

David

  • 0

    Good morning,

    The classic answer is: well, it depends. You say average hardware, but there are a ton of variables that come into play.

    CPU, number of cores, disk type, disk response time, etc.

    Also, independent of the underlying infrastructure supporting the hypervisor, what about the disk setup for Retain?

    Is it all on one drive, or several?

    As for the database, do you have that on a separate disk, or spread across several disks? When I say disks, I am referring to drives.

    Also the VMs themselves: the number of cores and the CPU allocated.

    And the SAN connection: are we assuming fiber, or 10 Gb/s, 40 Gb/s, or something else?

    This is somewhat of a borderline support question that can be checked against configs we have seen. There should be a little more dialog to help get you to where you want to be.

  • 0   in reply to 
    I was only curious about the numbers. I have multiple customers with totally different hardware and different kinds of setups, from an all-in-one single-server setup to everything separate (DB, Worker and Server), and the speed is about the same.

    David

  • 0 in reply to   

    Hi David,

    I am getting 1-2 messages per second with the PAM tool; a 65,150 user archive is 10 GB and is going to take about 6.5 hours. We have VMware with enough RAM on every guest that there is no swapping, and fiber channel storage that I have seen run at 800 megabytes per second with ATTO. The import is so slow that we could have 10 sites with multiple PAM sessions over a WAN and the bottleneck would NOT be the WAN. Something is wonky and very slow, and I do not know how to pinpoint it.

    It may have to do with the Retain system trying to dedupe every message as the message is presented. It should be that the batch is pumped to the server, and a task at night or on the weekend does the dedupe. Think of a staging table where the first stage is stored and NOT deduped so that the import can finish, and then a sub-process completes the work. In human terms, Retain has one stomach to process all the food, whereas we sophisticated humans have a stomach plus small and large intestines. LOL
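
    To make the staging idea concrete, here is a minimal sketch of a two-stage import. It is only an illustration of the proposal, not how Retain works; the names `ingest` and `dedupe_batch`, the in-memory list and dict, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib

def ingest(staging, raw_message):
    """Stage 1: store the message as-is, with no lookup against the archive."""
    staging.append(raw_message)

def dedupe_batch(staging, archive):
    """Stage 2 (run off-hours): move only unique messages into the archive."""
    added = 0
    for raw in staging:
        digest = hashlib.sha256(raw).hexdigest()  # hypothetical fingerprint
        if digest not in archive:
            archive[digest] = raw
            added += 1
    staging.clear()  # the staging area is empty once the batch is processed
    return added

staging, archive = [], {}
for msg in (b"hello", b"world", b"hello"):
    ingest(staging, msg)          # fast path during the import
added = dedupe_batch(staging, archive)  # slow path, deferred
```

    The trade-off is that the staging area temporarily holds duplicates, so it needs more disk until the second stage has run.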

  • 0   in reply to 

    Joe,

    Retain doesn't dedupe after archiving, but during the job. Every message gets a hash, and this hash is compared with the already stored messages during the archive job. Maybe this is what is slowing down the whole process.
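
    As a rough sketch of what "hash and compare during the job" means, assuming an in-memory index and SHA-256 (Retain's real hash function and storage layout are not described in this thread, and the class name `InlineDedupStore` is made up for the example):

```python
import hashlib

class InlineDedupStore:
    """Toy archive that checks every incoming message against stored hashes."""

    def __init__(self):
        self.blobs = {}    # digest -> message bytes
        self.lookups = 0   # one index lookup per archived message

    def archive(self, raw):
        """Store the message unless an identical one exists; return True if stored."""
        digest = hashlib.sha256(raw).hexdigest()
        self.lookups += 1
        if digest in self.blobs:
            return False   # duplicate: nothing new is written
        self.blobs[digest] = raw
        return True

store = InlineDedupStore()
results = [store.archive(m) for m in (b"hello", b"world", b"hello")]
```

    In this toy version the lookup is a cheap dictionary access. In a real system the same check would be a database query per message, which is where a per-message cost could plausibly come from.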

    I have a customer with 100 users, and the speed isn't much different from a customer with 2,000 users.

    David

  • 0 in reply to   

    It needs to be user selectable, at least on the PAM import.

     

    Is this on the roadmap or a future version?

     

    Does Micro Focus use Retain, i.e. do they dogfood the software? The system managers must hate Retain because of this issue. You can only support x users, because they want to archive x messages per day, it takes 1 second per message, and only after that can we start the backup. LOL

  • 0 in reply to 

    I have drawn PM's attention to your post.

    Please allow him time to respond.

     

    Thanks

    Tarik

  • 0   in reply to 

    Hmm,

    I have large environments using Retain, e.g. universities. Some of them archive every night, which is pretty fast. Some of them archive at the weekend, in which case there are many thousands of items per run. Even if there are more than 100,000 items to archive, the job finishes within a few hours.

    If you have to archive local GroupWise archives, it is (a lot) slower. But in this case it helps to run more than one RetainWorker. I did this several times, especially when Retain was introduced to replace the good old client-based GroupWise archives.


    Use "Verified Answers" if your problem/issue has been solved!

  • 0 in reply to   

    Hi Diethmar,

     

    We archive about 15,000 mail items per night and it takes about 300 minutes. So that is about 0.83 messages per second. I am told it is normal for it to be that slow. My PAM imports from GroupWise 18.01 archives are also that slow. I have only been using Retain with GroupWise since the first version of GW18 and Retain 4.7. We have everything running on SUSE Linux per the Micro Focus documentation.
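
    For anyone who wants to compare their own numbers, the arithmetic is just items divided by elapsed seconds. A small sketch (the function name is made up for the example):

```python
def messages_per_second(items, minutes):
    """Average archiving throughput for a job of the given length."""
    return items / (minutes * 60)

rate = messages_per_second(15_000, 300)   # 15,000 items in 300 minutes
seconds_per_message = 1 / rate

print(f"{rate:.2f} msg/s")                # 0.83 msg/s
print(f"{seconds_per_message:.1f} s per message")  # 1.2 s per message
```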

    Are you running on Windows and all on a single box? Did you find settings you had to tweak?

    Thanks,

    Joe

     

  • 0   in reply to 

    Hi Joe,

    For example, one site archives about 30,000 items each weekend. Retain needs around 5 hours for this job. They have three post offices and about 4000 users. They have two classes of users and a separate job for each class. The numbers above are for the heavier job. Unfortunately they have only one RetainWorker (not the way I usually set it up). I think the Retain storage in the background is a little more than 1 TB (Retain version 4.9 for two weeks now; 4.7 before that because of the operating system).

    The other university has more than 10 post offices, and they use 4 or 5 RetainWorkers to run jobs each night. They archive every night (I cannot access the numbers right now; maybe I will return with more information later on). As far as I remember, none of the jobs needs more than one hour. In the background Retain occupies more than 1.7 TB (Retain version 4.9 for one week now; 4.7 before that because of the operating system).

    I did not adjust any options or settings; I used the default Retain values. If there is more than one post office, then I experiment with more RetainWorkers.



  • 0 in reply to   

    Hi,

    I'm new to this community, and my apologies if it's not the done thing to revive and/or hijack an old thread, but my questions are closely related.

    I've inherited an undocumented Retain 4.9.0.1 setup. Every night some 800 messages from 19 mailboxes, some 200 MB in size, are added to Retain and are findable the next day. So the setup works. However, this process takes forever: 4 to 5 hours on reasonable hardware. That is an all-flash (SATA) array and a 6-core VM with 16 GB RAM on an idle vSphere server.

    Several things I would like to disclose and/or point out:

    • This is an old installation which has been upgraded for years.
    • The filesystem is ext3 and not the recommended XFS, but it has plenty of free space and inodes.
    • The network is at only about one percent load, and it can handle plenty when tested with iperf.
    • The disk usage on the Retain VM is insane. To process these few messages, the delta of the VMDK is 14-16 GB. That's right, not megabytes but gigabytes, to store 200 MB of data.
    • From the logs it seems that all messages in GroupWise are processed, not just the new ones.

    Backing up the VM via snapshots is a recommended procedure in the installation guide, page 41. However, these insane deltas result in rather expensive offsite backups.
    I'm coming from Solaris, so I really miss all the diagnostic tools (DTrace) to troubleshoot this system. I don't want to install any trace tools which have a known performance impact. All I can see is the amount of writes to the filesystems and the IOPS and delta that VMware reports. I cannot see which process issues these writes, or to which files. Is it MySQL? Is it Lucene? Is it...?
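
    One approach that needs no extra tooling is to sample the per-process I/O counters that Linux exposes in `/proc/<pid>/io` and compare two snapshots. The sketch below assumes the kernel has task I/O accounting enabled and that the script runs as root (otherwise other users' processes are skipped); the function names are made up for the example. It attributes writes to processes only, not to individual files.

```python
import os
import time

def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters

def snapshot_write_bytes():
    """Return {pid: (command name, write_bytes)} for every readable process."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/io") as f:
                io = parse_proc_io(f.read())
            with open(f"/proc/{entry}/comm") as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited, or reading it is not permitted
        result[int(entry)] = (name, io.get("write_bytes", 0))
    return result

def top_writers(before, after, limit=5):
    """Rank processes present in both snapshots by bytes written in between."""
    deltas = []
    for pid, (name, written) in after.items():
        if pid in before:
            deltas.append((written - before[pid][1], pid, name))
    return sorted(deltas, reverse=True)[:limit]

def report(interval_seconds=60):
    """Take two snapshots and print the heaviest writers in that interval."""
    first = snapshot_write_bytes()
    time.sleep(interval_seconds)
    for delta, pid, name in top_writers(first, snapshot_write_bytes()):
        print(f"{delta:>14} bytes  pid={pid}  {name}")
```

    Calling `report()` while the nightly job is running should at least show whether the database process or another process is responsible for most of the writes.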

    To sum it up:

    • Is it normal that Retain rewrites 70-80x as much of its own storage as the amount of data it actually stores?
    • Is it by design that Retain needs to read the entire GroupWise PO instead of just the new messages?
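
    For reference, this is how the 70-80x figure and the per-message time follow from the numbers above (14-16 GB of VMDK delta for 200 MB of mail, 800 messages in 4 to 5 hours). The function names are only for illustration, and 1 GB is taken as 1000 MB.

```python
def write_amplification(delta_gb, payload_mb):
    """Ratio of bytes changed on disk to bytes of mail actually archived."""
    return (delta_gb * 1000) / payload_mb

def seconds_per_message(hours, messages):
    """Average wall-clock time spent per archived message."""
    return hours * 3600 / messages

low = write_amplification(14, 200)    # 70.0
high = write_amplification(16, 200)   # 80.0
fast = seconds_per_message(4, 800)    # 18.0
slow = seconds_per_message(5, 800)    # 22.5

print(f"write amplification: {low:.0f}x to {high:.0f}x")
print(f"time per message: {fast:.1f} s to {slow:.1f} s")
```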

    Any suggestions for improving this would be most appreciated.

    Thank you all for your time and interest,

    Benny