GPF in GWPOA with the latest 24.3 build but very old GroupWise data (oldest user creation date 2002)

Hello there to the dying breed of GroupWise admins!

The post office has run since 2023 with an 18.4.2 build.

Some GPFs happened in the past; most of the time we rebooted the VM and the post office kept running for months. But on Monday, the 26th of August, the GWPOA of two of the four large post offices started to GPF continuously on startup of the GroupWise service.

After 3 days of nearly continuous downtime (many thanks to the sleepy support engineers of the GroupWise front line), a back-line engineer from Rotterdam renamed the NGWDFR.DB, which tracks messages sent with the delay delivery send option, and one of the two large post offices became stable again.

The other one, with 470 users, more than 1.2 million files and more than 2 terabytes of disk space used for the /grpwise/po data files, kept GPFing every few minutes.

I was able to resolve the GPF (General Protection Fault) on my own, during a 24-hour TeamViewer session from the hotel while on vacation, by running a standalone GWCHECK with all options while the GroupWise service was stopped on the host. It took about 12 hours.

But I can reproduce the GPF on a test host, with the data copied over using dbcopy, as soon as the GWCHECK contents check with fix problems starts to rebuild the NGWDFR.DB defer database.

My assumption is that there must be "dangerous messages" in the post office, with a delivery date still in the future, that corrupt the defer database as soon as the GWCHECK contents check with fix problems finds them and populates the defer database with the information from those weird messages.

We have a lot of users who are really fond of the delay delivery send option, and they complain about the C05D error when they try to send a message with delay delivery active while the NGWDFR.DB is not there.

Any ideas on how to get out of this s(h)ituation?

My only desperate last resort would be to create a new post office and move every user from the corrupted post office to the new one, but the effort is huge, since you have to recheck continuously to see with which user move the error travels from one post office to the other.

So far, so good (or bad), Stefano


    Hi all, here is the standalone GWCHECK log with ngwdfr.db renamed to ngwdfr.dba:

    STRUCTURAL VERIFICATION of system databases
    STRUCTURAL VERIFICATION of database ngwguard.db
    - Database is structurally consistent
    Reading Guardian Database store catalog info
    Processing Post Office = PO02, Store Catalog Path = /grpwise/po02prod
    STRUCTURAL VERIFICATION of database /grpwise/po02prod/ofmsg/ngwdfr.db
    - Attempting to correct structural problem in database
    Problem 39- Unknown file ngwdfr.dba - 77824 bytes, 09/30/24 10:15
    NOTE- the timestamp on this file is recent, and may reflect a
    temporary mismatch between the file system and the databases.
    - File is too recent- will not be deleted
    Problem 39- Unknown file ngwdfr.dbb - 217088 bytes, 09/24/24 17:35
    NOTE- the timestamp on this file is recent, and may reflect a
    temporary mismatch between the file system and the databases.
    - File is too recent- will not be deleted
    Problem 39- Unknown file ngwdfr.dbc - 73728 bytes, 09/24/24 10:38
    NOTE- the timestamp on this file is recent, and may reflect a
    temporary mismatch between the file system and the databases.
    - File is too recent- will not be deleted
    Problem 39- Unknown file ngwdfr.dbd - 77824 bytes, 09/27/24 13:56
    NOTE- the timestamp on this file is recent, and may reflect a
    temporary mismatch between the file system and the databases.
    - File is too recent- will not be deleted
    Problem 39- Unknown file ngwdfr.dbe - 217088 bytes, 09/24/24 17:35
    NOTE- the timestamp on this file is recent, and may reflect a
    temporary mismatch between the file system and the databases.
    - File is too recent- will not be deleted
    Problem 39- Unknown file ngwdfr.dbf - 221184 bytes, 09/30/24 00:41
    NOTE- the timestamp on this file is recent, and may reflect a
    temporary mismatch between the file system and the databases.
    - File is too recent- will not be deleted
    Error 0x8209 opening /grpwise/po02prod/ofmsg/ngwdfr.db
    - Beginning rebuild for database ngwdfr.db
    Error 26- DbRebuild error STORE_FILE_NOT_FOUND (0xC05D)
    - Store will be dropped from guardian catalog so it can be re-created
    *WARNING*: no records were recovered from database during
    rebuild process. Try to restore an earlier backup of the
    file, or else run CONTENTS check to repair system folders.
    Validating file references in database:
    Error 18- MESSAGE database open error INVALID_STORE_NUM (0xC067) on n
    Suggestion- Try physical check/rebuild of database
    PROCESSING COMPLETED- total processing time: 0:00:00

    *********************************************************************
    Uncorrectable conditions encountered:
    CODE DESCRIPTION COUNT
    ---- -------------------------------------------------- -----
    18 Message database open errors....................... 1
    26 Errors trying to do structural database rebuild.... 1
    Correctable conditions encountered:
    CODE DESCRIPTION COUNT
    ---- -------------------------------------------------- -----
    39 Unrecognized or invalid files in mail directories.. 6
    *********************************************************************
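
To compare a log like the one above between runs (for example production versus the test host), here is a minimal Python sketch that tallies the numbered "Problem NN-" and "Error NN-" lines. It is not a GroupWise tool; the function name `tally_gwcheck` and the shortened sample text are only illustrative, and the line format is assumed from the log above. Lines such as "Error 0x8209 opening ..." carry no numbered code and are deliberately not counted.

```python
import re
from collections import Counter

# Assumed line format, taken from the GWCHECK log above:
#   "Problem 39- Unknown file ..."  or  "Error 26- DbRebuild error ..."
CODE_LINE = re.compile(r"^\s*(Problem|Error)\s+(\d+)-")


def tally_gwcheck(log_text: str) -> Counter:
    """Count numbered Problem/Error lines, keyed by (kind, code)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CODE_LINE.match(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Shortened excerpt of the log above, used only as sample input.
SAMPLE = """\
Problem 39- Unknown file ngwdfr.dba - 77824 bytes, 09/30/24 10:15
Problem 39- Unknown file ngwdfr.dbb - 217088 bytes, 09/24/24 17:35
Error 0x8209 opening /grpwise/po02prod/ofmsg/ngwdfr.db
Error 26- DbRebuild error STORE_FILE_NOT_FOUND (0xC05D)
Error 18- MESSAGE database open error INVALID_STORE_NUM (0xC067) on n
"""

if __name__ == "__main__":
    for (kind, code), count in sorted(tally_gwcheck(SAMPLE).items()):
        print(f"{kind} {code}: {count}")
```

Feeding it the full log file text instead of `SAMPLE` should give per-line counts that can be checked against the summary table GWCHECK prints at the end.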
