OO 10.70 - Unable to recover smoothly on java.lang.OutOfMemoryError: Java heap space Error
It has happened to me several times: when I hit a java.lang.OutOfMemoryError: Java heap space error, OO cannot recover even if I restart the service. It keeps crashing over and over until I restore the database.
This is not good. The MS support team does not know exactly what to do, but suggests restoring the database when everything else fails.
In the old OO 9.07, when I encountered this error, everything went back to normal operation after I restarted the service, but this is not the case for OO 10.70, so I need your advice. What should I do? Which table should I check? Which records should I delete? I am asking in these terms because this new version of OO is data driven.
THANKS IN ADVANCE!
The reason this happens is most likely that the heap space error is generated by a running flow. When central crashes with that error, it recovers and attempts to continue the execution where it left off, then crashes again.
The only exception to this rule is when there is some misleading data in the db for active executions and central keeps trying to resume something that doesn't exist.
In both cases the solution mentioned (restoring the database) works; however, there can be other ways to solve this. If you have a support account, open a ticket with them the next time this happens so they can find the cause of these crashes as well.
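To picture the crash loop described above, here is a toy Python model. It is only an illustration and not how OO is implemented: the function names, the dictionary fields and the memory figures are all made up for the example. The point is that a restart does not change the persisted execution state, so the same execution is resumed and the same crash follows.

```python
# Toy model only: none of these names or numbers come from OO itself.

def start_central(persisted_executions, heap_limit_mb):
    """Simulate one service start: try to resume every persisted execution."""
    for execution in persisted_executions:
        if execution["memory_needed_mb"] > heap_limit_mb:
            # Stands in for java.lang.OutOfMemoryError: Java heap space.
            return "crashed while resuming " + execution["id"]
    return "started normally"


def restart_until_stable(persisted_executions, heap_limit_mb, max_restarts=3):
    """Restart repeatedly; the persisted state is untouched between restarts."""
    outcomes = []
    for _ in range(max_restarts):
        outcome = start_central(persisted_executions, heap_limit_mb)
        outcomes.append(outcome)
        if outcome == "started normally":
            break
    return outcomes


executions = [
    {"id": "exec-1", "memory_needed_mb": 64},
    {"id": "exec-2", "memory_needed_mb": 4096},  # the flow that exhausts the heap
]

# Same persisted state, so every restart ends in the same crash.
loop = restart_until_stable(executions, heap_limit_mb=1024)

# Once the offending execution is no longer persisted, the loop is broken.
remaining = [e for e in executions if e["id"] != "exec-2"]
fixed = restart_until_stable(remaining, heap_limit_mb=1024)
```

In this model, restoring the database corresponds to going back to a state in which the offending execution is not persisted.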
Hope this helps,
The problem was that even the support engineers who assisted me on my trouble ticket did not know what to do. They asked me to truncate the table OO_RUNNING_EXECUTION_PLANS, but new problems came up after starting central (e.g. flows did not change their status to Completed and always stayed in Running status even though they had completed). So I really do not know where to find help, and I decided to post in this forum hoping it can shed some light.
Judging by the description, it is possible that some of the execution plans that were deleted also had flows in a running state. The running state refers to the following statuses: RUNNING, PAUSED, PENDING_PAUSE and PENDING_CANCEL.
If this is indeed the case, then support is the way to go in solving this issue, since it involves manually terminating and clearing all the inconsistent flow data from the database and workers. Further help may be required on top of that.
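As a rough illustration of the check described above, here is a small Python sketch that flags executions still in one of the four running-state statuses whose execution plan is gone. It assumes the relevant rows have already been exported by some means. The tuple layout (execution id, status, plan id), the function names and the sample data are hypothetical and do not reflect the real OO schema.

```python
# Statuses listed above as the "running state".
RUNNING_STATE_STATUSES = frozenset(
    {"RUNNING", "PAUSED", "PENDING_PAUSE", "PENDING_CANCEL"}
)


def is_running_state(status):
    """Return True if the status is one of the four running-state statuses."""
    return status.strip().upper() in RUNNING_STATE_STATUSES


def find_orphaned_executions(executions, existing_plan_ids):
    """Return ids of running-state executions whose plan no longer exists.

    executions: iterable of (execution_id, status, plan_id) tuples
                (hypothetical layout, not the real OO schema).
    existing_plan_ids: set of plan ids still present after the delete.
    """
    return [
        execution_id
        for execution_id, status, plan_id in executions
        if is_running_state(status) and plan_id not in existing_plan_ids
    ]


# Made-up sample data, for illustration only.
sample = [
    ("e1", "RUNNING", "p1"),
    ("e2", "COMPLETED", "p2"),
    ("e3", "PENDING_PAUSE", "p3"),
    ("e4", "PAUSED", "p4"),
]
orphans = find_orphaned_executions(sample, existing_plan_ids={"p3"})
print(orphans)
```

Anything such a check reports is only a candidate for cleanup; which records can actually be removed safely is a question for support.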
If I were to cascade delete an execution plan, which tables do I need to check for data existence and state?
It will be a BIG HELP...