Idea ID: 1641489

Multi-Instance steps should act like thread pools

Status : Delivered
over 3 years ago

DESCRIPTION

If you use a multi-instance step and set the "throttling" value to something other than 0, OO kicks off a number of branches/instances equal to the throttling value.  For purposes of this discussion, let's pretend we've set throttling = 10.

This means that 10 instances are created and kicked off.  OO then waits for that block of 10 instances to all complete before it kicks off another block of 10.  For workloads with differing durations, this means that any block of 10 instances will only be as fast as the slowest branch in that block of 10 - which is very inefficient.

What should be happening is that when one instance completes, another is spawned to take its place, so you get more work done concurrently.

BENEFITS

  • Increased efficiency/reduced flow runtime

DESIGN

Make it work like a thread pool, not a batch system.
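
To make the difference concrete, here is a minimal sketch that only simulates the two scheduling strategies. It is not OO's implementation; the function names and the sample durations are made up for illustration. `batch_makespan` models the current behaviour (each block must fully finish before the next starts), and `pool_makespan` models the requested behaviour (a new instance starts as soon as any running one finishes).

```python
import heapq


def batch_makespan(durations, throttle):
    """Total runtime when instances run in blocks of `throttle`
    and each block waits for its slowest instance."""
    total = 0
    for start in range(0, len(durations), throttle):
        total += max(durations[start:start + throttle])
    return total


def pool_makespan(durations, throttle):
    """Total runtime when a new instance is started as soon as
    any one of the `throttle` running instances completes."""
    finish_times = []  # min-heap of finish times of running instances
    now = 0
    for duration in durations:
        if len(finish_times) == throttle:
            # All slots busy: wait until the earliest instance finishes.
            now = heapq.heappop(finish_times)
        heapq.heappush(finish_times, now + duration)
    return max(finish_times, default=0)


if __name__ == "__main__":
    durations = [10, 1, 10, 1]  # hypothetical instance runtimes
    print("batch:", batch_makespan(durations, throttle=2))
    print("pool: ", pool_makespan(durations, throttle=2))
```

With these sample durations and a throttle of 2, the batch model takes 20 time units and the pool model takes 11, because the short instances no longer wait for the long one in their block.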

Labels:

OO - Platform

Comment
  •  - I, too, tried to work around it by:

    1. Creating a master "controller" flow that queried records, iterated through them and spawned subflows, and then re-ran another instance of itself when it ran out of records.  The problem here was the throttling mechanism didn't work.  When you spawn a flow and say "wait for it" with a max timer, it doesn't actually wait the "max time" before just moving on.  I could never get consistent results here.  I also had it loop through and query OO for status.  That caused high load on the system and generated 2 GB of logs a day.
    2. Creating a "Semaphore" content pack (to replace the one that's available in the marketplace), which actually worked really well.  No busy wait, you could define the concurrency, and it only added an overhead of about 0.2 s to the flow execution time (to acquire/release locks).  The issue with it was in resuming a flow.  I found that when OO was heavily loaded (200 flows in the inbuffer), sometimes OO would corrupt "something" in the DB and the flow could not be resumed - some of the data required to resume the flow was missing.  This was on SQL Server, and I suspect the transaction to store the pause data failed because of a deadlock and was subsequently rolled back.  Either way, I was left with a few flows that could never be resumed.

    In the end, my project opted to split the records that needed processing across 3 worker flows that each use multi-instance.  Basically, we ran out of money trying to architect/develop around this limitation, and we had to accept the performance hit that we are now getting from multiple flows + multi-instance.
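
    For readers unfamiliar with the semaphore approach in workaround 2, the idea can be sketched in plain Python. This is only an analogue, not the content pack itself and not OO code; `run_bounded`, `worker` and `max_concurrent` are hypothetical names. Each item waits on a semaphore for a free slot (blocking, not polling), so at most `max_concurrent` workers run at once and a new one starts as soon as a slot is released.

```python
import threading


def run_bounded(items, worker, max_concurrent):
    """Run worker(item) for every item with at most `max_concurrent`
    running at once. Returns (results, peak observed concurrency)."""
    slots = threading.BoundedSemaphore(max_concurrent)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}
    results = [None] * len(items)

    def run_one(index, item):
        try:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            results[index] = worker(item)
        finally:
            with lock:
                state["running"] -= 1
            slots.release()  # free the slot so the next item can start

    threads = []
    for index, item in enumerate(items):
        slots.acquire()  # blocks without polling until a slot is free
        thread = threading.Thread(target=run_one, args=(index, item))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    return results, state["peak"]


if __name__ == "__main__":
    squares, peak = run_bounded(list(range(20)), lambda x: x * x, 3)
    print(squares[:5], "peak concurrency:", peak)
```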
