Parallel processing of subscriber channel events

Idea ID 2821883

IDM processes events on the subscriber channel one at a time.

For each event on the subscriber channel, IDM runs through all policies on the channel and, when it is done, continues to the next event.

The following image describes the subscriber channel processing:

[Image: Single processing mode]

In environments where millions of events reach the subscriber channel, processing takes a very long time, sometimes weeks.

I want to propose a solution that makes the subscriber channel faster.

The channel would be faster if it were split into multiple identical subscriber channels running the same logic. This feature would allow us to process multiple events in parallel: one event per channel (thread or process).

The following image describes the proposed solution:

[Image: Multiprocessing mode]

I know that it doesn't fit all needs, but one could enable this solution by configuring the number of simultaneous processing channels (the default should be 1).
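The proposal above can be sketched in Python purely as an illustration of the dispatch model. This is not how IDM is implemented; `process_events`, `handler`, and `channels` are hypothetical names, and round-robin dealing is an assumption about how events would be handed to the channels.

```python
import queue
import threading

_STOP = object()  # sentinel telling a channel that no more events will arrive


def process_events(events, handler, channels=1):
    """Run handler over events using `channels` worker threads.

    Events are dealt round-robin to per-channel queues, and each channel
    works through its own queue strictly in order. With channels=1 this
    is the same as processing one event at a time.
    """
    if channels < 1:
        raise ValueError("channels must be >= 1")
    queues = [queue.Queue() for _ in range(channels)]
    results = []
    lock = threading.Lock()

    def worker(q):
        while True:
            event = q.get()
            if event is _STOP:
                return
            outcome = handler(event)
            with lock:  # results is shared between channels
                results.append(outcome)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    for i, event in enumerate(events):
        queues[i % channels].put(event)
    for q in queues:
        q.put(_STOP)
    for t in threads:
        t.join()
    return results
```

With `channels=1` the results come back in event order. With more channels every event is still handled, but the order in which results complete is no longer guaranteed.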

4 Comments
Honored Contributor.

For some driver types, there is a fan-out driver that works in a similar manner. There is also now a priority sync channel in the shims that allows specific event types to skip the queue and be processed immediately.

The primary reason NetIQ hasn't implemented what you are describing is that IDM is an event-driven system, and in some cases events must be processed in order.

For example, suppose an account is created, then has additional attributes added, and then has its universal password set. In your model, these three events would be processed in parallel, and the modify and password sync events would get discarded because of a missing association. All three events in a typical creation are milliseconds apart.

Many other IDM systems are reconciliation-based, so they are able to process all records and then send all of the events as a snapshot. Because it is a snapshot, not a live look, they are able to split the work into multiple threads.

We have implemented something similar, but the way it works is that you have multiple connectors to the same system and a logical divide in scoping. For example, you can create one driver for usernames that start with A-M and another for N-Z. This works if you do not have interdependencies between those user objects, and it allows you to specify which driver handles which events. While all drivers will see the initial event, one of the first policies (subscriber event transform) will evaluate whether to veto the event or process it. As a tip, to avoid keeping multiple copies of the same driver in sync, you could simply enable it on multiple servers in the tree; the GCVs and configuration parameters are server-specific, so each server in your tree could be servicing a different subset. Packages are also a great way of ensuring that the policies of multiple drivers don't get out of sync.
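The veto-or-process decision described above can be sketched in Python. This only illustrates the scoping logic, not actual policy code; `DRIVER_SCOPES`, the server names, `in_scope`, and `decide` are hypothetical, with the dictionary standing in for server-specific GCVs.

```python
# Hypothetical per-server letter ranges, standing in for server-specific GCVs.
DRIVER_SCOPES = {
    "server1": ("A", "M"),
    "server2": ("N", "Z"),
}


def in_scope(username, first, last):
    """Return True if the username's first letter is within first..last (inclusive)."""
    if not username:
        return False
    initial = username[0].upper()
    return first.upper() <= initial <= last.upper()


def decide(server, username):
    """Return 'process' if this server's driver owns the user, otherwise 'veto'."""
    first, last = DRIVER_SCOPES[server]
    return "process" if in_scope(username, first, last) else "veto"
```

For instance, `decide("server1", "alice")` returns `"process"` while `decide("server2", "alice")` returns `"veto"`. In this sketch a username that does not start with A-Z is vetoed by every driver, so a real split would need a catch-all range.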

Super Contributor.

I understand what you are saying: IDM is a trigger-based system, and events usually need to be processed in order. But I want to be able to decide whether to allow parallel processing, and to handle the outcomes in the policies myself.

I understand what you did by creating multiple drivers that share the same code, but it is not simple enough and it is hard to maintain (if I want 100 processes, I need to think of a division function that divides the events among 100 drivers).
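One possible division function for an arbitrary number of drivers is a stable hash of the object key taken modulo the driver count. The Python sketch below is an assumption for illustration, not something from the thread; `driver_index` and `should_process` are hypothetical names, and how such a function would be expressed in policy is not covered here.

```python
import zlib


def driver_index(object_key, driver_count):
    """Map an object key (for example a DN) to a driver number in 0..driver_count-1.

    zlib.crc32 is used because it gives the same value on every run and
    every machine, unlike Python's built-in hash() for strings.
    """
    if driver_count < 1:
        raise ValueError("driver_count must be >= 1")
    normalized = object_key.strip().lower()
    return zlib.crc32(normalized.encode("utf-8")) % driver_count


def should_process(object_key, my_index, driver_count):
    """Return True only for the single driver that owns this object."""
    return driver_index(object_key, driver_count) == my_index
```

Because the result depends only on the object key, every event for a given object lands on the same driver, so the order of events for that object is preserved while different objects are handled in parallel.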

Knowledge Partner

If you look at the parallelization efforts they made in RRSD in 4.8, you can see a couple of different approaches.

They split the processing into multiple threads for different types of operations (roles vs. resources vs. users vs. dynamic groups, which I think were a separate thread in the past as well).

Then they support adding an XML attribute, disjoint-set, to the operation node, which allows it to use a distinct thread. You get one thread per disjoint-set value, but you have to decide how to split it. The obvious example is by OU. Failing that, you can use A-Z to get 26 potential threads, but that is not very balanced with real names. And so on.
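To judge how balanced a candidate split is before committing to it, the bucket sizes can be counted over a sample of names. The Python sketch below is an illustration only; `bucket_sizes`, `by_initial`, `by_hash`, and `imbalance` are hypothetical helpers, and any names passed in would be your own sample data.

```python
import zlib
from collections import Counter


def bucket_sizes(names, key_fn):
    """Count how many names fall into each bucket chosen by key_fn."""
    return Counter(key_fn(name) for name in names)


def by_initial(name):
    """Bucket by first letter, as in an A-Z split."""
    return name[0].upper()


def by_hash(name, buckets=26):
    """Bucket by a stable hash of the name modulo the bucket count."""
    return zlib.crc32(name.lower().encode("utf-8")) % buckets


def imbalance(sizes, buckets):
    """Largest bucket divided by the ideal even share (1.0 means perfectly even)."""
    total = sum(sizes.values())
    return max(sizes.values()) / (total / buckets)
```

Calling `imbalance(bucket_sizes(names, by_initial), 26)` and the same with `by_hash` on a real export of names gives a rough comparison of the two splits.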

 

Knowledge Partner

I've built something along those lines, running 12 drivers in parallel. You just have to be very careful to correctly scope the subscriber event transform so that one object is processed by exactly one driver.
