jwilleke Trusted Contributor.

Tuning on large DIB size

eDirectory 9.03 (On the way to 9.1.1)
RHEL 6.10 (on the way to RHEL 7.5)


Current DIB:
DIBFileSize: 104365670400 Bytes (104.3656704 GB)
DIBRflmFileSize: 49290947 Bytes
DIBRollBackFileSize: 104857600 Bytes
DIBStreamFileSize: 34428453 Bytes
TotalDIBSize: 104554247400 Bytes


Mostly reads, along with a lot of binds and bind-related attribute updates.
63,000 binds/hour but peaks over 100,000 binds/hour

Current Memory:
free -g
                     total       used       free     shared    buffers     cached
Mem:                   126        114         11          0          0        104
-/+ buffers/cache:                  10        115
Swap:                    7          0          7

Looking for any insights on tuning.

Thanks
-jim
Knowledge Partner

Re: Tuning on large DIB size

On 11/15/2018 8:24 AM, jwilleke wrote:
>
> eDirectory 9.03 (On the way to 9.1.1)
> RHEL 6.10 (on the way to RHEL 7.5)
>
>
> Current DIB:
> DIBFileSize: 104365670400 Bytes (104.3656704 GB)
> DIBRflmFileSize: 49290947 Bytes
> DIBRollBackFileSize: 104857600 Bytes
> DIBStreamFileSize: 34428453 Bytes
> TotalDIBSize: 104554247400 Bytes
>
>
> Mostly reads along with a lot of binds and bind update information.
> 63,000 binds/hour but peaks over 100,000 binds/hour
>
> Current Memory:
> free -g
> total used free shared buffers cached
> Mem: 126 114 11 0 0 104
> -/+ buffers/cache: 10 115
> Swap: 7 0 7


Store it on SSD? 🙂

That is a large DIB. Is it many small objects or large objects?
(Curious how many users/objects...)



jwilleke Trusted Contributor.

Re: Tuning on large DIB size

We are doing that.
jwilleke Trusted Contributor.

Re: Tuning on large DIB size

More than 7 million user entries.
Knowledge Partner

Re: Tuning on large DIB size

On 11/15/2018 06:24 AM, jwilleke wrote:
>
> eDirectory 9.03 (On the way to 9.1.1)
> RHEL 6.10 (on the way to RHEL 7.5)
>
> Current DIB:
> DIBFileSize: 104365670400 Bytes (104.3656704 GB)
> DIBRflmFileSize: 49290947 Bytes
> DIBRollBackFileSize: 104857600 Bytes
> DIBStreamFileSize: 34428453 Bytes
> TotalDIBSize: 104554247400 Bytes


It would be interesting to know what they are storing in the ~34 MB of stream
files; maybe pictures of the people who take up the rest of the DIB.

> Mostly reads along with a lot of binds and bind update information.
> 63,000 binds/hour but peaks over 100,000 binds/hour


As I am sure you know, binds are super-simple and don't need much
optimization once the binding object (usually a user) is found, other than
processing power (for the encryption work related to binds, or even just
LDAPS traffic) or disk (to load up the user's credentials for comparison,
or to write back attributes updated by the login or login attempt).
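
If you want a rough feel for what a single bind costs end to end, a quick
sketch with the standard OpenLDAP client works (host, DN, and password
below are placeholders):

# Time one simple bind plus a base-level read of the Root DSE
time ldapsearch -x -H ldaps://ldap.example.com:636 \
    -D "cn=someuser,ou=users,o=acme" -w 'secret' \
    -s base -b "" 1.1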

> Current Memory:
> free -g
> total used free shared buffers cached
> Mem: 126 114 11 0 0 104
> -/+ buffers/cache: 10 115
> Swap: 7 0 7


Forget swap; it is crap.

You have a lot of RAM used for caching, so that's good and possibly
helping performance. You could possibly increase the eDirectory DIB cache
if your cache hit percentage is low, but how much that will help depends
on what is not working well now, so let's take a step back.
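
If you do decide to raise the cache, the FLAIM cache limit lives in
_ndsdb.ini in the DIB directory; a minimal sketch, assuming the default
Linux path and using an example size only:

# /var/opt/novell/eDirectory/data/dib/_ndsdb.ini
# hard cache limit in bytes (8 GiB here is only an illustration;
# restart ndsd after changing it)
cache=8589934592

Check the cache hit ratio first (iMonitor's database cache statistics show
it, if I recall correctly) before throwing more RAM at it.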

How, if at all, are they experiencing slowness? If they are not, but they
want a 0.05 ms bind to take 0.04 ms instead, then they may want to consider
the cost/benefit ratio. Maybe it is worth it, but a single box handling
100,000 binds per hour already seems okay.

How are the users being found? Are they all in one container, so a simple
username goes directly to the DN of the object without a search, or are
they also doing some kind of search to find the object which is possibly
(in your case certainly) using an index? Maybe look at other types of
searches and be sure they are indexed; doing any kind of full DIB scan in
this environment would be suicide, so presumably things are already
indexed well.
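
A quick way to check what is already indexed on a given replica is the
ndsindex utility that ships with eDirectory; a sketch with placeholder DNs
and password (check the exact option syntax against your version):

# List the indexes defined on one server
ndsindex list -h ldap.example.com -D cn=admin,o=acme -w 'secret' \
    -s cn=srv1,ou=servers,o=acme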

If logins themselves cause tons of traffic you could always disable some
or all login attribute updates. On a normal login, out of the box,
eDirectory keeps track of things like the login time, the source network
address, etc. In a failed login case the intruder address, attempt
counts, etc. are all tracked. All of these involve writing back to the
DIB when doing a simple bind (something that feels like a simple compare
or read), so disabling these can improve performance overall by decreasing
the amount of data replicating all over. Of course, you may also lose
some nice features of eDirectory, like being able to tell if a user is
logging in at all, or if they are being attacked by intruders.
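
An easy way to see that bookkeeping in action is to read the login
attributes back on a test account right after a bind; a sketch with
placeholder DNs, and the attribute list is an assumption based on the
usual eDirectory names:

# Inspect the login bookkeeping attributes that a simple bind updates
ldapsearch -x -H ldaps://ldap.example.com:636 \
    -D cn=admin,o=acme -w 'secret' \
    -b cn=testuser,ou=users,o=acme -s base \
    loginTime loginIntruderAttempts loginIntruderAddress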

--
Good luck.

Knowledge Partner

Re: Tuning on large DIB size

> How are the users being found? Are they all in one container, so a simple
> username goes directly to the DN of the object without a search, or are
> they also doing some kind of search to find the object which is possibly
> (in your case certainly) using an index? Maybe look at other types of
> searches and be sure they are indexed; doing any kind of full DIB scan in
> this environment would be suicide, so presumably things are already
> indexed well.


There is a trace setting, RECMAN? maybe, that shows Index work. I am
curious how big the indexes are for 7 million users. Is there a good
way to find out how much space Indices are taking up?
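
If anyone wants to chase that, a rough sketch of turning record-manager
tracing on under Linux; the tag name here is from memory (recm/recman), so
verify it against your ndstrace help output:

ndstrace
  set ndstrace = nodebug
  set ndstrace = +recm
  ndstrace file on
  # reproduce a search, then read ndstrace.log
  # (typically under /var/opt/novell/eDirectory/log/)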

> If logins themselves cause tons of traffic you could always disable some
> or all login attribute updates. On a normal login, out of the box,
> eDirectory keeps track of things like the login time, the source network
> address, etc. In a failed login case the intruder address, attempt
> counts, etc. are all tracked. All of these involve writing back to the
> DIB when doing a simple bind (something that feels like a simple compare
> or read), so disabling these can improve performance overall by decreasing
> the amount of data replicating all over. Of course, you may also lose
> some nice features of eDirectory, like being able to tell if a user is
> logging in at all, or if they are being attacked by intruders.


I seem to recall this is in two places? One is NMAS related, other is
somewhere in iManager I cannot recall off hand.

Also curious, how long does a DIBClone take? 🙂

How long for a ndsrepair? 🙂

jwilleke Trusted Contributor.

Re: Tuning on large DIB size

geoffc;2490928 wrote:
> How are the users being found? Are they all in one container, so a simple
> username goes directly to the DN of the object without a search, or are
> they also doing some kind of search to find the object which is possibly
> (in your case certainly) using an index? Maybe look at other types of
> searches and be sure they are indexed; doing any kind of full DIB scan in
> this environment would be suicide, so presumably things are already
> indexed well.


There is a trace setting, RECMAN? maybe, that shows Index work. I am
curious how big the indexes are for 7 million users. Is there a good
way to find out how much space Indices are taking up?

> If logins themselves cause tons of traffic you could always disable some
> or all login attribute updates. On a normal login, out of the box,
> eDirectory keeps track of things like the login time, the source network
> address, etc. In a failed login case the intruder address, attempt
> counts, etc. are all tracked. All of these involve writing back to the
> DIB when doing a simple bind (something that feels like a simple compare
> or read), so disabling these can improve performance overall by decreasing
> the amount of data replicating all over. Of course, you may also lose
> some nice features of eDirectory, like being able to tell if a user is
> logging in at all, or if they are being attacked by intruders.


I seem to recall this is in two places? One is NMAS related, other is
somewhere in iManager I cannot recall off hand.



How are the users being found?
Two primary methods: searches for uid=value and searches for a "String" custom GUID placed on all new users. See the index sketch below.
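
Both of those lookups want value indexes on the attributes in the filter. If the custom GUID attribute is not indexed yet, something like this should create one (attribute name and DNs are placeholders, and the name;attribute;type argument format is from memory):

# Add a value index for the custom GUID attribute on one server
ndsindex add -h ldap.example.com -D cn=admin,o=acme -w 'secret' \
    -s cn=srv1,ou=servers,o=acme 'customGuidIdx;myCustomGUID;value'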

Are they all in one container?
No.
Currently there are 11 partitions and 40+ containers (mostly due to mergers and different apps using the same LDAP directory).
I think they should be in one location under a single ROOT partition.
What do you think?

Also curious, how long does a DIBClone take? 🙂
Amazingly short time to create the clone files (must be the same process as used for DSBK).
One done recently took less than 10 minutes to create the clone file.
It takes longer to transfer it across the network to the other server.

How long for a ndsrepair? 🙂
3-4 hours, so we obviously try to avoid it.
We do several local entry repairs, as we see some regularly occurring issues, which I blame on the way an application is used and on the fact that there are 1000s of "Group-Like" entries in separate partitions from the users.
We dropped all the partitions from a lower environment and all the sync issues went away.

Questions, Comments and Suggestions are always encouraged!

Thank you ALL for taking the time to provide feedback. Your helpful comments are much appreciated.

-jim
Knowledge Partner

Re: Tuning on large DIB size

jwilleke;2493824 wrote:

Are they all in one container?
No.
Currently there are 11 partitions and 40+ containers (Mostly due to mergers and different apps using same LDAP)
I think they should be in one location and one ROOT partition.
Please, tell me what do you think?


I strongly disagree here. With lots of objects, you want multiple partitions to make replication faster and more efficient.
jwilleke Trusted Contributor.

Re: Tuning on large DIB size

Thanks for the feedback.

All of my customer sites, since moving off NetWare (i.e. File and Print directories), are all LDAP access and have one ROOT partition.
Rarely ever had replication issues and almost never used ndsrepair.

Also, in this case, the replication in PILOT was 2-5 minutes with 4 servers each with 11 partitions

  • Admittedly smaller (5 million entries)
  • Admittedly fewer daily connections
  • Had to use ndsrepair regularly (more than once a week)


and when the pilot was collapsed to one partition, replication was measured in seconds.

Anyone else have thoughts on this subject?

-jim
Knowledge Partner

Re: Tuning on large DIB size

On 01/17/2019 09:04 AM, jwilleke wrote:
>
> All of my customer sites, since moving off NetWare (i.e. File and Print
> directories), are all LDAP access and have one ROOT partition.
> Rarely ever had replication issues and almost never used ndsrepair.
>
> Also, in this case, the replication in PILOT was 2-5 minutes with 4
> servers each with 11 partitions
>
> - Admittedly smaller (5 million entries)
> - Admittedly fewer daily connections
> - Had to use ndsrepair regularly (more than once a week)
>
>
> and when pilot was collapsed to one partition replication is measured in
> seconds.


There may be good reasons for that difference. I do not think
partitioning was meant primarily to help with total tree replication time.
Partitioning lets you divide up objects as they are held on servers, so
more partitions means you now have the option of having some objects on
these servers, others on those servers, and all of them on this one server
right here, and that's about it (my understanding).

With regard to replication, when serverA tries to send data to serverB, it
is also able, simultaneously, to replicate to servers C, D, E, F, and G,
and assuming none of those servers are busy writing already, that
replication can happen from one to many as quickly as hardware will allow,
but that's for one partition. If you then have a second partition,
serverA needs to contact everybody to replicate that partition, and then
the third partition, and the fourth. This assumes every partition has
changes, which is not always the case, but we're talking about a high-load
comparison, so assume all partitions are getting changes.

The problem is that while serverA can send to many at once, every server
can only receive from one server at a time; many readers, one writer per
instance of eDirectory. As a result, it is not uncommon for replication
from serverA to many servers to have at least one of those servers return
-698 (I'm busy, come back soon) REPLICA_IN_SKULK, and then the process to
that one box must be tried again. With more servers in a partition, there
are more possible sources of updates, and that could mean more time during
which replication is blocked by some other server replicating changes out to
peers. Having many partitions also adds to the issue. Once serverA contacts
serverB to send a single-partition tree's data, all data should replicate.
If there are a dozen partitions, it will take more connections, adding to
the possibility of the other server eventually being busy.

Back in Support I saw a box with 100+ partitions on it, and while it was
the only one with that many, it was that way because it replicated with
small sites around the world. It was ALWAYS replicating, and it was often
telling other boxes that it was too busy to replicate their changes as a
result. The heartbeat is set to once per hour by default, meaning that at
least once per hour a partition will be replicated if things are healthy,
but that's a per-partition timer, and if you have 100 partitions, then
each of those must make a connection per hour, and then if you have many
servers per partition, the overhead alone is significant.

eDirectory 8.8 had issues where entries going into its change cache would
start a (as I recall) five (5) second timer, after which changes would be
replicated if they were sync-immediate changes (as many things are; creates,
password changes, etc.). If another event came in after just four (4)
seconds, the timer was restarted, and another change four (4) seconds later
restarted it again, so a steady stream of changes could keep pushing
replication out. This was contrary to how things were supposed to work.
This is one of the MANY performance fixes built into eDirectory 9.0 and
later by default, so replication times in 9.x are much better than they
were in 8.8.

Another change in 9.x is the increased NCP packet size possibility, so
replicating lots of data is much faster for that reason too.

--
Good luck.

Knowledge Partner

Re: Tuning on large DIB size

jwilleke;2493872 wrote:
Thanks for the feedback.

All of my customer sites, since moving off NetWare (i.e. File and Print directories), are all LDAP access and have one ROOT partition.
Rarely ever had replication issues and almost never used ndsrepair.

Also, in this case, the replication in PILOT was 2-5 minutes with 4 servers each with 11 partitions

  • Admittedly smaller (5 million entries)
  • Admittedly fewer daily connections
  • Had to use ndsrepair regularly (more than once a week)


and when pilot was collapsed to one partition replication is measured in seconds.

Anyone else have thoughts on this subject?

-jim


The number of objects is only one interesting factor. The churn on those objects is much more interesting. You could have a billion objects, but if only three of them ever change, that's a much different picture from 100K objects where 20K of them change every second.

Also, what replication strategy are (were) you using? You can tune the method and number of threads.

With lots of replication, you also really need to have a look at the disk subsystem underneath the DIB and see how it is performing. If you're stacking up lots of writes waiting for disk, performance is going to inhale strongly.
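
A quick, non-invasive way to check that is to watch per-device latency and utilization on the volume holding the DIB while the load is running (device name is a placeholder):

# Extended device stats every 5 seconds; watch await and %util for the DIB volume
iostat -xm 5 /dev/sdb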

ndsrepair more than once a week? That's either crazy or a symptom of a deeper problem.
Knowledge Partner

Re: Tuning on large DIB size

On 11/15/2018 8:24 AM, jwilleke wrote:
>
> eDirectory 9.03 (On the way to 9.1.1)
> RHEL 6.10 (on the way to RHEL 7.5)
>
>
> Current DIB:
> DIBFileSize: 104365670400 Bytes (104.3656704 GB)
> DIBRflmFileSize: 49290947 Bytes
> DIBRollBackFileSize: 104857600 Bytes
> DIBStreamFileSize: 34428453 Bytes
> TotalDIBSize: 104554247400 Bytes



One thing to consider is to trim out all extra NMAS methods. There is
some bug that causes the methods to reload under certain circumstances.
It should be fixed by now, but I think they are checked on login, so extra
methods waste time.
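
To see what is actually installed before trimming, listing the method objects under the Security container is one way; a sketch with placeholder credentials, and the container DN is from memory:

# List the NMAS login methods installed in the tree
ldapsearch -x -H ldaps://ldap.example.com:636 \
    -D cn=admin,o=acme -w 'secret' \
    -b "cn=Authorized Login Methods,cn=Security" -s one cn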

