Knowledge Partner
Knowledge Partner
231 views

Cost of trace example

I have been screwing around with Roles in Policy for some fun ideas
someone suggested to me...

Anyway, doing a lot of reading roles, and processing them in a useful way.

My test system this takes about 25 seconds to process per trace log.

Turn trace off on those rules and went down to under 1 second.

Wow! 25 fold improvement in performance. I knew it was expensive to
have trace on, but that is a lot.

Doing a lot of screwing around with nodesets, and Append XML element
tokens, so sort of worst case for Trace, but still.

Useful data point to consider is all I am saying. 🙂
Labels (1)
0 Likes
12 Replies
Knowledge Partner
Knowledge Partner

Re: Cost of trace example

Geoffrey Carman wrote:

> Wow! 25 fold improvement in performance. I knew it was expensive to have
> trace on, but that is a lot.


Fully second that, typically I see 10x - 40x faster processing in trace level 0
compared to 3, depending on what the driver does exactly. The more complex
processing in DirXMLScript the higher the gain. On the other hand, querying app
or idv is expensive and does not speed up at all in trace level 0, so trying to
be efficient when reading and writing is another and independent performance
factor.
______________________________________________
https://www.is4it.de/identity-access-management
0 Likes
Knowledge Partner
Knowledge Partner

Re: Cost of trace example

On 3/10/2014 12:06 PM, Lothar Haeger wrote:
> Geoffrey Carman wrote:
>
>> Wow! 25 fold improvement in performance. I knew it was expensive to have
>> trace on, but that is a lot.

>
> Fully second that, typically I see 10x - 40x faster processing in trace level 0
> compared to 3, depending on what the driver does exactly. The more complex
> processing in DirXMLScript the higher the gain. On the other hand, querying app


Specifically, if you are doing stuff with nodesets and loops in particular.

When you loop over a nodeset, a representation is 'rendered' on every
loop which eats CPU/time.

Mine is 3 queries, but LOTS of looping over the results, and lots of
nodeset mucking about.


> or idv is expensive and does not speed up at all in trace level 0, so trying to
> be efficient when reading and writing is another and independent performance
> factor.


IDV querying if repetitive, is a bit better since it will cache the
results and reuse them instead of doing new queries.

At some level, if you know you will need a ton of stuff later in the
policy, it is possible that by querying once for it all early (say that
is even possible) then a single query for many attrs is more efficient
than many smaller queries.

But I cannot prove this... Thinking about how I could test that now...


0 Likes
Anonymous_User Absent Member.
Absent Member.

Re: Cost of trace example

Geoffrey Carman wrote:

> IDV querying if repetitive, is a bit better since it will cache the results and reuse them instead of doing new queries.
>
> At some level, if you know you will need a ton of stuff later in the policy, it is possible that by querying once for it all early (say that is even possible) then a single query for many attrs is more efficient than many smaller queries.


If you use the query token (I think also the relevant attribute tokens), then the engine tries it's best to automatically perform this kind of optimization.
You can often see this in a trace, that after for example the second time through a series of inefficent queries - the engine pre-queries for all the attributes (tied to that specific object class/object-name/entry-id) that it noticed were queried last time.

--
If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below...
0 Likes
Knowledge Partner
Knowledge Partner

Re: Cost of trace example

On 3/10/2014 1:44 PM, Alex McHugh wrote:
> Geoffrey Carman wrote:
>
>> IDV querying if repetitive, is a bit better since it will cache the results and reuse them instead of doing new queries.
>>
>> At some level, if you know you will need a ton of stuff later in the policy, it is possible that by querying once for it all early (say that is even possible) then a single query for many attrs is more efficient than many smaller queries.

>
> If you use the query token (I think also the relevant attribute tokens), then the engine tries it's best to automatically perform this kind of optimization.
> You can often see this in a trace, that after for example the second time through a series of inefficent queries - the engine pre-queries for all the attributes (tied to that specific object class/object-name/entry-id) that it noticed were queried last time.


It is two fold. 1) As you suggest, it will guess ahead what might also
need to be queried.

2) There is also a cache, that persists the duration of this policy
object being processed of the queried data, so that you will also see a
second query occur sometimes, without going through the ITP/OTP, since
it has teh data already in final form.


0 Likes
ffreitas Frequent Contributor.
Frequent Contributor.

Re: Cost of trace example

On 3/10/2014 9:22 AM, Geoffrey Carman wrote:
> I have been screwing around with Roles in Policy for some fun ideas
> someone suggested to me...
>
> Anyway, doing a lot of reading roles, and processing them in a useful way.
>
> My test system this takes about 25 seconds to process per trace log.
>
> Turn trace off on those rules and went down to under 1 second.
>
> Wow! 25 fold improvement in performance. I knew it was expensive to
> have trace on, but that is a lot.
>
> Doing a lot of screwing around with nodesets, and Append XML element
> tokens, so sort of worst case for Trace, but still.
>
> Useful data point to consider is all I am saying. 🙂


I remember asking about this in the past and the explanation I got was
that when trace is running the IDM engine will synchronously write to
the file.

That means instead of CPU and memory latency, we will have to deal with
disk latency as the key bottleneck. This is done on purpose to avoid
lose trace information, since trace is meant for
debugging/troubleshooting only.

With queries/large nodesets you simply have more to write to disk,
increasing the impact of the disk subsystem performance on your overall
IDM performance.

I have personally seen in very large systems/multiple drivers slowdowns
of about 120x with trace level 5 enabled in all 41 drivers running in
that server.

Cheers,

-Fernando
0 Likes
Anonymous_User Absent Member.
Absent Member.

Re: Cost of trace example

It'd be interesting to have some idea of what hurts more, XML
serialization (as Perin stated here:
https://forums.netiq.com/archive/index.php/t-5539.html ) or the I/O
bottlenecks writing crap to the actual file. First person to do the
testing using /dev/shm as the target of the IDM trace file to minimize I/O
or using the 'libeatmydata' library for the same reason, gets bragging
rights for being neato.

--
Good luck.

If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below...
0 Likes
Anonymous_User Absent Member.
Absent Member.

Re: Cost of trace example

On 3/11/2014 4:05 PM, ab wrote:
> It'd be interesting to have some idea of what hurts more, XML
> serialization (as Perin stated here:
> https://forums.netiq.com/archive/index.php/t-5539.html ) or the I/O
> bottlenecks writing crap to the actual file. First person to do the
> testing using /dev/shm as the target of the IDM trace file to minimize I/O
> or using the 'libeatmydata' library for the same reason, gets bragging
> rights for being neato.
>

How bout running it on a SSD? Isn't the latency difference so little as to be like memory?

--
-----------------------------------------------------------------------
Will Schneider
Knowledge Associate http://forums.netiq.com

If you find this post helpful, please click on the star below.
0 Likes
Knowledge Partner
Knowledge Partner

Re: Cost of trace example

On 3/13/2014 6:43 PM, Will Schneider wrote:
> On 3/11/2014 4:05 PM, ab wrote:
>> It'd be interesting to have some idea of what hurts more, XML
>> serialization (as Perin stated here:
>> https://forums.netiq.com/archive/index.php/t-5539.html ) or the I/O
>> bottlenecks writing crap to the actual file. First person to do the
>> testing using /dev/shm as the target of the IDM trace file to minimize
>> I/O
>> or using the 'libeatmydata' library for the same reason, gets bragging
>> rights for being neato.
>>

> How bout running it on a SSD? Isn't the latency difference so little as
> to be like memory?


I would love to try that, it would be interesting to see.

0 Likes
Anonymous_User Absent Member.
Absent Member.

Re: Cost of trace example

On Fri, 14 Mar 2014 01:06:59 +0000, Geoffrey Carman wrote:

> On 3/13/2014 6:43 PM, Will Schneider wrote:
>> On 3/11/2014 4:05 PM, ab wrote:
>>> It'd be interesting to have some idea of what hurts more, XML
>>> serialization (as Perin stated here:
>>> https://forums.netiq.com/archive/index.php/t-5539.html ) or the I/O
>>> bottlenecks writing crap to the actual file. First person to do the
>>> testing using /dev/shm as the target of the IDM trace file to minimize
>>> I/O
>>> or using the 'libeatmydata' library for the same reason, gets bragging
>>> rights for being neato.
>>>

>> How bout running it on a SSD? Isn't the latency difference so little
>> as to be like memory?

>
> I would love to try that, it would be interesting to see.


I am. It is. I haven't had a chance to do detailed testing yet, though.



--
--------------------------------------------------------------------------
David Gersic dgersic_@_niu.edu
Knowledge Partner http://forums.netiq.com

Please post questions in the forums. No support provided via email.
If you find this post helpful, please click on the star below.
0 Likes
Anonymous_User Absent Member.
Absent Member.

Re: Cost of trace example

Closer, sure, but RAM is usually measured in gigabytes per second, and the
fastest SSDs I've seen are still below 1 GB/s (I haven't looked lately,
but they're nowhere near RAM). Should the throughput matter? Perhaps not
as much as how quickly the OS can confirm the write is done, but I'd still
bet on RAM. Simple tests could be done on any system that has an SSD and
decent tools (aka, Linux, maybe even Mac... sorry Geoffrey, get a real OS).

Code:
--------------------
#Big write test.
time dd if=/dev/zero of=/dev/shm/1gb bs=1048576 count=1000 && sync
time dd if=/dev/zero of=/home/ssd/1gb bs=1048576 count=1000 && sync

#Lotsa little writes test; warning, be sure your filesystem has enough
#inodes to handle this without running out; 10 million files may push
#the limits of ext filesystems; use XFS or better if in doubt:
mkdir '/dev/shm/tmp'
starttime=`date +%s%N`
for i in {0..9999999}; do
touch "/dev/shm/tmp/${i}"
done
endtime=`date +%sN`
echo -n 'Time for ramdrive file creation:'
echo "${endtime}-${starttime}" | bc -l

mkdir '/home/ssd/tmp/'
starttime=`date +%s%N`
for i in {0..9999999}; do
touch "/home/ssd/tmp/${i}"
done
endtime=`date +%sN`
echo -n 'Time for SSD file creation:'
echo "${endtime}-${starttime}" | bc -l
rm -rf '/home/ssd/tmp/'
--------------------

No, I haven't tested any of that... just seeing how many errors I can
create without any testing.

--
Good luck.

If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below...
0 Likes
Anonymous_User Absent Member.
Absent Member.

Re: Cost of trace example

On 3/10/2014 10:22 AM, Geoffrey Carman wrote:
> I have been screwing around with Roles in Policy for some fun ideas someone suggested to me...
>
> Anyway, doing a lot of reading roles, and processing them in a useful way.
>
> My test system this takes about 25 seconds to process per trace log.
>
> Turn trace off on those rules and went down to under 1 second.
>
> Wow! 25 fold improvement in performance. I knew it was expensive to have trace on, but that is a lot.
>
> Doing a lot of screwing around with nodesets, and Append XML element tokens, so sort of worst case
> for Trace, but still.
>
> Useful data point to consider is all I am saying. 🙂

Didn't you actually profile this a few years ago? I seem to remember an article covering this.

--
-----------------------------------------------------------------------
Will Schneider
Knowledge Associate http://forums.netiq.com

If you find this post helpful, please click on the star below.
0 Likes
Knowledge Partner
Knowledge Partner

Re: Cost of trace example

On 3/13/2014 6:44 PM, Will Schneider wrote:
> On 3/10/2014 10:22 AM, Geoffrey Carman wrote:
>> I have been screwing around with Roles in Policy for some fun ideas
>> someone suggested to me...
>>
>> Anyway, doing a lot of reading roles, and processing them in a useful
>> way.
>>
>> My test system this takes about 25 seconds to process per trace log.
>>
>> Turn trace off on those rules and went down to under 1 second.
>>
>> Wow! 25 fold improvement in performance. I knew it was expensive to
>> have trace on, but that is a lot.
>>
>> Doing a lot of screwing around with nodesets, and Append XML element
>> tokens, so sort of worst case
>> for Trace, but still.
>>
>> Useful data point to consider is all I am saying. 🙂

> Didn't you actually profile this a few years ago? I seem to remember an
> article covering this.


Memory size of nodesets, yes, and the consequences to performance of
trace, but this was just a quick example to make the point nicely with
data points.


0 Likes
The opinions expressed above are the personal opinions of the authors, not of Micro Focus. By using this site, you accept the Terms of Use and Rules of Participation. Certain versions of content ("Material") accessible here may contain branding from Hewlett-Packard Company (now HP Inc.) and Hewlett Packard Enterprise Company. As of September 1, 2017, the Material is now offered by Micro Focus, a separately owned and operated company. Any reference to the HP and Hewlett Packard Enterprise/HPE marks is historical in nature, and the HP and Hewlett Packard Enterprise/HPE marks are the property of their respective owners.