Performance recommendations for eDirectory, SuSE and Azure


Introduction

One of the most common problems you face when running an eDirectory deployment whose objects are updated constantly is:

  • Write-to-disk performance.

 

Write speed is paramount so that the LDAP, LDAPS, and replica services are not affected. This document therefore covers the type of disk and array to use so that those services keep running.

 

This document is based on an installation with the following products:

  • eDirectory 9.9.2
  • Identity Manager 4.8.2
  • SuSE SLES 15 SP2 operating system
    • Filesystem type XFS
  • Microsoft Azure cloud platform
    • Standard DS4 v2 (8 vcpus, 32 GB memory)
    • P30 disk dedicated to the application

 

To learn more about eDirectory, please refer to the following document.

 

To learn more about the file system types used for eDirectory:

 

This document does not explain how to use the Azure platform or its implications. To learn more about it, please review the documentation provided by Microsoft.

 

 

 

 

 

 

 

 

 

Situation and tests performed

Description of the situation

During periods of heavy updates, the LDAP, LDAPS, and replica services were found to be degraded or suspended because disk utilization reached 100%.

 

As a result, authentications stall for several seconds or fail with a timeout, which frustrates end users.

 

To investigate this issue, the IOPS (input/output operations per second) were reviewed using the tools recommended by Microsoft:

However, these tools show that the disks do deliver the speeds listed in the table:

 

Disk Type   Size      IOPS (burst)      Speed (burst)
P10         128 GiB   500 (3,500)       100 MB/second (170 MB/second)
P30         1 TiB     5,000 (30,000)    200 MB/second (1,000 MB/second)

 

The services are still affected because Azure disks are optimized for multiple concurrent write threads, not for a single write thread.

 

To make the test as close as possible to the way eDirectory writes, we recommend the following command:

  • dd if=/dev/zero of=/filesystemtotest/iotest.log bs=64k count=8k conv=fdatasync
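
The command above can be wrapped in a small helper so the same test is repeated against different mount points. This is only a sketch, assuming GNU dd; the function name write_test and the TEST_DIR variable are illustrative and not part of any eDirectory or Azure tooling.

```shell
#!/usr/bin/env bash
# Sketch: single-threaded synchronous write test, based on the dd command above.
# write_test and TEST_DIR are illustrative names chosen for this example.

write_test() {
    local dir="$1" bs="${2:-64k}" count="${3:-8k}"
    local file="$dir/iotest.log"
    # conv=fdatasync makes dd flush the data to disk before it reports the rate;
    # dd prints its statistics on stderr, and the last line holds the throughput.
    LC_ALL=C dd if=/dev/zero of="$file" bs="$bs" count="$count" conv=fdatasync 2>&1 | tail -n 1
    # Remove the test file so repeated runs start from the same state
    rm -f "$file"
}

# Runs only when TEST_DIR is set, for example: TEST_DIR=/filesystemtotest ./iotest.sh
if [ -n "${TEST_DIR:-}" ]; then
    write_test "$TEST_DIR"
fi
```

With the default arguments the helper writes 512 MiB, the same amount as the command in the text. Smaller values for the second and third arguments give a quicker smoke test but a less representative figure.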

 

To learn more about testing disk types on cloud platforms

 

Proposed solution

The proposed solution is to use five P10 disks in a striped array with Azure host caching enabled for both reads and writes.
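
As a rough back-of-the-envelope check, the theoretical ceiling of a striped array is the per-disk limit multiplied by the number of disks. The sketch below uses the base P10 figures from the table above. The function name stripe_ceiling is illustrative, and the real ceiling may be lower because the VM size has its own disk limits.

```shell
#!/usr/bin/env bash
# Sketch: theoretical ceiling of a striped array = per-disk limit x number of disks.
# Figures are the base (non-burst) P10 values from the table; stripe_ceiling is an
# illustrative name. The limits of the VM size may cap the result below these values.

stripe_ceiling() {
    local disks="$1" per_disk="$2"
    echo $(( disks * per_disk ))
}

echo "IOPS ceiling:       $(stripe_ceiling 5 500)"
echo "Throughput ceiling: $(stripe_ceiling 5 100) MB/second"
```

For five P10 disks this gives 2,500 IOPS and 500 MB/second before bursting and caching are taken into account.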

 

This solution has already been tested and documented in the following article:

 

Creating the fix

The following information is taken from the article referenced above.

 

Once the additional disks are available, proceed with partitioning and with creating the PVs and the VG. These first steps follow the traditional procedure, paying special attention to the size of the physical extent (PE), which is 4 MB by default. I used parted to create the partitions, but fdisk, which I consider safer, also works. If you use parted, take extreme care:

  • parted -s /dev/sdX u % mklabel msdos mkpart primary 0 100

 

NOTE: Run it once for each disk that will be included in the new array.

 

Next, configure the new partitions as PVs:

  • pvcreate /dev/sdA1 /dev/sdB1 /dev/sdX1 /dev/sdW1 /dev/sdZ1

 

Where: A, B, X, W and Z correspond to the different disks to be included in the VG.



Now create the VG by running:

  • vgcreate <vg-name> /dev/sdA1 /dev/sdB1 /dev/sdX1 /dev/sdW1 /dev/sdZ1

 

Where: <vg-name> is the name you choose for the volume group.



At this point, create the LV with stripe mapping. This is the key step for our array. Run:

  • lvcreate -i 5 -I 4M -l 100%FREE -n <lv-name> <vg-name>

 

Where:
  -i 5: Indicates the number of stripes (PVs) across which the data will be distributed. It depends on the PVs that are part of the VG.
  -I 4M: Defines the size of each stripe (chunk of data) that is written sequentially to each PV. The stripe size cannot exceed the size of the PE (4 MB by default). If it is not set, the default stripe size is 64 KB. In this case, the maximum allowed by the PE configuration was used because the DIB is quite large. The size can be given with whichever unit suffix is most convenient.
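
To illustrate what the stripe size means in practice, the sketch below models round-robin striping: consecutive 4 MiB chunks of the LV land on consecutive PVs. It is a simplified model for illustration only, not how LVM is queried, and the function name stripe_pv_index is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: simplified round-robin striping model.
# Given a byte offset into the LV, print the 0-based index of the PV holding it.
# stripe_pv_index is an illustrative name; LVM exposes the real mapping via lvdisplay -m.

stripe_pv_index() {
    local offset_bytes="$1" stripes="$2" stripe_size_bytes="$3"
    # Chunk number = offset / stripe size; chunks rotate across the PVs
    echo $(( (offset_bytes / stripe_size_bytes) % stripes ))
}

STRIPE_BYTES=$(( 4 * 1024 * 1024 ))   # -I 4M
STRIPES=5                             # -i 5

for mib in 0 4 8 16 20; do
    offset=$(( mib * 1024 * 1024 ))
    echo "offset ${mib} MiB -> PV $(stripe_pv_index "$offset" "$STRIPES" "$STRIPE_BYTES")"
done
```

In this model a large sequential write touches all five disks in turn, which is why the array can exceed the throughput of a single P10 disk.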

 

Finally, verify the stripe mapping of the LV by running:

  • lvdisplay -m /dev/<vg-name>/<lv-name>



At this point, we have successfully created an LV in Stripe mode!!

 

XFS creation

Another recommendation in the eDirectory documentation concerns the file system type. In previous versions it was ReiserFS; however, ReiserFS is increasingly deprecated in favor of more efficient solutions. One of them is XFS, which is supported by eDirectory 9.x as well as by SLES 15.

XFS is a high-performance journaling file system that is optimized for large files, supports parallel I/O, can be grown while mounted (which makes it ideal for use with LVM), and has its own set of utilities.

 

The only point to consider here is the block size used by the eDirectory FLAIM engine. By default it uses 4 KB blocks, but it can be set to 8 KB when the instance is first configured. The block size used by FLAIM and the one configured on the file system must be the same.

The good news is that the default XFS block size is also 4 KB, so if you did not change this parameter when you configured your eDirectory instance, no additional options are required to create the file system.
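
Once a file system is mounted, its block size can be compared with the FLAIM block size. The following is a minimal sketch, assuming GNU stat. FLAIM_BLOCK_SIZE and the function names are illustrative and are not eDirectory settings, and the default path / is only a placeholder for the mount point of the new LV.

```shell
#!/usr/bin/env bash
# Sketch: compare a file system's block size with the FLAIM block size.
# FLAIM_BLOCK_SIZE, fs_block_size and block_sizes_match are illustrative names.

fs_block_size() {
    # GNU stat: %S prints the fundamental block size of the file system holding the path
    stat -f -c %S "$1"
}

block_sizes_match() {
    local path="$1" flaim_bytes="$2"
    [ "$(fs_block_size "$path")" -eq "$flaim_bytes" ]
}

FLAIM_BLOCK_SIZE="${FLAIM_BLOCK_SIZE:-4096}"   # 4 KB default from the text; 8192 if customized
TARGET="${1:-/}"                               # placeholder: use the mount point of the new LV

if block_sizes_match "$TARGET" "$FLAIM_BLOCK_SIZE"; then
    echo "$TARGET: block size matches FLAIM ($FLAIM_BLOCK_SIZE bytes)"
else
    echo "$TARGET: block size is $(fs_block_size "$TARGET") bytes, FLAIM uses $FLAIM_BLOCK_SIZE"
fi
```

On an XFS file system, xfs_info reports the same value in its bsize field.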

 

Just run:

  • mkfs.xfs /dev/<vg-name>/<lv-name>



At this point we already have our LV in Stripe mode with XFS ready for use!!!


Migration to the new logical volume

For this step we must take extreme precautions, as incorrect execution can seriously and irreparably damage the DIB.

 

To migrate the DIB from one LV to another, I used the rsync command, which performs the necessary checks to ensure the integrity of the files. Other utilities can be used as well; for the bravest, there is the cp command.

 

Procedure:

 

  1. Stop the eDirectory service. This is mandatory: the copy must be made with the service stopped so that the DIB is not modified during the process. To do so, run:
    • ndsmanage

 

  2. Mount the new file system. It can be mounted on /mnt or under it:
    • mount /dev/<vg-name>/<lv-name> /mnt

 

  3. Copy the files from the file system that stores the original DIB. In our case everything was in an LV mounted under /edirectory, so we can run:
    • rsync -avz /edirectory/ /mnt

NOTE: It is important to put a trailing slash at the end of the source directory name, because this makes rsync copy the contents of the directory into the new location. If the slash is omitted, the directory itself and its contents are copied to the destination.

  4. Confirm the integrity of the DIB files. To do this, compute an md5 or sha256 checksum recursively over the original DIB and compare it with that of the DIB copied to the new file system. They must match.

 

  5. Unmount both file systems: the new one, where the copied DIB now is, and the old one, where the original DIB is.

 

  6. Mount the new file system on the mount point of the old file system:
    • mount /dev/<vg-name>/<lv-name> /PATH/TO/EDIR/DIB

 

  7. Verify that all of the above steps have completed successfully.

 

  8. Start the eDirectory service again. Once it has started you will see the typical errors caused by a replica being unavailable (625, 626); these should disappear within a few minutes.
    • At the same time, we recommend that you check the messages written to ndsd.log.
    • Run ndstrace at the DEBUG level to identify any further errors.
    • Validate that ports 389, 636, and 524 are open, and then run queries against the service.

 

NOTE1: Do not delete the original DIB until you confirm that the service is running stable and error-free.
NOTE2: The more time passes, the less useful the original DIB will be, because it will be out of date with respect to the rest of the replicas in the ring.
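
The integrity check in the procedure above (recursive checksums of the original and the copied DIB) could be scripted along these lines. This is a sketch under stated assumptions: GNU find, sort, xargs and sha256sum are available, and the function names dir_digest and compare_dirs are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: compare two directory trees by content using recursive sha256 checksums.
# dir_digest and compare_dirs are illustrative names chosen for this example.

dir_digest() {
    # Hash every file with a path relative to the tree root, in sorted order,
    # then hash the resulting listing to get a single digest for the whole tree.
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) \
        | sha256sum | cut -d' ' -f1
}

compare_dirs() {
    [ "$(dir_digest "$1")" = "$(dir_digest "$2")" ]
}

# Example: ./compare.sh /edirectory /mnt
if [ "$#" -eq 2 ]; then
    if compare_dirs "$1" "$2"; then
        echo "The two copies match"
    else
        echo "The two copies differ"
    fi
fi
```

Because the paths are hashed relative to each root, the same tree mounted under /edirectory and under /mnt produces the same digest. The check only covers regular files and their contents, not ownership or permissions.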

 

Update the mount point reference for the eDirectory file system in /etc/fstab, taking care to place the correct reference and file system type.

 

 

 

Tests performed

The following tests were performed:

  • Write tests using the dd command:
    • dd if=/dev/zero of=/filesystemtotest/iotest.log bs=64k count=8k conv=fdatasync
  • Write tests using directory objects:
    • 50,000 objects were modified by changing only the description attribute.
    • The Azure read/write cache was left on.
    • Three control tests were performed using a single P10 disk.
    • Three tests were performed using the 5-disk P10 array.

 

Control test

Write test dd command:

 

/home/novell # dd if=/dev/zero of=/mnt/iotest.log bs=64k count=8k conv=fdatasync
8192+0 records in
8192+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 5.685 s, 94.4 MB/s

 

Test of writing objects to the directory

 

50,000 objects, 1 modified attribute, 1 authentication per second, 1 P10 disk, Azure cache read on and write on

Duration of freezing (seconds)    Total execution time (mm:ss)
28, 12, *                         8m31s
26, 13, *                         8m24s
24, 15, *                         8m26s

 

 

 

 

Striped test

Write test dd command:

 

/home/novell # dd if=/dev/zero of=/nds/iotest.log bs=64k count=8k conv=fdatasync
8192+0 records in
8192+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.429201 s, 1.3 GB/s

 

 

50,000 objects, 1 modified attribute, 1 authentication per second, 5-disk P10 array, Azure cache read on and write on

Duration of freezing (seconds)    Total execution time (mm:ss)
1.546s                            3m21.752s
1.604s                            2m40.691s
1.425s                            2m55.051s
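
To put the two sets of results side by side, the total execution times can be averaged and compared. The sketch below only reuses the figures reported above; the helper names to_seconds and average_seconds are illustrative, and awk is assumed to be available.

```shell
#!/usr/bin/env bash
# Sketch: average the reported execution times and compute the speed-up.
# to_seconds and average_seconds are illustrative names; the data comes from the tests above.

to_seconds() {
    # "8m31s" or "3m21.752s": split on "m" and "s", field 1 is minutes, field 2 is seconds
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

average_seconds() {
    for t in "$@"; do to_seconds "$t"; done |
        awk '{ sum += $1 } END { printf "%.1f\n", sum / NR }'
}

single=$(average_seconds 8m31s 8m24s 8m26s)
striped=$(average_seconds 3m21.752s 2m40.691s 2m55.051s)

echo "single P10 average: ${single} s"
echo "striped average:    ${striped} s"
awk -v a="$single" -v b="$striped" 'BEGIN { printf "speed-up: %.1fx\n", a / b }'
```

With these figures the average drops from 507.0 seconds on a single P10 disk to 179.2 seconds on the striped array, a speed-up of about 2.8 times for the object modification test.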

Conclusion

Using the striped array together with the caches turned on shows an undeniable improvement. However, this improvement should always be accompanied by backups via dsbk or Azure images.

 

It is always important to validate that Azure caches are turned on.

 

Comment List
Anonymous
  • Update: host caching on Azure is inconsistent; after a restart we lose the write speed. I now recommend using

    Standard D32s v3 (32 vcpus, 128 GiB memory) with 768 MB/s of I/O, or Standard D48s v3 (48 vcpus, 192 GiB memory) with 1.2 GB/s of I/O.

    It is also better to use GPT as the partition table: parted -s /dev/sdf u % mklabel gpt mkpart primary 0 100. For the disks, the recommendation is to use no caching and to select P20 or P30 disks in a 5-disk striped array.

    With P20 disks the average speed is 530 MB/s

    With P30 disks the average speed is 650 MB/s

    Thank you
