Signal handling in VisiBroker for C++ application (Part One)



 

Background

The VisiBroker for C++ product on the Oracle Solaris, HP-UX, IBM AIX, and Red Hat/SUSE Linux platforms is a multi-threaded CORBA library implementation based on the POSIX-compliant multi-threading APIs and the OMG CORBA specification. It also uses the signals SIGINT and SIGTERM to perform ORB shutdown and cleanup, followed by exiting the process.

Some legacy single-threaded applications use signals to process events asynchronously, or fork multiple child processes for concurrency. These designs predate the multi-threading features introduced in much later Unix/Linux offerings. When such legacy applications evolve to include multi-threading capabilities or third-party multi-threaded libraries without a review of the existing signaling design, the result can be undefined runtime behaviour that surfaces only by chance.

Using traditional Unix/Linux signals to handle asynchronous events in a multi-threaded application can be messy if one is not careful or does not fully understand the implications. We will first discuss the traditional signal handling approach typically employed in single-threaded applications and why it is no longer appropriate in a multi-threaded environment. Next, we will illustrate the preferred signal handling design for multi-threaded applications.

The Traditional Signal Handling Approach

First, let us examine in detail the issues that arise when employing the traditional signal handling approach in a multi-threaded VisiBroker application. The following user-defined signal handling function traps the signal “SIGTERM” to perform its cleanup before shutting down the VisiBroker ORB. Two potential issues can be seen in this sample code.

 

Code sample of traditional signal handling approach in a single-threaded process:

#include <signal.h>
#include <string.h>
#include <iostream>
#include "corba.h"

using namespace std;

CORBA::ORB_var orb;

extern "C" {
  void signal_handler (int sig)
  {
    cout << "signal_handler caught " << sig << endl;
    cout << " cleaning up begins..." << endl;
    // User cleanup code here.
    cout << " cleaning up ends..." << endl;
    cout << " shutting down ORB..." << endl;
    orb->shutdown(1UL);
    cout << " ORB shut down." << endl;
  }
}

int main (int argc, char** argv)
{
#ifdef USE_SIMPLE_SIGNAL
  signal(SIGTERM, &signal_handler);
#else
  struct sigaction act;
  memset (&act, '\0', sizeof(act));
  act.sa_handler = signal_handler;
  sigemptyset(&act.sa_mask);
  act.sa_flags = 0;
  sigaction(SIGTERM, &act, NULL);
#endif

  try {
    // Initialize the ORB.
    orb = CORBA::ORB_init(argc, argv);

    // get a reference to the root POA
    CORBA::Object_var obj = orb->resolve_initial_references("RootPOA");
    PortableServer::POA_var rootPOA = PortableServer::POA::_narrow(obj);

    CORBA::PolicyList policies;
    policies.length(1);
    policies[(CORBA::ULong)0] = rootPOA->create_lifespan_policy(
                                                PortableServer::PERSISTENT);

    // get the POA Manager
    PortableServer::POAManager_var poa_manager = rootPOA->the_POAManager();

    // Create myPOA with the right policies
    PortableServer::POA_var myPOA = rootPOA->create_POA("bank_agent_poa",
                                                        poa_manager,
                                                        policies);

    // Activate the POA Manager
    poa_manager->activate();

    cout << "Server is up and running" << endl;
    // Wait for incoming requests
    orb->run();
  }
  catch(const CORBA::Exception& e) {
    cerr << e << endl;
    return 1;
  }
  cout << "Server is exiting..." << endl;
  return 0;
}

 

Sample process run output:

$ ./server &
[1]     5978
Server is up and running

$ kill -s term 5978
signal_handler caught 15
 cleaning up begins...
 cleaning up ends...
 shutting down ORB...

 

The first issue is that when a signal of interest is delivered to a running multi-threaded process, it is delivered to an arbitrary thread, which may be one of the user-created application threads or one of the VisiBroker worker threads. That thread may be terminated or interrupted prematurely if the signal is not masked in every new thread created. Or, if that particular thread is not expecting to be interrupted and is executing a critical task, say a database update operation or holding a lock on a shared resource, the application could end up with inconsistent data or in a deadlock.

VisiBroker solves this issue by setting up a signal mask with the pthread_sigmask() API to block the signals of interest, SIGINT and SIGTERM, and by having a separate signal handling thread wait on those signals with the sigwait() API. This design prevents premature termination or interruption of the worker threads that are dispatched to service incoming requests. The same approach can be used at the user application layer for consistent and predictable signal handling logic. We defer that discussion to the second part of this article.
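
As a brief preview of that design, the following is a minimal sketch of the pthread_sigmask() plus sigwait() pattern using only POSIX calls. It is not VisiBroker code: the names WaiterArgs, signal_waiter and run_signal_thread_demo are hypothetical, and the kill() call merely stands in for an external "kill -s term <pid>".

```cpp
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper type: the signal set to wait on and the signal received.
struct WaiterArgs {
    sigset_t set;
    int received;
};

// Dedicated signal handling thread: waits synchronously with sigwait().
// This is an ordinary thread context, so blocking cleanup calls are allowed here.
extern "C" void* signal_waiter(void* p)
{
    WaiterArgs* args = static_cast<WaiterArgs*>(p);
    int sig = 0;
    if (sigwait(&args->set, &sig) != 0)
        sig = -1;                       // sigwait() failed
    args->received = sig;
    return nullptr;
}

// Blocks SIGINT and SIGTERM, starts the waiter thread, sends SIGTERM to
// this process, and returns the signal number the waiter thread received.
int run_signal_thread_demo()
{
    WaiterArgs args;
    args.received = 0;
    sigemptyset(&args.set);
    sigaddset(&args.set, SIGINT);
    sigaddset(&args.set, SIGTERM);

    // Block the signals before creating any thread: new threads inherit
    // the mask, so no other thread is interrupted by these signals.
    if (pthread_sigmask(SIG_BLOCK, &args.set, nullptr) != 0)
        return -1;

    pthread_t tid;
    if (pthread_create(&tid, nullptr, signal_waiter, &args) != 0)
        return -1;

    // Stand-in for an external "kill -s term <pid>". The signal stays
    // pending until the waiter thread accepts it in sigwait().
    kill(getpid(), SIGTERM);

    pthread_join(tid, nullptr);
    return args.received;
}
```

Calling run_signal_thread_demo() from main() should return SIGTERM. The point of the sketch is that the signal is consumed by one known thread at a well-defined point, rather than interrupting whichever thread happens to be running.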

The second issue is the use of blocking APIs within the body of the traditional signal handling function. Here, “ORB::shutdown()” is called, which can cause the application process to hang indefinitely. The reason is that a deadlock may occur if “shutdown()” is called inside the signal handling function. “shutdown()” requires a shared resource to be available before it proceeds with the actual shutdown. That resource may still be held by worker threads that are currently servicing incoming requests, and access to it is coordinated through a mutex and condition variable. “shutdown()” therefore waits for the shared resource to be freed before proceeding.

The signal handling function is triggered by the operating system kernel, which pre-empts the running process to deliver the signal asynchronously. The kernel interrupts the thread selected to receive the signal and invokes the signal handling function in the context of that thread. At that point in time, the thread could be processing a client request and holding the shared resource. Before the request completes or the shared resource is unlocked, the thread's current execution is suspended and the kernel calls the user application's signal handling function.

The following is a sample call stack captured with the pstack command on the Solaris platform when the process appeared to be hung. Notice that the running “main” thread has been suspended by the kernel in order to deliver the signal to the handling function “signal_handler”.

 

$ pstack 5978
5978:   ./server
-----------------  lwp# 1 / thread# 1  --------------------
 fe8bc4a0 lwp_park (0, 0, 0)
 fe8b676c cond_wait_queue (860b0, 86090, 0, 0, 0, 0) + 28
 fe8b6cec cond_wait (860b0, 86090, 0, fe8e8bc0, fe4f4000, 1) + 10
 fe8b6d28 pthread_cond_wait (860b0, 86090, fe8e8bc0, 1000, ff362000, 36a7c) + 8
 febbf644 __1cMVISConditionEwait6MrnIVISMutex__v_ (860a8, 86088, 1, 2, 14c00, 1) + 28
 fee9df90 __1cSVISProtocolManagerIshutdown6M_v_ (857f0, 3, fef95170, ff1b9b38, 86088, 85d98) + 17c
 fee252c8 __1cGVISORBbB_shutdown_orb_functionality6M_v_ (6abf8, 0, 0, ff17e7f0, 0, ff193340) + 344
 fee2d5dc __1cRVISIIOPORBFactoryNshutdown_orbs6M_v_ (ff193298, 1798, 194, ff17e7f0, 0, ff193318) + a0
 feeca3f8 __1cKVISManagerR_complete_cleanup6M_v_ (46d08, 738, 78, ff17e7f0, ff2527c4, ff12621f) + 2dc
 feeca000 __1cKVISManagerHcleanup6MCC_v_ (6cbf0, ffbff3a0, 0, ff17e7f0, 738, 1) + 3ac
 000117d8 signal_handler (f, 0, ffbff570, fe8e9c44, fe6b816c, 0) + b0
 fe8bc52c __sighndlr (f, 0, ffbff570, 11728, 0, 0) + c
 fe8b1998 call_user_handler (f, 0, 12, 0, ff362000, ffbff570) + 3b8
 fe8bc4a4 __lwp_park (860b0, 86090, 0, 0, 0, 0) + 14
 fe8b676c cond_wait_queue (860b0, 86090, 0, 0, 0, 0) + 28
 fe8b6cec cond_wait (860b0, 86090, 0, ff3ec964, 30f28, ff3ec238) + 10
 fe8b6d28 pthread_cond_wait (860b0, 86090, 0, 1000, 0, 36a7c) + 8
 febbf644 __1cMVISConditionEwait6MrnIVISMutex__v_ (860a8, 86088, ff2527c4, 1, 400, 0) + 28
 fee9e0e8 __1cSVISProtocolManagerRwait_for_shutdown6M_v_ (857f0, 1ba0, 860a8, 86088, 0, 0) + f0
 fee2858c __1cGVISORBDrun6M_v_ (6abf8, 1000, 898, ff193278, 0, 0) + 150
 00011ad8 main     (1, ffbffcac, ffbffcb4, 22400, fe79c680, fe79c6c0) + 2b0
 000112f8 _start   (0, 0, 0, 0, 0, 0) + 108

 

Because the signal handling function runs in the context of the interrupted thread, that thread cannot resume its work and free up the shared resource until the function returns. In this code sample, it is the “main” thread that is prevented from unlocking the shared resource.

Therefore, the user signal handling function cannot return, because the “shutdown()” call waits indefinitely for the shared resource to become available. A deadlock has occurred: the interrupted thread cannot be rescheduled to regain control, and the application process is now hung.

Please also note that “ORB::shutdown()” is implicitly invoked when exit() is called or when the “main()” function returns, because VisiBroker for C++ registers its cleanup code through the atexit() system function. Some user applications call exit() in their signal handling function, which results in the deadlock explained in the previous paragraph.
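
The atexit() behaviour can be observed without VisiBroker. The following is a minimal sketch, assuming a POSIX system: a child process registers a cleanup hook with atexit() and then terminates with either exit() or _exit(), while the parent checks whether the hook ran. The names cleanup_hook and cleanup_ran_in_child are hypothetical, and the hook is only a stand-in for the cleanup code a library might register.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_pipe_write_fd = -1;

// Stand-in for library cleanup registered through atexit().
// It reports that it ran by writing one byte to a pipe.
extern "C" void cleanup_hook()
{
    const char mark = 'C';
    ssize_t n = write(g_pipe_write_fd, &mark, 1);
    (void)n;
}

// Forks a child that registers cleanup_hook() and then terminates with
// exit() (use_plain_exit == true) or _exit() (use_plain_exit == false).
// Returns true if the hook ran in the child.
bool cleanup_ran_in_child(bool use_plain_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    if (pid == 0) {                     // child process
        close(fds[0]);
        g_pipe_write_fd = fds[1];
        atexit(cleanup_hook);
        if (use_plain_exit)
            exit(0);                    // runs atexit() handlers first
        _exit(0);                       // terminates without running them
    }

    close(fds[1]);                      // parent process
    char buf = 0;
    ssize_t n = read(fds[0], &buf, 1);  // 0 bytes means the hook never wrote
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return n == 1 && buf == 'C';
}
```

With this sketch, cleanup_ran_in_child(true) should report that the hook ran and cleanup_ran_in_child(false) that it did not, which is why exit() inside a signal handling function can re-enter blocking cleanup code.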

In fact, any blocking CORBA invocations or system APIs should be avoided in a signal handling function. Either use async-signal-safe system functions such as _exit() (note the underscore prefix), or keep the logic of the signal handling function simple, such as setting a flag or a counter, so that another user thread performs the actual work after the signal handling function has returned.
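
A minimal sketch of the "keep the handler simple" option follows, assuming a plain POSIX program rather than a real ORB server. The handler only sets a flag of type volatile sig_atomic_t, and ordinary code outside the handler notices the flag and does the real work. The names g_shutdown_requested, on_sigterm, install_handler and run_until_shutdown are hypothetical, and raise() stands in for a signal sent from outside.

```cpp
#include <signal.h>
#include <string.h>

// The only state the handler touches: a flag that is safe to write
// from an asynchronous signal handler.
static volatile sig_atomic_t g_shutdown_requested = 0;

// Signal handling function: no I/O, no locks, no CORBA calls.
extern "C" void on_sigterm(int)
{
    g_shutdown_requested = 1;
}

// Installs on_sigterm() for SIGTERM. Returns true on success.
bool install_handler()
{
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_handler = on_sigterm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    return sigaction(SIGTERM, &act, nullptr) == 0;
}

// Simulated work loop: polls the flag between units of work and raises
// SIGTERM on iteration raise_at to stand in for an external kill.
// Returns the number of iterations completed before shutdown was noticed.
int run_until_shutdown(int raise_at)
{
    int iterations = 0;
    while (!g_shutdown_requested) {
        ++iterations;                   // one unit of ordinary work
        if (iterations == raise_at)
            raise(SIGTERM);             // handler runs, then raise() returns
    }
    // Blocking cleanup (for example an ORB shutdown call) would go here,
    // in normal thread context, after the handler has already returned.
    return iterations;
}
```

In a real server whose main thread is blocked waiting for requests, the polling would have to live in a separate user thread; the sketch uses a simple loop only to keep the example self-contained.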

The best practice in handling the signals in a multi-threaded application will be covered in the second part of the article [[wiki:Signal handling in VisiBroker for C++ application (Part Two)|Signal handling in VisiBroker for C++ application (Part Two) ]].

 


Last update: 2020-03-13 21:05 (revision 2 of 2)