Integrating Apache SpamAssassin with GroupWise


ISSUE: Fighting Spam

Background: Apache SpamAssassin is an open-source project for filtering unsolicited email or SPAM. One way to integrate this into GroupWise is to use the GWIA third-party interface. The agent that serves that interface needs to do this:

A) Move files from the \send directory to the \third\send directory.
B) Move files from the \third\receive directory to the \receive directory.
C) Move files from the \third\results directory to the \results directory.

I initially did this through a shell script, but that is not a very robust way to do it unless you are an expert bash programmer, which I am not, so I wrote a Linux daemon in C for that purpose. The daemon calls SpamAssassin via a regular system() call, which is not the most efficient way to do it. There are more efficient ways, but that is, as they say, left as an exercise for the reader, or for me, time permitting.
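
As one possible direction for that exercise, the sketch below runs a filter program with fork() and execvp() instead of system(), so no shell is started for each message. It is only an illustration: run_filter is a hypothetical helper, not part of gwsa. The larger saving would probably come from using the spamc client against a running spamd, which avoids starting a new SpamAssassin process per message, but that assumes spamd is installed and running.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with stdin read from `infile` and stdout
 * written to `outfile`, without going through /bin/sh.
 * Returns the child's exit status, or -1 on error. */
int run_filter(char *const argv[], const char *infile, const char *outfile)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                               /* child process */
        int in = open(infile, O_RDONLY);
        int out = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (in < 0 || out < 0)
            _exit(127);
        /* In a daemon fds 0 and 1 may already be closed, so open() can
         * return them directly; only dup when that is not the case. */
        if (in != STDIN_FILENO) {
            dup2(in, STDIN_FILENO);
            close(in);
        }
        if (out != STDOUT_FILENO) {
            dup2(out, STDOUT_FILENO);
            close(out);
        }
        execvp(argv[0], argv);
        _exit(127);                               /* exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the command line used by the daemon, argv would be { "spamassassin", "--cf", "rewrite_header Subject ****SPAM(_SCORE_)****", NULL }, with the two temporary files as infile and outfile.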

What it does:

Moves message files and runs the received ones through SpamAssassin. Logging info is sent to syslog and by default ends up in /var/log/messages. You can see the entries specific to the daemon with: tail -50 /var/log/messages | grep gwsa
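
The gwsa tag that the grep relies on comes from openlog(). A minimal sketch of that setup, assuming the same tag and facility as the source further down (start_logging is a hypothetical name used only here):

```c
#include <syslog.h>

/* Hypothetical helper: open the log with the "gwsa" tag and suppress
 * everything less severe than max_priority (for example LOG_ERR for
 * quiet operation or LOG_DEBUG for verbose operation).
 * Returns the previous log mask. */
int start_logging(int max_priority)
{
    /* LOG_PID adds the [pid] part seen in the log lines */
    openlog("gwsa", LOG_CONS | LOG_PID | LOG_NDELAY, LOG_LOCAL1);
    return setlogmask(LOG_UPTO(max_priority));
}
```

After start_logging(LOG_ERR), a call like syslog(LOG_NOTICE, "...") is dropped, while syslog(LOG_ERR, "...") still reaches the log.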


This is not a very polished solution as of now, but I am publishing it in case anyone else has use for it or perhaps time to improve it. I will edit and amend the code based on testing and feedback.



0. Install and configure SpamAssassin

1. Put the source file in a suitable directory

2. Compile it: cc -ogwsa gwsa.c

3. Move the resulting executable to /usr/bin: mv gwsa /usr/bin/gwsa

4. Create a directory named third under your gwia-directory.

5. Create four directories under that: receive, send, result, store

6. Run the daemon: gwsa /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/ -d

7. Check that it is running: ps -aux | grep gwsa and tail -50 /var/log/messages

8. Send a test mail

9. Check log:







2019-11-24T16:23:14.206341+02:00 pamir gwsa[22693]: RCV: processing file f6eaadd5.162, 8
2019-11-24T16:23:15.205908+02:00 pamir gwsa[22693]: process_file: opening /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/third/receive/f6eaadd5.162
2019-11-24T16:23:15.206379+02:00 pamir gwsa[22693]: process_file: size is 5107
2019-11-24T16:23:15.206700+02:00 pamir gwsa[22693]: process_file: found Received: at 88533E
2019-11-24T16:23:15.207006+02:00 pamir gwsa[22693]: process_file: new length is 4853
2019-11-24T16:23:15.207308+02:00 pamir gwsa[22693]: process_file: opened temporary file /tmp/f6eaadd5.162.tmp
2019-11-24T16:23:15.207627+02:00 pamir gwsa[22693]: process_file: written 4853 bytes to temporary file
2019-11-24T16:23:15.207928+02:00 pamir gwsa[22693]: process_file: passing temporary file to SpamAssassin
2019-11-24T16:23:15.208235+02:00 pamir gwsa[22693]: process_file: system(spamassassin --cf 'rewrite_header Subject ****SPAM(_SCORE_)****' </tmp/f6eaadd5.162.tmp >/tmp/f6eaadd5.162.tmp2)
2019-11-24T16:23:19.390418+02:00 pamir gwsa[22693]: process_file: system(spamassassin --cf 'rewrite_header Subject ****SPAM(_SCORE_)****' </tmp/f6eaadd5.162.tmp >/tmp/f6eaadd5.162.tmp2) returned 0
2019-11-24T16:23:19.390883+02:00 pamir gwsa[22693]: process_file: opening temporary outfile /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/receive/f6eaadd5.162
2019-11-24T16:23:19.391229+02:00 pamir gwsa[22693]: process_file: writing preamble
2019-11-24T16:23:19.391580+02:00 pamir gwsa[22693]: process_file: written 254 bytes into /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/receive/f6eaadd5.162
2019-11-24T16:23:19.391894+02:00 pamir gwsa[22693]: process_file: opening processed out file /tmp/f6eaadd5.162.tmp2
2019-11-24T16:23:19.392208+02:00 pamir gwsa[22693]: process_file: processed file /tmp/f6eaadd5.162.tmp2 opened
2019-11-24T16:23:19.393016+02:00 pamir gwsa[22693]: process_file: read initial 1024 bytes
2019-11-24T16:23:19.393338+02:00 pamir gwsa[22693]: process_file: read 1024 bytes
2019-11-24T16:23:19.394488+02:00 pamir gwsa[22693]: message repeated 3 times: [ process_file: read 1024 bytes]
2019-11-24T16:23:19.394971+02:00 pamir gwsa[22693]: process_file: read 746 bytes
2019-11-24T16:23:19.395381+02:00 pamir gwsa[22693]: process_file: final outfile /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/receive/f6eaadd5.162 ready
2019-11-24T16:23:19.395783+02:00 pamir gwsa[22693]: Process file returns: 1
2019-11-24T16:23:19.396214+02:00 pamir gwsa[22693]: RCV: Moving file f6eaadd5.162
2019-11-24T16:23:19.396633+02:00 pamir gwsa[22693]: RCV: /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/third/receive/f6eaadd5.162 -> /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/third/store/f6eaadd5.162







10. Check header of received mail:







X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on pamir
X-Spam-Level:
X-Spam-Status: No, score=-1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,RDNS_NONE autolearn=no
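
If you would rather check the verdict in a test program than by eye, the X-Spam-Status header can be parsed with sscanf(). The sketch below is an illustration only and not part of gwsa. parse_spam_status is a hypothetical helper, and it assumes the "Yes/No, score=N" layout shown in the header above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract verdict and score from a header line such as
 * "X-Spam-Status: No, score=-1.2 required=5.0 ...".
 * Returns 1 if both fields were parsed, 0 otherwise. */
int parse_spam_status(const char *line, int *is_spam, double *score)
{
    static const char key[] = "X-Spam-Status:";
    const char *p = strstr(line, key);
    char verdict[8];

    if (p == NULL)
        return 0;
    /* Skip the header name, then read "<word>, score=<number>" */
    if (sscanf(p + sizeof key - 1, " %7[A-Za-z], score=%lf", verdict, score) != 2)
        return 0;
    *is_spam = (strcmp(verdict, "Yes") == 0);
    return 1;
}
```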














// Pass GroupWise incoming files to SpamAssassin for processing
// Anders Gustafsson, 2019-11-21
//
//
//
// cc -ogwsa gwsa.c
// mv gwsa /usr/bin/gwsa
// /usr/bin/gwsa /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/ -d
// ./gwsa /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/ -d -s -v
//
// Options:
// d - run as daemon
// v - Verbose
// s - Store received messages in /third/store
//
// /usr/lib/systemd/system/gwsa.service
// [Unit]
// Description=GroupWise SpamAssassin Checker
//
//
//[Service]
//Type=forking
//User=root
//Group=root
//ExecStart=/usr/bin/gwsa /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/ -d -v
//
//[Install]
//
// ------------------------------------------------------------------------
// Caveats: Need to be started before GWIA
// ------------------------------------------------------------------------
// The GroupWise GWIA third-party interface
//
// The SMTP daemon portion of the GWIA looks in the third\send directory for
// email to send. It puts any email it receives into the third\receive
// directory. It puts all results from both sending and receiving in the
// third/results folder. So the SMTP daemon portion uses ONLY the directories
// in the \third folder.
// The GWIA processing portion (the part that talks to the MTA) uses the 3
// normal folders. Any email the GWIA receives that is destined for the
// internet is placed in the gwia\send folder. It checks the \receive
// directory for any internet email that it has received that is to be sent on
// to the MTA, and it checks the \results folder for the results of all
// sending/receiving operations.
// The third party application must:
// A) Move files from the \send directory to the \third\send directory.
// B) Move files from the \third\receive directory to the \receive directory.
// C) Move files from the \third\results directory to the \results directory.
// Source:
// ------------------------------------------------------------------------
// To conserve resources, this code uses the Linux inotify interface to put a
// "watch" on the input directories. Note an important caveat: Any files therein
// before the code is launched will not be processed.
//
// How to exit: kill -SIGINT <pid>
// ------------------------------------------------------------------------
// #define LOG_EMERG 0 /* system is unusable */
// #define LOG_ALERT 1 /* action must be taken immediately */
// #define LOG_CRIT 2 /* critical conditions */
// #define LOG_ERR 3 /* error conditions */
// #define LOG_WARNING 4 /* warning conditions */
// #define LOG_NOTICE 5 /* normal but significant condition */
// #define LOG_INFO 6 /* informational */
// #define LOG_DEBUG 7 /* debug-level messages */
//
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>     // umask()
#include <sys/inotify.h>
#include <limits.h>
#include <syslog.h>
#include <signal.h>
#include <poll.h>
#include <unistd.h>

#define MAX_EVENTS 1024 // Max. number of events to process at one go
#define LEN_NAME 16 // Assuming that the length of the filename won't exceed 16 bytes. GW files are always 8.3
#define EVENT_SIZE ( sizeof (struct inotify_event) ) // size of one event
#define BUF_LEN ( MAX_EVENTS * ( EVENT_SIZE + LEN_NAME )) // buffer to store the data of event
#define COPY_BUF_LEN 4096

int process_file(char *filename, char *rcv_in, char *rcv_out);

static volatile int keep_running = 1;

// Signal handler that simply resets a flag to cause termination
void signal_handler (int signum)
{
    //printf("\nGot a signal...\n");
    syslog (LOG_NOTICE, "Got a signal..");
    keep_running = 0;
    //exit(1);
}

int main( int argc, char **argv )
{
    int length, i = 0;
    int fd;
    char buffer[BUF_LEN];
    char base[PATH_MAX];    // Base path for GWIA, ie /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/
    char rcv_in[PATH_MAX];  // third/receive/*
    char rcv_out[PATH_MAX]; // receive/
    char snd_in[PATH_MAX];  // send/*
    char snd_out[PATH_MAX]; // third/send/
    char sta_in[PATH_MAX];  // third/results/*
    char sta_out[PATH_MAX]; // results/
    char oldfile[PATH_MAX];
    char newfile[PATH_MAX];
    char lastfile[PATH_MAX] = "";
    int rcv_wd;
    int snd_wd;
    int sta_wd;
    int rc;
    int poll_num;
    int daemon = 0;
    int verbosity = LOG_ERR;
    int storemsg = 0; // If we are storing messages for debugging
    int c;
    int lm;
    char store[PATH_MAX]; // Storage of received messages for testing.
    nfds_t nfds;
    struct pollfd fds[1];

    openlog ("gwsa", LOG_CONS | LOG_PID | LOG_NDELAY, LOG_LOCAL1);
    syslog (LOG_NOTICE, "Program started by User %d", getuid ());
    //
    // First argument has to be base dir to watch. Refuse to run without it,
    // and leave room in the buffers for the subdirectory and file names.
    //
    if (argc < 2 || strlen(argv[1]) > PATH_MAX - 64)
    {
        fprintf (stderr, "Usage: gwsa <path to third> -d(aemon) -v(erbosity) level \n");
        exit(EXIT_FAILURE);
    }
    strcpy(base, argv[1]);

    while ((c = getopt (argc, argv, "dvs")) != -1)
        switch (c)
        {
        case 'd':
            daemon = 1;
            break;
        case 'v':
            verbosity = 7;
            break;
        case 's':
            storemsg = 1;
            break;
        case '?':
            fprintf (stderr, "Unknown option character `\\x%x'.\n", optopt);
            fprintf (stderr, "Usage: gwsa <path to third> -d(aemon) -v(erbosity) level \n");
            exit(EXIT_FAILURE);
        default:
            abort ();
        }
    //
    //if(!strcmp(argv[2],"-d"))
    //{
    //    syslog (LOG_NOTICE, "Run as daemon requested");
    //    daemon = 1;
    //}
    if(daemon) syslog (LOG_NOTICE, "Run as daemon requested");
    syslog (LOG_NOTICE, "Verbosity is %d", verbosity);
    lm = setlogmask(LOG_UPTO(verbosity));
    syslog (LOG_NOTICE, "setlogmask was %d", lm);
    //
    signal(SIGINT, signal_handler );
    //
    // Create paths. TODO: Make sure all paths are valid and properly formatted, ie trailing slash
    //
    if(base[(strlen(base)-1)] != '/') strcat(base,"/");
    strcpy(rcv_in, base);
    strcat(rcv_in, "third/receive/");
    strcpy(rcv_out, base);
    strcat(rcv_out, "receive/");
    strcpy(snd_in, base);
    strcat(snd_in, "send/");
    strcpy(snd_out, base);
    strcat(snd_out, "third/send/");
    strcpy(sta_in, base);
    strcat(sta_in, "third/result/");
    strcpy(sta_out, base);
    strcat(sta_out, "result/");
    strcpy(store, base);
    strcat(store, "third/store/");
    //
    syslog (LOG_NOTICE, "Receive in: %s", rcv_in);
    syslog (LOG_NOTICE, "Receive out: %s", rcv_out);
    syslog (LOG_NOTICE, "Send in: %s", snd_in);
    syslog (LOG_NOTICE, "Send out: %s", snd_out);
    syslog (LOG_NOTICE, "Results in: %s", sta_in);
    syslog (LOG_NOTICE, "Results out: %s", sta_out);
    //
    if(daemon)
    {
        syslog (LOG_NOTICE, "Going daemon");
        //
        // Our process ID and Session ID
        pid_t pid, sid;
        // Fork off the parent process
        pid = fork();
        if (pid < 0)
        {
            syslog (LOG_ERR, "Could not fork parent");
            exit(EXIT_FAILURE);
        }
        // If we got a good PID, then we can exit the parent process.
        if (pid > 0)
        {
            exit(EXIT_SUCCESS);
        }
        // Change the file mode mask
        umask(0);
        // Open any logs here (if needed)
        // Create a new SID for the child process
        sid = setsid();
        if (sid < 0)
        {
            syslog (LOG_ERR, "Could not create child SID");
            exit(EXIT_FAILURE);
        }
        // Change the current working directory to /
        if ((chdir("/")) < 0)
        {
            syslog (LOG_ERR, "Could not change dir to /");
            exit(EXIT_FAILURE);
        }
        // Close out the standard file descriptors
        close(STDIN_FILENO);
        close(STDOUT_FILENO);
        close(STDERR_FILENO);
    }

    // Initialize Inotify
    fd = inotify_init();
    if ( fd < 0 )
    {
        syslog (LOG_ERR, "Could not initialise inotify %s", strerror(errno));
        exit(EXIT_FAILURE);
    }
    // add watch for all input directories
    snd_wd = inotify_add_watch(fd, snd_in, IN_CREATE | IN_MODIFY | IN_DELETE | IN_CLOSE);
    rcv_wd = inotify_add_watch(fd, rcv_in, IN_CREATE | IN_MODIFY | IN_DELETE | IN_CLOSE);
    sta_wd = inotify_add_watch(fd, sta_in, IN_CREATE | IN_MODIFY | IN_DELETE | IN_CLOSE);
    if (rcv_wd == -1 || snd_wd == -1 || sta_wd == -1)
    {
        syslog (LOG_ERR, "Couldn't add watch to %s %d %d %d %s\n", base, rcv_wd, snd_wd, sta_wd, strerror(errno));
        exit(EXIT_FAILURE);
    }
    else
    {
        syslog (LOG_NOTICE,"Watching:: %s\n",base);
    }
    // Prepare for polling
    // See:
    nfds = 1;
    // Inotify input
    fds[0].fd = fd;
    fds[0].events = POLLIN;

    // do it forever or signal
    while(keep_running)
    {
        i = 0;
        //
        poll_num = poll(fds, nfds, -1);
        if (poll_num == -1)
        {
            if (errno == EINTR) continue;
            syslog (LOG_ERR, "Polling error %s", strerror(errno));
            exit(EXIT_FAILURE);
        }
        //
        length = read( fd, buffer, BUF_LEN );
        //printf("\nLength %d\n", length);
        if ( length < 0 )
        {
            syslog (LOG_ERR, "Failed to read %s", strerror(errno));
        }
        while ( i < length )
        {
            struct inotify_event *event = ( struct inotify_event * ) &buffer[ i ];
            //if(event->mask != 40000010)
            //printf("Event: %X %s length %d i %d eventlength %d\n",event->mask, event->name, length, i, event->len);
            //sleep(2);
            if ( event->len )
            {
                //printf("Event: %X %s\n",event->mask, event->name);
                if ( event->mask & IN_CREATE)
                {
                    if (event->mask & IN_ISDIR)
                        syslog (LOG_NOTICE, "The directory %s was Created.\n", event->name );
                    else
                        syslog (LOG_NOTICE, "The file %s was Created with WD %d\n", event->name, event->wd );
                }
                if ( event->mask & IN_MODIFY)
                {
                    if (event->mask & IN_ISDIR)
                        syslog (LOG_NOTICE, "The directory %s was modified.\n", event->name );
                    else
                        syslog (LOG_NOTICE, "The file %s was modified with WD %d\n", event->name, event->wd );
                }
                if ( event->mask & IN_DELETE)
                {
                    if (event->mask & IN_ISDIR)
                        syslog (LOG_NOTICE, "The directory %s was deleted.\n", event->name );
                    else
                        syslog (LOG_NOTICE, "The file %s was deleted with WD %d\n", event->name, event->wd );
                }
                if ((event->mask & IN_CLOSE_WRITE) && event->name[0] != 'x') // x-files are internal and should not be moved
                {
                    if (event->mask & IN_ISDIR)
                    {
                        syslog (LOG_NOTICE, "The directory %s was closed.\n", event->name );
                        printf( "The directory %s was closed.\n", event->name );
                    }
                    else
                    {
                        syslog (LOG_NOTICE, "The file %s was closed with WD %d\n", event->name, event->wd );
                        // Dispatch on the watch descriptor returned by inotify_add_watch()
                        if (event->wd == snd_wd) // Send files, just move
                        {
                            syslog (LOG_NOTICE, "SND: Moving file %s", event->name );
                            strcpy(oldfile, snd_in);
                            strcat(oldfile, event->name);
                            strcpy(newfile, snd_out);
                            strcat(newfile, event->name);
                            syslog (LOG_NOTICE, "SND: %s -> %s", oldfile, newfile );
                            sleep(1); // Wait one second
                            rc = rename(oldfile, newfile);
                            if(rc != 0)
                            {
                                syslog (LOG_ERR, "SND: could not move %s, %s", event->name, strerror(errno) );
                                perror("Could not move send file");
                            }
                        }
                        else if (event->wd == rcv_wd) // Receive files
                        {
                            //printf( "RCV: processing file %s", event->name );
                            syslog (LOG_NOTICE, "RCV: processing file %s, %d", event->name, event->mask );
                            sleep(1); // Wait one second
                            if(strcmp(lastfile,event->name)==0 && (event->mask & IN_CLOSE))
                            {
                                syslog (LOG_NOTICE, "RCV: same file? %s", event->name );
                            }
                            else
                            {
                                rc = process_file(event->name, rcv_in, rcv_out);
                                syslog (LOG_NOTICE, "Process file returns: %d", rc );
                                if(rc)
                                {
                                    if(storemsg) // If we are storing received files for testing
                                    {
                                        syslog (LOG_NOTICE, "RCV: Moving file %s", event->name );
                                        strcpy(oldfile, rcv_in);
                                        strcat(oldfile, event->name);
                                        strcpy(newfile, store);
                                        strcat(newfile, event->name);
                                        syslog (LOG_NOTICE, "RCV: %s -> %s", oldfile, newfile );
                                        sleep(1); // Wait one second
                                        rc = rename(oldfile, newfile);
                                        if(rc != 0)
                                        {
                                            syslog (LOG_ERR, "RCV: could not move %s, %s", event->name, strerror(errno) );
                                            //perror("Could not move receive file");
                                        }
                                    }
                                    else // Not storing, just delete the old message file
                                    {
                                        strcpy(oldfile, rcv_in);
                                        strcat(oldfile, event->name);
                                        unlink(oldfile);
                                    }
                                }
                                strcpy(lastfile,event->name);
                            }
                        }
                        else if (event->wd == sta_wd) // Result files, just move
                        {
                            syslog (LOG_NOTICE, "STA: Moving file %s", event->name );
                            strcpy(oldfile, sta_in);
                            strcat(oldfile, event->name);
                            strcpy(newfile, sta_out);
                            strcat(newfile, event->name);
                            syslog (LOG_NOTICE, "STA: %s -> %s", oldfile, newfile );
                            sleep(1); // Wait one second
                            rc = rename(oldfile, newfile);
                            if(rc != 0)
                            {
                                syslog (LOG_ERR, "STA: could not move %s, %s", event->name, strerror(errno) );
                                //perror("Could not move results file");
                            }
                        }
                        else
                        {
                            syslog (LOG_ERR, "Got an unknown WD: %d ", event->wd);
                        }
                    }
                }
            }
            // Advance to the next event in the buffer
            i += EVENT_SIZE + event->len;
        }
    }

    /* Clean up*/
    //printf("cleaning up..\n");
    syslog (LOG_NOTICE, "Cleaning up");
    inotify_rm_watch( fd, rcv_wd );
    inotify_rm_watch( fd, snd_wd );
    inotify_rm_watch( fd, sta_wd );
    close( fd );
    exit(EXIT_SUCCESS);
}

// Process a GW GWIA message file. Strip the preamble and create a temporary file that is passed on to SA for processing
// After processing, we put back the preamble and deposit the results in the right directory
// Returns 1 if the processed message was written to rcv_out, 0 otherwise
//
int process_file(char *filename, char *rcv_in, char *rcv_out)
{
    char * buffer = 0;
    char * copybuffer = 0;
    long length = 0;    // Length of file
    long newlength = 0; // Length of message without preamble
    char fn[PATH_MAX];
    FILE * f;
    FILE * f2;
    FILE * f3;
    FILE * f4;
    char *p;
    int wlen = 0;
    int rc;
    size_t nbytes;
    int status = 0;
    char command[PATH_MAX];
    char tmp_in[PATH_MAX];
    char tmp_out[PATH_MAX];
    char tmp_final[PATH_MAX];

    copybuffer = malloc(COPY_BUF_LEN);
    if (!copybuffer)
    {
        syslog (LOG_ERR, "Failed to allocate copy buffer: %s", strerror(errno) );
        return (0);
    }
    strcpy(fn, rcv_in);
    strcat(fn, filename);
    syslog (LOG_NOTICE, "process_file: opening %s", fn );
    f = fopen (fn, "rb");
    if (f)
    {
        fseek (f, 0, SEEK_END);
        length = ftell (f);
        syslog (LOG_NOTICE, "process_file: size is %ld", length );
        fseek (f, 0, SEEK_SET);
        buffer = malloc (length + 1); // One extra byte for the NUL that strstr() needs
        if (buffer)
        {
            length = fread (buffer, 1, length, f);
            buffer[length] = '\0';
        }
        else
        {
            syslog (LOG_ERR, "Failed to allocate %ld bytes: %s", length, strerror(errno) );
        }
        fclose (f);
    }
    else
    {
        syslog (LOG_ERR, "Could not open file: %s", strerror(errno) );
    }
    if (buffer)
    {
        // start to process your data / extract strings here...
        // The real message starts with "Received:"
        p=strstr(buffer,"Received:");
        if(p)
        {
            syslog (LOG_NOTICE, "process_file: found Received: at %lX", (unsigned long) p );
            newlength = length - (p - buffer);
            syslog (LOG_NOTICE, "process_file: new length is %ld", newlength );
            //
            // Create a temporary file in /tmp based on filename
            sprintf(tmp_in,"/tmp/%s.tmp",filename);
            sprintf(tmp_out,"/tmp/%s.tmp2",filename);
            //
            strcpy(tmp_final, rcv_out);
            strcat(tmp_final, filename);
            f2 = fopen (tmp_in, "wb");
            if(f2)
            {
                syslog (LOG_NOTICE, "process_file: opened temporary file %s", tmp_in );
                wlen = fwrite (p, 1, newlength, f2);
                syslog (LOG_NOTICE, "process_file: written %d bytes to temporary file", wlen );
                fclose (f2);
                syslog (LOG_NOTICE, "process_file: passing temporary file to SpamAssassin" );
                // Spawn a child to run the program.
                sprintf(command, "spamassassin --cf 'rewrite_header Subject ****SPAM(_SCORE_)****' <%s >%s", tmp_in, tmp_out);
                syslog (LOG_NOTICE, "process_file: system(%s)", command);
                rc = system(command);
                sleep(1);
                syslog (LOG_NOTICE, "process_file: system(%s) returned %d", command, rc );
                // OK. If SA succeeded then we have a processed file tmp_out
                // Open the final file. Write preamble and then the out file
                syslog (LOG_NOTICE, "process_file: opening temporary outfile %s", tmp_final );
                f4 = fopen (tmp_final, "wb");
                if(f4)
                {
                    syslog (LOG_NOTICE, "process_file: writing preamble" );
                    rc = fwrite (buffer, 1, (p - buffer), f4);
                    syslog (LOG_NOTICE, "process_file: written %d bytes into %s", rc, tmp_final );
                    syslog (LOG_NOTICE, "process_file: opening processed out file %s", tmp_out );
                    f3 = fopen (tmp_out, "rb");
                    if(f3)
                    {
                        syslog (LOG_NOTICE, "process_file: processed file %s opened", tmp_out );
                        //
                        nbytes = fread(copybuffer, 1, COPY_BUF_LEN, f3); // See define above
                        if(ferror(f3)) syslog (LOG_ERR, "process_file: read error %s", strerror(errno) );
                        syslog (LOG_NOTICE, "process_file: read initial %zu bytes", nbytes );
                        while(nbytes > 0)
                        {
                            syslog (LOG_NOTICE, "process_file: read %zu bytes", nbytes );
                            if(fwrite(copybuffer, 1, nbytes, f4) != nbytes)
                            {
                                syslog (LOG_ERR, "process_file: could not write outfile %s", strerror(errno) );
                                //exit(3);
                            }
                            nbytes = fread(copybuffer, 1, COPY_BUF_LEN, f3);
                        }
                        fclose(f3);
                        //
                        // f4 now holds the file to pass back to GW
                        status = 1;
                    }
                    fclose(f4);
                    syslog (LOG_NOTICE, "process_file: final outfile %s ready", tmp_final );
                    //
                    unlink(tmp_in);
                    unlink(tmp_out);
                }
            }
        }
    }
    free(buffer);
    free(copybuffer);
    //printf("\nProcess_file exit\n");
    return (status); // The caller only removes the original message when this is 1
}
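
The central step in process_file is splitting the file at the first "Received:": everything before it is the GWIA preamble, everything from it onwards is the message handed to SpamAssassin. The sketch below shows that split as a length-bounded search. It is an illustration under assumptions rather than a drop-in replacement. find_message_start is a hypothetical name. The point of the bounded search is that a buffer filled by fread() is raw bytes and carries no terminating NUL unless one is added.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first "Received:" in a
 * buffer of `len` bytes, or -1 if it is absent. Bytes before that offset
 * are treated as the GWIA preamble, bytes from it onwards as the message.
 * Works on raw bytes, so the buffer need not be NUL-terminated. */
long find_message_start(const char *buf, size_t len)
{
    static const char marker[] = "Received:";
    const size_t mlen = sizeof marker - 1;

    if (len < mlen)
        return -1;
    /* Compare the marker at every position where it could still fit */
    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(buf + i, marker, mlen) == 0)
            return (long)i;
    }
    return -1;
}
```

In the sample log above, such a search would report offset 254 for the 5107-byte file, which leaves the 4853 bytes that were passed on.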







The daemon needs to be started before GWIA.
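
The comments in the source point out that inotify only reports new activity, so files already sitting in the watched directories when the daemon starts are never processed. That is presumably one reason for this start order. A possible way to soften the requirement is to scan each input directory once at startup. The sketch below is only an illustration of that idea: scan_existing and count_entry are hypothetical names, and a real handler would move or process the file instead of counting it.

```c
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: call handle(name, ctx) for every entry in `dir`
 * except ".", ".." and names starting with 'x' (which the daemon treats
 * as internal files). Returns the number of entries handled, or -1 if
 * the directory cannot be opened. */
int scan_existing(const char *dir,
                  void (*handle)(const char *name, void *ctx), void *ctx)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int n = 0;

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        if (e->d_name[0] == 'x')
            continue;
        handle(e->d_name, ctx);
        n++;
    }
    closedir(d);
    return n;
}

/* Example callback: just counts the entries it is given */
void count_entry(const char *name, void *ctx)
{
    (void)name;
    ++*(int *)ctx;
}
```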

Run as service:

Create a file /usr/lib/systemd/system/gwsa.service containing:




[Unit]
Description=GroupWise SpamAssassin Checker

[Service]
Type=forking
User=root
Group=root
ExecStart=/usr/bin/gwsa /media/nss/GWVOL/gw/gw6dom/wpgate/gwia/ -d -v

[Install]




Then just run: service gwsa start



A makefile for "make" and "make install"

Selectable verbosity logging

