PPM 9.3 email issue after upgrading from 9.14: randomly sending the same email hundreds of times
We upgraded our clustered environment from 9.14 to 9.31 back in December. After about 3 weeks, users started complaining about receiving hundreds of copies of the same email throughout the day. We opened a case with HP, but no resolution has been identified. This appears to be a new issue, or something they have never seen before, and they are having a hard time replicating it. We are going on 2+ months and our users are getting frustrated. It just happened to me again today: I have received over 300 emails from the same report. It is very random and happens with everything: reports, packages, and request emails. HP suspected blank or missing email addresses, but that is not the cause. Has anyone experienced this issue with 9.3x?
As far as I know, 9.30 and 9.31 have an issue with "CCI" (BCC) recipients: neither version sent mail to them.
Does the mail message come from an OOTB template or a customized template?
Can you compare the send date with the Service Audit page? Perhaps a PPM service is causing this trouble (Project sync service? Staffing profile? Resource management?).
I'm not familiar with CCI, but we are using SMTP mail messaging.
Both out-of-the-box templates and custom email templates are experiencing this issue. The hundreds of report emails mentioned in my question above are generated from an OOB report template, via the checkbox that sends a notification when the report completes.
I looked at the Services Audit page but nothing is jumping out at me. We have a clustered environment: 2 app servers running 4 nodes (but everything is running on app server 1), and 2 web servers.
Usually when this occurs (since it's random) it will send hundreds of emails but stop after 7-8 hours. I suspect one of the services is having an issue, like the Notification Cleanup Service. However, I'm not familiar with what each service does or how to troubleshoot it. I noticed a Debug Messages Cleanup Service, but I'm not sure what that does either.
You most likely checked this, but in case you did not, I will ask: are you sure it is on the PPM side and not in your e-mail system? I ask because we had users saying they were not getting e-mails. When I looked in PPM I saw them sent without issues, so I had to turn it over to the e-mail system people. Did you check the PPM e-mail tables to see whether hundreds of e-mails show up there as sent?
Thanks for the suggestion, but it's definitely PPM causing this issue. We have sent tons of information to HP but they still don't know what is causing this. When I logged into work today and checked my inbox, I had a repeating notification email from a report that completed. It had sent 898 emails and was still sending. I just ran the query again and it's over 1000 and climbing. Our users as well as our team are really getting frustrated with this problem and we don't know what to do. It's starting to cause problems for everyone, especially when you have 500-1000 repeating emails and have to search for and purge them from your inbox. It wastes people's time and creates issues. I had to turn off email alert notifications in Outlook because, when I get repeating emails, it was constantly generating new email alert popups.
Anyone know the details on how PPM sends emails? Is there a flag or something that PPM monitors to know if it should send a repeat notification? I'm still puzzled why it would keep sending the same email. I would think it would send 1 or at the most 3-4 since we have 4 nodes running.
select a.creation_date, a.subject, b.email_address from knta_notif_txn_parents a, knta_notif_txn_recipients b, knta_notif_txn_details c
and subject like '%Report Completed on April 4, 2016%'
and b.email_address like 'Robert%'
order by 1
I've been following this thread with interest, but it doesn't sound like you're getting any traction. A couple of questions... you've probably heard them all before.
When you get 800 e-mails about the same scheduled report, is it a current notification? Or is it telling you today about a report that ran two weeks ago?
Do you see 800 rows in KNTA_NOTIF_TXN_PARENTS for the same PARENT_KEY_ID? Any rows with NOTIFICATION_SENT_FLAG = 'N'?
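One way to answer both questions at once is to group the notification rows by key and flag instead of scrolling through them. The sketch below is only an illustration: it assumes PARENT_KEY_ID, NOTIFICATION_SENT_FLAG, SUBJECT and CREATION_DATE are all columns of KNTA_NOTIF_TXN_PARENTS (check your schema first), and it runs the query against a made-up in-memory SQLite table so it can be tried without touching PPM. The SQL string itself uses only plain GROUP BY syntax.

```python
import sqlite3

# Assumption: all four columns live on knta_notif_txn_parents.
# Verify against the real schema before running this on the PPM database.
DUPLICATE_CHECK_SQL = """
select parent_key_id,
       notification_sent_flag,
       count(*)           as row_count,
       min(creation_date) as first_created,
       max(creation_date) as last_created
from knta_notif_txn_parents
where subject like :subject_pattern
group by parent_key_id, notification_sent_flag
order by row_count desc
"""


def summarize_duplicates(conn, subject_pattern):
    """Return one summary row per (parent_key_id, notification_sent_flag)."""
    params = {"subject_pattern": subject_pattern}
    return conn.execute(DUPLICATE_CHECK_SQL, params).fetchall()


def build_demo_db():
    """Create an in-memory stand-in table with invented rows (not PPM data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table knta_notif_txn_parents ("
        "parent_key_id integer, subject text, "
        "notification_sent_flag text, creation_date text)"
    )
    subject = "Report Completed on April 4, 2016"
    rows = [
        (1001, subject, "Y", "2016-04-04 03:00:00"),
        (1001, subject, "Y", "2016-04-04 03:00:20"),
        (1001, subject, "Y", "2016-04-04 03:00:40"),
        (1002, subject, "N", "2016-04-04 03:05:00"),
    ]
    conn.executemany("insert into knta_notif_txn_parents values (?, ?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for row in summarize_duplicates(demo, "%Report Completed on April 4, 2016%"):
        print(row)
```

A single key with a very large row_count points at PPM generating the notification repeatedly; any 'N' rows would show up as their own line in the summary.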
What interval is your Notification service set to? Any errors? Do you see a corresponding long run time for this service whenever you get the 800 e-mails?
What is your e-mail system? We're running a hybrid MS Exchange/Lotus Notes environment and I've seen repeating e-mails about meetings that are unrelated to PPM, but somehow the system keeps recirculating them and the admins can't explain it.
Thanks for the reply. We are literally pulling our hair out on this!!! I just posted an update to our case with HP stating that I have a repeating email going on now. It started this AM with 800 emails and is now at over 2,100 and climbing.
I ran the query below and all rows show "Y" on the notification_sent_flag. I haven't found any that are "N".
The email being sent comes from the out-of-the-box checkbox that sends the report when it finishes.
We didn't have this issue until after we upgraded.
from knta_notif_txn_parents a, knta_notif_txn_recipients b, knta_notif_txn_details c
and a.subject like '%Report Completed on April 4, 2016%'
All rows also reference the same parent_key_id = '131568'
This particular email is being generated from a report I scheduled a few months back. It is set to repeat daily at 3am. 95% of the time it runs and sends 1 email. Only 5-6 times has it kept sending repeating emails like today. It always stops after some point and never keeps going into the next day (that I have ever seen).
When this happens on requests (we had one of those on 3/31), it sent almost 500 emails and stopped once the request changed to Closed status. That always seems to halt repeating request emails: if the status is Closed, it won't send any more emails.
Packages can also send repeating emails, but I haven't seen too many of those (since I mostly run reports and requests).
We are using SMTP email.
Notification Cleanup Service is set to 24 hours (node 4)
Notification Service is set to 20 seconds (node 4). This service does have a ! and lots of errors like:
Notification Service: job id: ID:sdc01pppm06.keybank.com-38031-1459703917374-1:1:21:1:5843, entity type: null, entity Id: null, Additional info: ERROR defineColumn error: 2 0
at sun.reflect.GeneratedMethodAccessor1686.invoke(Unknown Source)
at sun.reflect.GeneratedMethodAccessor396.invoke(Unknown Source)
at sun.reflect.GeneratedMethodAccessor395.invoke(Unknown Source)
at com.sun.proxy.$Proxy141.handleMessage(Unknown Source)
at sun.reflect.GeneratedMethodAccessor392.invoke(Unknown Source)
at com.sun.proxy.$Proxy143.onMessage(Unknown Source)
Hi...couple things I'd look at if you haven't already...
So when you run that query, how many rows do you get? Do you see 800 rows in KNTA_NOTIF_TXN_PARENTS for the same PARENT_KEY_ID? If so, then PPM is somehow generating notification messages too many times.
Do you see a long run time for the notification service whenever the problem occurs? You mentioned "it started this AM with 800 emails and is now at over 2,100 and climbing"...was this one single run of the service? This can help figure out if PPM is really sending 800 e-mails. If not, perhaps the duplication is happening on the mail server side.
Do the duplicate emails themselves have the same message id in the header? Look especially at the received headers, are the hops and the IDs the same?
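If you save a handful of the duplicates as raw message files, the header comparison can be done mechanically. This is a rough sketch using Python's standard email module; the function names and the sample messages are made up for illustration. The rule of thumb it encodes is only a heuristic: the same Message-ID showing up repeatedly hints that the mail system is redelivering one message, while distinct Message-IDs hint that the sender generated separate messages.

```python
from collections import Counter
from email import message_from_string


def header_fingerprints(raw_messages):
    """Return (Message-ID, earliest Received header) for each raw message."""
    fingerprints = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        received = msg.get_all("Received") or []
        # Each hop prepends its Received header, so the last one listed
        # is the earliest hop (closest to the sending application).
        earliest_hop = received[-1] if received else None
        fingerprints.append((msg.get("Message-ID"), earliest_hop))
    return fingerprints


def classify_duplicates(raw_messages):
    """Label a batch as 'redelivered' or 'distinct' by Message-ID reuse."""
    counts = Counter(mid for mid, _ in header_fingerprints(raw_messages))
    if any(n > 1 for n in counts.values()):
        return "redelivered"
    return "distinct"


if __name__ == "__main__":
    # Invented sample messages; replace with the contents of saved .eml files.
    template = (
        "Received: from relay.example.com by mx.example.com; {when}\n"
        "Received: from app.example.com by relay.example.com; {when}\n"
        "Message-ID: <{mid}@app.example.com>\n"
        "Subject: Report Completed\n"
        "\n"
        "body\n"
    )
    batch = [
        template.format(when="Mon, 4 Apr 2016 03:00:00 -0400", mid="a1"),
        template.format(when="Mon, 4 Apr 2016 03:00:20 -0400", mid="a2"),
    ]
    print(classify_duplicates(batch))
```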
I ran the query and attached some of the output below. Yes, all 3263 rows reference the same parent_key_id.
I received the repeating email from 3am until 7:02pm.
I checked the message headers and confirmed they are different.
I have more repeating emails today. I just stopped and restarted the Notification Service and it didn't stop the repeating emails. 3 minutes afterwards I started getting more repeating emails. I might try stopping and restarting the cleanup service.
The Notification Service and the Notification Cleanup service are using the defaults 20 seconds and 24 hours respectively.
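As a rough back-of-the-envelope check on the numbers in this thread: if the 3263 rows cover the whole 3am to 7:02pm window (an assumption, since the count may have been taken at a different moment), the average spacing between emails works out to roughly 18 seconds, which is close to the 20-second Notification Service interval. That would be consistent with about one extra notification per service run rather than one big burst, though it is only a hint and not a diagnosis. The helper names below are made up for this sketch.

```python
from datetime import datetime


def window_seconds(first_seen, last_seen):
    """Length of the observation window in seconds."""
    return (last_seen - first_seen).total_seconds()


def average_gap_seconds(first_seen, last_seen, row_count):
    """Average spacing between rows if they were spread evenly over the window."""
    return window_seconds(first_seen, last_seen) / row_count


def service_runs_in_window(first_seen, last_seen, interval_seconds):
    """How many times a service with the given interval would fire in the window."""
    return window_seconds(first_seen, last_seen) / interval_seconds


if __name__ == "__main__":
    # Times and counts taken from the posts above; the calendar date is arbitrary.
    start = datetime(2016, 4, 4, 3, 0)
    end = datetime(2016, 4, 4, 19, 2)
    rows = 3263
    interval = 20  # Notification Service default, in seconds

    gap = average_gap_seconds(start, end, rows)
    runs = service_runs_in_window(start, end, interval)
    print(f"average gap between emails: {gap:.1f} s")
    print(f"service runs in window:     {runs:.0f}")
    print(f"emails per service run:     {rows / runs:.2f}")
```

To test this properly, compare the actual creation_date gaps from the query output against the service interval instead of relying on the average.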
Yeah, that's very odd. Seems like whatever is going on (running reports, workflow transitions, etc) is generating duplicate requests to send notifications if you have that many rows in there. I suppose HP has had you recompile? Maybe also restart the database if you haven't done that; I think there are some triggers in there that might be related.
We still have this issue going on and HP is unable to determine the cause. I'm just sending an update, hoping someone has run into this issue after upgrading to 9.3. We never had this happen with 9.1x. It is very random. It just happened again and sent over 400 repeat emails saying that my daily report completed. It also sent over 500 repeat emails to our team inbox saying that a request had failed. This has been going on since January, so it's starting to frustrate everyone. It has consumed a great deal of time, both in troubleshooting and in deleting the emails.