Author Message
BrettHughes
Joined: Oct 10, 2011
Messages: 8
Offline
We have a set of EMC XML Servers we're using to monitor phones and we're getting TServerNotActive errors every couple days causing the EMC servers to lose connection to AES. Is there any logging we can setup in AES to troubleshoot why its losing connection?
JohnBiggs
Joined: Jun 20, 2005
Messages: 968
Location: Thornton, CO
Offline
you can enable tracing
https://www.devconnectprogram.com/site/global/products_resources/avaya_aura_application_enablement_services/support/faq/tsapi/index.gsp#10

I would also look at the /opt/mvap/logs/log.YYYYMMDD file (first).

However I suspect you learn perhaps as much by doing a packet capture. Most likely the AES is not seeing the handshake from the client, just like the client isn't getting a response from the server and what you are really looking for is an unreliable network that is dropping packet traffic occasionally.
BrettHughes
Joined: Oct 10, 2011
Messages: 8
Offline
Thanks I will try turning on tracing. I checked the log.YYYYMMDD file and it only has 3 entries for the entire day, none around the time it dropped connection. I looked at some of the previous days and most files have very few entries, below is the most common one.

08:27:18.060 ERROR:WARNING:TSAPI:terminates:Unexpected termination for primitive 61
BrettHughes
Joined: Oct 10, 2011
Messages: 8
Offline
I found this info in the logs on one of the servers that blipped on 8/15 if this helps narrow it down

EMC XML log
Aug 15 04:06:59 EMC-XML-10 AXMLServer[1868]: -05:00 2019 901 1 .cce | 0 Major [3472] CTID3472 caught ACSUnSolicitedClass(error 0:StreamFailed) on acsHandle 18089152 and 1 licenses (clients) were released.


AES log.log
04:03:41.753 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to TCP Down(207)
04:03:41.753 ERROR:WARNING:trans_serv:PreserveSession:Session, 5ce51740, for switch, swlink20, has been preserved.
04:03:42.702 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Restart(206)
04:03:42.710 ERROR:WARNING:trans_serv:processConnAcceptData:Preserved session, 5ce51740, for switch, swlink20, MID, 1, has been restored.
04:03:42.710 ERROR:FYI:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Talking(202)
04:04:54.179 ERROR:WARNING:trans_serv:ReadMsg:Error notification received switch: swlink20 dest: 172.23.70.25 cause: 5 data: <none>
04:04:54.179 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to No Reply To HB(5)
04:04:54.179 ERROR:WARNING:trans_serv:PreserveSession:Session, 5ce51740, for switch, swlink20, has been preserved.
04:04:56.898 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Restart(206)
04:04:57.228 ERROR:WARNING:trans_serv:processConnAcceptData:Preserved session, 5ce51740, for switch, swlink20, MID, 1, has been restored.
04:04:57.230 ERROR:FYI:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Talking(202)
04:05:14.850 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to TCP Down(207)
04:05:14.850 ERROR:WARNING:trans_serv:PreserveSession:Session, 5ce51740, for switch, swlink20, has been preserved.
04:05:15.720 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Restart(206)
04:05:15.768 ERROR:WARNING:trans_serv:processConnAcceptData:Preserved session, 5ce51740, for switch, swlink20, MID, 1, has been restored.
04:05:15.768 ERROR:FYI:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Talking(202)
04:05:29.522 ERROR:WARNING:trans_serv:SendMsg:SendMsg() failed Error= 5 switch= swlink20 IP= 172.23.70.25
04:05:30.134 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to TCP Down(207)
04:05:30.135 ERROR:WARNING:trans_serv:PreserveSession:Session, 5ce51740, for switch, swlink20, has been preserved.
04:05:30.819 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Restart(206)
04:05:30.889 ERROR:WARNING:trans_serv:processConnAcceptData:Preserved session, 5ce51740, for switch, swlink20, MID, 1, has been restored.
04:05:31.075 ERROR:FYI:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Talking(202)
04:05:38.155 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to TCP Down(207)
04:05:38.155 ERROR:WARNING:trans_serv:PreserveSession:Session, 5ce51740, for switch, swlink20, has been preserved.
04:05:39.119 ERROR:WARNING:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Restart(206)
04:05:39.144 ERROR:WARNING:trans_serv:processConnAcceptData:Preserved session, 5ce51740, for switch, swlink20, MID, 1, has been restored.
04:05:39.144 ERROR:FYI:trans_serv:SetStatus:Connection state for link 172.23.70.25 on switch swlink20 changed to Talking(202)


AES sec.log
04:05:47.258 SECURITY:FYI:TSAPI:ClientHandler::streamClosed():user emcxmluser10 with IP Address 172.23.73.40 disconnected from driver service AVAYA#SWLINK20#CSTA-S#EMC-AES-10
04:05:47.524 SECURITY:FYI:TSAPI:ClientHandler::streamClosed():user emcicruser10 with IP Address 172.23.73.40 disconnected from driver service AVAYA#SWLINK20#CSTA-S#EMC-AES-10
04:06:05.762 SECURITY:FYI:TSAPI:tsviAuthenticateSession:user emcxmluser10 logged in to AVAYA#SWLINK20#CSTA-S#EMC-AES-10 from 172.23.73.40 successfully
04:06:09.743 SECURITY:FYI:TSAPI:tsviAuthenticateSession:user emcicruser10 logged in to AVAYA#SWLINK20#CSTA-S#EMC-AES-10 from 172.23.73.40 successfully
JohnBiggs
Joined: Jun 20, 2005
Messages: 968
Location: Thornton, CO
Offline
The AES log.log events indicate that the link between AES and CM dropped. Given the timestamp of 4 AM my guess is somebody was doing some network maintenance (rebooting IP switches). That by itself would not impact the link to your application server UNLESS, the maintenance also took down that network path which seems somewhat likely given within a few minutes the link to your application also is disrupted.

I suggest enabling the tracing I mentioned and monitoring for the next Tserver unreachable event and then re-examining the AES logs. I would also be asking the networking team if they do maintenance on the network (rebooting IP switches), mentally noting that the time of day on the equipment may not correlate directly to the timezone that the networking team works out of, so their 2AM may be the equipment's 4AM.
BrettHughes
Joined: Oct 10, 2011
Messages: 8
Offline
Ok thanks, we will try turning on AES tracing to see if it gives more info. So far the only maintenance we've found running is backups running during about half of the instances of the past month where this has happened.
JohnBiggs
Joined: Jun 20, 2005
Messages: 968
Location: Thornton, CO
Offline
Network maintenance is the most likely culprit, but an IP switch can sometimes reboot for other reasons (unstable power, or crashing). Network events also happen due to overloaded network switches causing lost packets and slow downs that impact transmission reliability. I am sure you know all that, but in our lab environments we have links that are connected for many months without issue, the norm is to track these problems down to something disruptive in the network connectivity, or problems with the physical NICs in the server/application.
Go to:   
Mobile view