Author Message
muhammadabrar2
Joined: Oct 13, 2008
Messages: 7
Offline
I have two applications running on a single machine: the TSAPI Exerciser and our own application. We have observed about 6 disconnections in our application (Universal Failure with Stream Failed), while the TSAPI Exerciser had no disconnections at all.
Unlike the Exerciser, our application sends a large number of requests to AES. What could be the reason behind these universal failures (disconnections)?
JohnBiggs
Joined: Jun 20, 2005
Messages: 1141
Location: Rural, Virginia
Offline
AE Services is only so patient.

Events and confirmations coming from AE Services and headed for the application are sent via TCP/IP to the client DLL. The client DLL has some queue space to store a number of them, but it expects the application to pull them out quickly. If AE Services sends a message to the application/client DLL and does not get an acknowledgement (the client DLL will not ack a message if there is no buffer space to store it), AE Services will make a total of 5 attempts before giving up and closing the stream.

Most apps that poll for events have this problem under load. It is better to set up a thread that reacts to events arriving in the queue and then receives from the queue until it empties.

Another common mistake is to not have separate queue-receive and worker threads. If you combine these two tasks, so that you receive an event and process it to completion, then events that take a long time to process in the app cause a long delay before the next receive from the client DLL. That allows the queue to grow, causing problem #1.
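
The receive-thread/worker-thread split described above can be sketched roughly as follows. This is a minimal illustration in Python, not TSAPI C code: `client_queue.get()` merely stands in for a blocking acsGetEventBlock() call, and `receiver`, `worker`, `run_pipeline`, and `handle_event` are hypothetical names that are not part of the TSAPI client library.

```python
import queue
import threading

_STOP = object()  # sentinel used to shut the pipeline down


def receiver(client_queue, app_queue):
    """Receive loop: block for one event, hand it off, repeat.

    client_queue.get() is a stand-in (assumption) for a blocking
    acsGetEventBlock() call. No event processing happens here, so the
    client-side queue is emptied as fast as events arrive.
    """
    while True:
        event = client_queue.get()
        app_queue.put(event)
        if event is _STOP:
            return


def worker(app_queue, handle_event):
    """Worker loop: slow per-event processing stays off the receive path."""
    while True:
        event = app_queue.get()
        if event is _STOP:
            return
        handle_event(event)


def run_pipeline(events, handle_event):
    """Push events through a receiver thread and a worker thread."""
    client_queue = queue.Queue()  # simulates the client DLL queue
    app_queue = queue.Queue()     # unbounded, so the receiver never blocks on put
    threads = [
        threading.Thread(target=receiver, args=(client_queue, app_queue)),
        threading.Thread(target=worker, args=(app_queue, handle_event)),
    ]
    for t in threads:
        t.start()
    for e in events:
        client_queue.put(e)
    client_queue.put(_STOP)
    for t in threads:
        t.join()
```

The point of the design is that the receive loop does nothing except move the event to the application's own queue, so a slow `handle_event` delays only the worker and never the next receive.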

I encourage you to read ALL text in

Avaya MultiVantage®
Application Enablement Services
TSAPI for Avaya Communication Manager
Programmer's Reference
02-300544
Release 4.2
May 2008
Issue 4


for discussions involving acsGetEventPoll() and acsGetEventBlock()
muhammadabrar2
We are using acsGetEventBlock(). We have a completely separate thread that reads messages from the client queue and places them into the application queue, so a message should never be left waiting in the client queue.

JohnBiggs
Have you enabled server-side and client-side tracing (TSSPY) and looked at what it tells you about the cause of the stream failure? Have you taken a packet sniff to see which side the clearing (the connection close) starts from?
muhammadabrar2
No, I have not.
We are monitoring 600 stations and querying 50 agents per second.
What would you suggest about running TSSPY under heavy load in a real production environment? Is it safe? Will the TSAPI library be able to handle this much load with TSSPY running?
As per my knowledge, TSAPI has built-in resilience to bear small fluctuations in the network, even for 5-10 seconds under normal load. In such circumstances, what will happen if I continue to send messages/requests at my normal rate?
I have observed one thing. Every time I got this disconnection, events stopped arriving 2-3 seconds before the stream failure. During this time my application continued sending requests to the TSAPI library and getting a proper invoke ID back for each one. So when I got the stream failure, there were two hundred requests, each with a proper invoke ID, waiting for a response from AES.
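
The symptom described above (events go silent while requests keep accumulating invoke IDs) could be detected in the application with a small bookkeeping class along these lines. This is only a sketch under assumptions: `PendingRequests`, its methods, and the 2-second `silence_limit` are hypothetical and not part of TSAPI; the application would call `sent()` when the library returns an invoke ID and `confirmed()` when the matching confirmation is received.

```python
import time


class PendingRequests:
    """Track invoke IDs awaiting confirmation and the time of the last event."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock            # injectable so the logic can be tested
        self._sent_at = {}             # invoke_id -> time the request was sent
        self._last_event = clock()

    def sent(self, invoke_id):
        """Record a request for which the library returned an invoke ID."""
        self._sent_at[invoke_id] = self._clock()

    def confirmed(self, invoke_id):
        """Record the confirmation for a previously sent request."""
        self._sent_at.pop(invoke_id, None)
        self._last_event = self._clock()

    def event_seen(self):
        """Record any unsolicited event arriving from the stream."""
        self._last_event = self._clock()

    def outstanding(self):
        """Number of requests still waiting for a confirmation."""
        return len(self._sent_at)

    def stalled(self, silence_limit=2.0):
        """True if requests are pending and nothing arrived for silence_limit seconds."""
        silent_for = self._clock() - self._last_event
        return bool(self._sent_at) and silent_for >= silence_limit
```

An application could check `stalled()` before each send and log `outstanding()` when it turns true, which gives a timestamped record to line up against a TSSPY or packet trace.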
JohnBiggs
1) It is a field site, so you should be consulting with Avaya Global Services through a maintenance contract. DevConnect's charter is to help developers develop apps, not to troubleshoot field issues.

2) You are not going to learn much if you don't enable some tracing. TSSPY and AE Services can keep up with the tracing workload, at least as well as they are keeping up currently. A packet stream trace (e.g. Wireshark) to identify the source of the socket close would not impact either server at all. AE Services is tested at 1000 msgs/second. At best you are probably at 1/10 of that, probably less. There is lots of headroom for tracing on the AE Services side. I don't know what your server is capable of keeping up with...

3) "As per my knowledge, TSAPI has built-in resilience to bear small fluctuations in the network, even for 5-10 seconds under normal load. In such circumstances, what will happen if I continue to send messages/requests at my normal rate?" I don't know what you are referring to. The only 'resilience' is the TCP retries. If the load is high enough and a backlog occurs because traffic cannot be moved across the network from AE Services to the client DLL, bad things happen fairly quickly.

So what you are seeing is that you are stuck waiting for an ACK to a request transmitted toward AE Services. You don't know whether AE Services is getting the request, nor do you know what is happening with the responses from AE Services. Trace it or sniff it; you need more data to move forward. The traffic from the Exerciser is probably only the handshakes (unless you established 600 monitors through it, or created a script to do the same polling you are doing from the app), so a network glitch could easily be overlooked by the Exerciser, depending on what you are doing with it.
muhammadabrar2
Thanks a lot for your detailed answer.
I appreciate your help. You have always been very helpful.
I'll certainly get TSSPY activated, and then probably Wireshark as well.
JohnBiggs
One other thought comes to mind: log into AE Services, go to /opt/mvap/logs, and make sure there are no crash* files. Although my expectation is that they would impact your Exerciser trace as well, one can never be certain.

e.g.

crash.TSAPIMonitor.21350.tar.gz
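
The same check can be scripted. The sketch below assumes Python is available wherever the log directory can be read; `find_crash_files` is a hypothetical helper name, and only the /opt/mvap/logs path and the crash* pattern come from the post above.

```python
import glob
import os


def find_crash_files(log_dir="/opt/mvap/logs"):
    """Return the sorted paths in log_dir whose names start with 'crash'."""
    return sorted(glob.glob(os.path.join(log_dir, "crash*")))


if __name__ == "__main__":
    crash_files = find_crash_files()
    if crash_files:
        print("Crash files found:")
        for path in crash_files:
            print("  " + path)
    else:
        print("No crash* files found.")
```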
