Author Message
Kenji
Joined: Jan 22, 2015
Messages: 105
Offline
Hi Team,

We performed the AES connector failover test as below.
In this test case, we connected two AESs without any HA configuration.
Therefore both AESs are working as active.
- Case1 : Link down(LAN cable disconnect for Main AES) on Main AES side connection. Then we expected to do failover from Main AES to Sub AES.
- Case2 : Link down(LAN cable disconnect for Sub AES) on Sub AES side connection. Then we expected no effects to Main AES connection and AES connector operation.

For Case1, the Failover function by AES connector worked without any problem.
It is estimated phenomenon(Good Results).

For Case2, AES connector didn't work about 60 sec after detected Linkdown.
We didn't estimate this phenomenon.
Is this normal operation as AES connector?

In this case, when we maked call before Linkdown with IVR, this call(session ID = 12388) was disconnected 40 sec timeout later based on the OD application log
---------------------------------------------------------------
> fsutap10_trace.log

02/03/2017
20:32:25 Established Call
(MPP port 89914, ApplicationServer 10.140.33.71)
20:32:59 Sub AES LAN cable disconnected
20:33:43 issued CTI disconnect command?AES Connector didn't reply to this comamnd)

20:33:43:496 DEBUG - getObserversFromCallId: gathering observers... done waiting on sync list size of list is
20:34:08 AES Connector detect Sub AES link-down
20:34:08:881 DEBUG - ProviderMonitor:AES-EP-B got an exception getting date/time attempt:0 error:no conf event
--------------------------------------------------------------------------------
> fsutap10_eMB_IVR_Member.trace.log

02/03/2017 20:33:43 INFO - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : Using SCESession ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member servlet : M_End-M_Disconnect
02/03/2017 20:33:43 DEBUG - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : CTICommand.execute: encrypted value is XHLGU0OPUf53ZmNdsPGOdh7bKYbJZNsA
02/03/2017 20:33:43 DEBUG - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : VariableUtil.getVariableValue: Variable value is 12388
02/03/2017 20:33:43 DEBUG - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : CTICommand.execute: request to manager is https://urldefense.proofpoint.com/v2/url?u=http-3A__localhost-3A9082_aesconnector_Disconnect-3Fcallid-253D12388-2526extension-253D89114-26k-3DXHLGU0OPUf53ZmNdsPGOdh7bKYbJZNsA&d=DQIFJg&c=BFpWQw8bsuKpl1SgiZH64Q&r=oSKVie1FyZqBNq8gvMFwyA&m=EfZ_in6EZ1JjA9c5XgL8-P5nvyOm1PFZ92drSWorT2o&s=udL-sq2Ehh_E0liS7K9JVmuqqm2wNelfksJbrRAVSpo&e=
02/03/2017 20:33:43 DEBUG - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : CTICommand.execute: setting session cookie to ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member
02/03/2017 20:34:23 INFO - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : HTTP Session lost removing SCESession ESjNJMoZnWHjMfVX-U463v2
02/03/2017 20:34:23 INFO - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : SCESession Removed
02/03/2017 20:34:23 DEBUG - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : Removing 1 session stack frames.
02/03/2017 20:34:23 INFO - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : ** Popped Stack Frame [/eMB_IVR_Member]
---------------------------------------------------------------

At this time, if we sent CTI disconnect command from OD application, AES connector didn't reply to any those commands.
After 60 sec later, It seems that AES connector could receive this CTI disconnect command.
However this call was already disconnected. Then this CTI disconnect command was failed as "INVALID".
-----
> fsutap10_eMB_IVR_Member.trace.log
02/03/2017 20:33:43 DEBUG - ESjNJMoZnWHjMfVX-U463v2:/eMB_IVR_Member : CTICommand.execute: request to manager is http://localhost:9082/aesconnector/Disconnect?callid%3D12388%26extension%3D89114&k=XHLGU0OPUf53ZmNdsPGOdh7bKYbJZNsA
(60 sec later)
02/03/2017 20:34:43:577 DEBUG - CTIConnectorManager.dropCall: Call state of call 12388 is INVALID
-----

Based on JTAPI log(fsutap10_tsapi_trace.txt.1), we found below message.
Then we are thinking that AES connector tried to connect to both AES(Main and Sub) after link down to Sub AES.
-----
> fsutap10_tsapi_trace.txt.1
2017-03-02 20:33:41,734 [QuartzScheduler_Worker-1] INFO  TSProviderImpl - TSProvider.shutdown - already in shutdown, redundant call, returning.
2017-03-02 20:33:41,734 [QuartzScheduler_Worker-1] INFO  TSProviderImpl - TSProvider: version ‘6.1.0.94 [production build]’, for TSProvider[#3]@2aad1637
2017-03-02 20:33:41,734 [QuartzScheduler_Worker-1] INFO  TSProviderImpl - TSProvider: calling acsOpenStream serverID=AVAYA#AESSW4#CSTA#msatae4 loginID=vp passwd=******* for TSProvider[#3]@2aad1637
2017-03-02 20:33:41,734 [QuartzScheduler_Worker-1] DEBUG TsapiSessionFactoryOio - Attempting to connect to server </10.140.27.80:450>?* IP address of Main AES
2017-03-02 20:33:41,734 [QuartzScheduler_Worker-1] INFO  TsapiChannelOio - browser: GENERIC
2017-03-02 20:33:41,735 [QuartzScheduler_Worker-1] DEBUG TsapiSessionFactoryOio - Successfully  connected to server </10.140.27.80:450>
...
2017-03-02 20:33:41,774 [QuartzScheduler_Worker-1] DEBUG TsapiSessionFactoryOio - Attempting to connect to server </10.140.27.90:450>?* IP address of Sub AES
2017-03-02 20:33:41,774 [QuartzScheduler_Worker-1] INFO  TsapiChannelOio - browser: GENERIC

Best regards,
K.Yamahara
Kenji
Joined: Jan 22, 2015
Messages: 105
Offline
The attached file is some logs we corrected.
If you need other logs, please let me know.
Filename tsapi.zip [Disk] Download
WilsonYu
Joined: Nov 6, 2013
Messages: 3950
Offline
It looks like the aes connector gets the command "02/03/2017 20:33:43:496 DEBUG - getObserversFromCallId" from the app; however, it could not return until about a minute later "02/03/2017 20:34:43:525 DEBUG - getObserversFromCallId: done gathering observers". What I think happening is the the code in the same method is waiting (synchronized) on the list of observers being processed at the same time as the AES-EP-B is going down. So it looks like in certain case, there is a delay when the AES is going down. I also notice in your system it take 20 seconds in between each check on each AES "for availability". The monitor thread in AES connector checks every 8 seconds. So there is an unusual delay there to begin with.

02/03/2017 20:33:43:496 DEBUG - getObserversFromCallId: gathering observers... done waiting on sync list size of list is 187

02/03/2017 20:34:08:881 DEBUG - ProviderMonitor:AES-EP-B got an exception getting date/time attempt:0 error:no conf event
02/03/2017 20:34:08:881 DEBUG - ProviderMonitor:AES-EP-B got a null getting date/time attempt:0
02/03/2017 20:34:28:972 DEBUG - ProviderMonitor:AES-EP-B got an exception getting date/time attempt:1 error:send request failed
02/03/2017 20:34:28:972 DEBUG - ProviderMonitor:AES-EP-B got a null getting date/time attempt:1
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent: Number of events on extension 89114 is 5
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent 89114: Event: 0 is 117 for call 12388 Provider:AES-EP-A
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent:89114: got an unknown event 117
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent 89114: Event: 1 is 215 for call 12388 Provider:AES-EP-A
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent:89114: got an unknown event 215
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent 89114: Event: 2 is 107 for call 12388 Provider:AES-EP-A
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent:89114: got Connection Disconnected Event 107
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent 89114: Event: 3 is 205 for call 12388 Provider:AES-EP-A
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent:89114: got Connection Disconnected Event 205
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent 89114: Event: 4 is 103 for call 12388 Provider:AES-EP-A
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.callChangedEvent:89114: got Call Observation Ended Event
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.removeCall:clearing MRCR:12388 from ext:89114
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.removeCall:clearing MRCC:12388 from ext:89114
02/03/2017 20:34:41:692 DEBUG - CTICallObserver.removeCall:removing call:12388 from ext:89114

02/03/2017 20:34:43:525 DEBUG - getObserversFromCallId: done gathering observers
Kenji
Joined: Jan 22, 2015
Messages: 105
Offline
Hi Wilson-san,

Thank you of for your reply.

I understood your explanation.
Then we tried to reproduce this test some times.

In this case, we confirmed that we can reproduce it some times.
And it seems that every times the list of observers being process was happen at the same time as the AES-EP-B is going down.
So, we think that it is not rare case.

Futhermore, we are thinking that the list of observers being process was happen when AES connector detect the AES-EP-B is going down.
Therefore we are thinking that this is root cause for this phenomenon.
Are there any possibilities to happen it when AES connector detect the AES-EP-B is going down?
If yes, could you advice the mechanism of the list of observers being process was happen.

Best regards, K.Yamahara
WilsonYu
Joined: Nov 6, 2013
Messages: 3950
Offline
Not sure I understand your question. The AES connector is detecting AES-EP-B is going down and releasing the observers attached to it. That allows the list of observers of AES-EP-A to take over the call. However what is witnessed here is there is a delay in the method that returns the AES-EP-A observer for the call extension. As I said earlier, the reason for the delay is probably the method is waiting on process in which the list of observers are being cleaned up.
Kenji
Joined: Jan 22, 2015
Messages: 105
Offline
Hi Wilson-san,

Thank you for your reply. And I'm so sorry for my poor explanation.

As I said tha we are using 2 * AES(AES-EP-A and AES-EP-B).
Those are working as standalone without any HA configuration.
In this configuartion, AES-EP-A is Primary(Main) AES for AES connector.
And AES-EP-B is Secondary(Sub) AES for same AES connector.
Therefore we are thinking that all of Data and control signals are transmitted on AES-EP-A at that time.
The main perpose of this test is no effects to Main AES(AES-EP-A) connection even if Sub AES(AES-EP-B) going down.

However the list of observers being process was happen when AES connector detect the AES-EP-B is going down.
As a results, AES connector didn't answer for any CTI command.
Finally, existing call on Main AES(AES-EP-A) connection was disconnected.
If existing call was transmitted on Sub AES(AES-EP-B) connection, there are some possibilities to take over it to Main AES(AES-EP-A) connection.
But we believe that existing call was transmitted on Main AES(AES-EP-A) connection.
And there is no effects for it even if Sub AES(AES-EP-B) is going down.

So, I would like to know that reason why the AES connector is releasing the observers attached to it when the AES connector detected AES-EP-B is going down.
And I also would like to know this allows the list of observers of AES-EP-A to take over the call even if the existing call on Main AES(AES-EP-A).
Futhermore are there any work arrounds to resolve this issue, we would like to know as well.

Best regards, K.Yamahara
WilsonYu
Joined: Nov 6, 2013
Messages: 3950
Offline
I was focusing too much on the AES Connector earlier and didn't take a look at the jtapi. It turned out you have exceptions like the following starting 20:33. The communication error showing in JTAPI is the real culprit of the delay and other problem. I am not the expert in this component. You would have to talk to the JTAPI or AES folk to find out more.

2017-03-02 20:34:23,303 [GetEventThread] ERROR TSEventHandler - Event Distributor Exception - shutting down provider TSProvider[#4]@dde2c2a6
2017-03-02 20:34:23,303 [GetEventThread] ERROR TSEventHandler - read request failed
com.avaya.jtapi.tsapi.TsapiSocketException: read request failed
at com.avaya.jtapi.tsapi.tsapiInterface.TsapiSession.handleException(TsapiSession.java:932)
at com.avaya.jtapi.tsapi.tsapiInterface.oio.TsapiChannelOio$OioChannelThread.run(TsapiChannelOio.java:261)
Caused by: java.net.SocketException: ƒŠƒ‚?[ƒg?EƒzƒXƒg‚̓^ƒCƒ€ƒAƒEƒgŠúŠÔ“à‚ɉž“š‚µ‚Ü‚¹‚ñ‚Å‚µ‚½?B
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:161)
at java.net.SocketInputStream.read(SocketInputStream.java:132)
at com.avaya.jtapi.tsapi.tsapiInterface.streams.IntelSocketInputStream.readFully(IntelSocketInputStream.java:49)
at com.avaya.jtapi.tsapi.tsapiInterface.streams.IntelSocketInputStream.readInt(IntelSocketInputStream.java:60)
at com.avaya.jtapi.tsapi.tsapiInterface.oio.TsapiChannelOio$OioChannelThread.run(TsapiChannelOio.java:255)
Kenji
Joined: Jan 22, 2015
Messages: 105
Offline
Hi Wilson-san,

Thank you for your big help.
Now, I could got one of good forcus point from you.
I will talk your explained point with JTAPI and AES guys on it.
If I need your help again, I will back here again.

Actually, we have discussed this on JTAPI forum and SR as below.
However I could not make this clear. Then I have rised this ticket here.
https://www.devconnectprogram.com/forums/posts/list/20949.page
SR#1-12663645209?Secondary AES down affects Primary AES connection on AES Connector

Thank you and regards, K.Yamahara
Go to:   
Mobile view