We're trying to move from standalone Redis to Redis high-availability (Sentinel) mode. However, while load testing in the dev environment, we're receiving the error below.
level=ERROR time="07-03-2023 20:54:35" traceId="5a65a85d2f6ccaa8" logger=GlobalExceptionMapper message="GlobalExceptionMapper: java.util.concurrent.CompletionException: io.vertx.core.impl.NoStackTraceThrowable: No more endpoints in chain.
    at io.smallrye.mutiny.operators.uni.UniBlockingAwait.await(UniBlockingAwait.java:73)
    at io.smallrye.mutiny.groups.UniAwait.atMost(UniAwait.java:61)
    at io.quarkus.redis.client.runtime.RedisClientImpl.await(RedisClientImpl.java:1026)
    at io.quarkus.redis.client.runtime.RedisClientImpl.set(RedisClientImpl.java:672)
    at io.quarkus.redis.client.RedisClient_761b9a6e5f634178e3291b09c1921f229025da0c_Synthetic_ClientProxy.set(RedisClient_761b9a6e5f634178e3291b09c1921f229025da0c_Synthetic_ClientProxy.zig:2298)
    at <package>.services.SequenceService.isKeyPresentForDuplicityCheck(SequenceService.java:34)
    at <package>.services.SequenceService_ClientProxy.isKeyPresentForDuplicityCheck(SequenceService_ClientProxy.zig:256)
    at <package>.services.AccountsApiServiceImpl.createCustomerAccount(AccountsApiServiceImpl.java:128)
    at <package>.services.AccountsApiServiceImpl_Subclass.createCustomerAccount$$superforward1(AccountsApiServiceImpl_Subclass.zig:197)
    at <package>.services.AccountsApiServiceImpl_Subclass$$function$$2.apply(AccountsApiServiceImpl_Subclass$$function$$2.zig:33)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:54)
    at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:132)
    at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:103)
    at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.doIntercept(TransactionalInterceptorRequired.java:38)
    at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.intercept(TransactionalInterceptorBase.java:57)
    at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.intercept(TransactionalInterceptorRequired.java:32)
    at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired_Bean.intercept(TransactionalInterceptorRequired_Bean.zig:340)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:41)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:41)
    at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:32)
    at <package>.services.AccountsApiServiceImpl_Subclass.createCustomerAccount(AccountsApiServiceImpl_Subclass.zig:404)
    at <package>.services.AccountsApiServiceImpl_ClientProxy.createCustomerAccount(AccountsApiServiceImpl_ClientProxy.zig:659)
    at <package>.resources.AccountsApi.createCustomerAccount(AccountsApi.java:48)
    at <package>.resources.AccountsApi$quarkusrestinvoker$createCustomerAccount_0b915408532d6a09a8c6a63ae490a49fe854ecb6.invoke(AccountsApi$quarkusrestinvoker$createCustomerAccount_0b915408532d6a09a8c6a63ae490a49fe854ecb6.zig:39)
    at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
    at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:7)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:141)
    at io.quarkus.vertx.core.runtime.VertxCoreRecorder$13.runWith(VertxCoreRecorder.java:543)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
    at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
    at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:829)
    at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:567)
    at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:192)
Caused by: io.vertx.core.impl.NoStackTraceThrowable: No more endpoints in chain.
How is the load test performed?
We're posting 30K requests from JMeter with the configuration below. While the requests are being processed, we delete the Redis master instances one by one (each time waiting until the previously deleted pod has completely restarted). The master Redis instance is deleted using the command below:
kubectl -n <namespace> delete pod redis-0
The Sentinels are able to perform a complete failover when a master Redis instance is deleted, and a new Redis master is elected. In most cases, a few requests fail and, after a short period of time, new requests start executing successfully. In some cases, however, JMeter hangs for a while and then the error 'io.vertx.core.impl.NoStackTraceThrowable: No more endpoints in chain' is shown for every remaining request.
System Configuration
A Quarkus process connects to Redis.
All of the configuration is present in the process's application.properties file.
Things we've tried
- Tuning the Quarkus Redis parameters while load testing. We've executed the load test about 20 times now with changes to the Redis parameters. All but one of the load tests resulted in this issue.
- There is a way to reconnect on error in the Vert.x Redis client. We abandoned this approach because migrating to the Vert.x Redis client involved a lot of code changes.
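For reference, a lower-effort alternative to migrating clients would be an application-level retry around the blocking Redis call. The following is only a minimal sketch under stated assumptions: the class `RedisRetrySketch` and the method `withRetry` are hypothetical names, it uses plain Java only, and it has not been tried against this setup. It may not help if the client never re-resolves the new master after a failover.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper (not part of Quarkus or Vert.x): retries a blocking call
// a fixed number of times with a fixed pause between attempts.
public final class RedisRetrySketch {

    private RedisRetrySketch() {
    }

    public static <T> T withRetry(Supplier<T> operation, int maxAttempts, Duration delay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                // Remember the failure; only pause if another attempt will follow.
                lastFailure = e;
                if (attempt < maxAttempts) {
                    pause(delay);
                }
            }
        }
        // Every attempt failed: surface the most recent failure to the caller.
        throw lastFailure;
    }

    private static void pause(Duration delay) {
        try {
            Thread.sleep(delay.toMillis());
        } catch (InterruptedException ie) {
            // Restore the interrupt flag and stop retrying.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }
}
```

In SequenceService, the existing call could then be wrapped along the lines of `withRetry(() -> redisClient.set(args), 3, Duration.ofMillis(200))`, where `redisClient` and `args` stand for whatever the service already passes to `RedisClient.set`; the attempt count and delay are illustrative values, not recommendations.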
Quarkus Process Details
- Traffic of ~300 TPS is expected on this API
- 1 pod was running in dev environment while load testing. In production, there are 5.
- Quarkus version: <quarkus.platform.version>2.3.0.Final</quarkus.platform.version>
- Redis dependency
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-redis-client</artifactId>
</dependency>
- The last time we observed this issue was when we provided an incorrect connection string for Sentinel
- Redis configuration in the process. These values are overridden by the values in the Kubernetes deployment file
quarkus.redis.hosts=redis://sentinel:5000
quarkus.redis.client-type=sentinel
quarkus.redis.password=
quarkus.redis.timeout=5S
quarkus.redis.max-pool-size=50
quarkus.redis.max-pool-waiting=5000
quarkus.redis.pool-cleaner-interval=5S
quarkus.redis.pool-recycle-timeout=5S
quarkus.redis.reconnect-attempts=5
quarkus.redis.reconnect-interval=30S
quarkus.redis.max-waiting-handlers=20
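To put the reconnect settings above in perspective, here is a small arithmetic sketch in plain Java. It assumes that reconnect attempts are spaced by the fixed `reconnect-interval`; the actual behaviour of the client was not verified, and `ReconnectWindowSketch` and `worstCaseWindow` are hypothetical names used only for illustration.

```java
import java.time.Duration;

// Hypothetical helper: estimates how long the configured reconnect attempts
// could take in total, assuming one attempt per fixed interval.
public final class ReconnectWindowSketch {

    private ReconnectWindowSketch() {
    }

    public static Duration worstCaseWindow(int attempts, Duration interval) {
        return interval.multipliedBy(attempts);
    }

    public static void main(String[] args) {
        // Values copied from the properties above (5S and 30S written as ISO-8601).
        Duration commandTimeout = Duration.parse("PT5S");
        Duration reconnectInterval = Duration.parse("PT30S");
        int reconnectAttempts = 5;

        Duration window = worstCaseWindow(reconnectAttempts, reconnectInterval);
        System.out.println("Assumed worst-case reconnect window: " + window.getSeconds() + " s");
        System.out.println("Command timeouts that fit in that window: " + window.dividedBy(commandTimeout));
    }
}
```

Under that assumption, the reconnect window is much longer than the 5-second command timeout, which might be related to JMeter appearing to hang before the errors start; this is a hypothesis, not a confirmed cause.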
I'm seeking assistance on how to resolve this issue as I've exhausted all ideas to resolve it. I would appreciate any guidance on the matter, including suggestions on load testing methods and configurations.