


How to implement a retry strategy from Server B to Server C using Spring WebFlux when building an LLM gateway?
Apr 19, 2025, 04:30 PM
Retry mechanism for building an LLM gateway using Spring WebFlux
When building an LLM gateway, the gateway must handle communication between services and, when one service is unavailable, switch to a backup service seamlessly. This article explores how to achieve this with Spring WebFlux: when communication from the gateway to Server B fails, how to retry and then connect to Server C.
Scenario description
The call chain of our LLM gateway is: Client -> Gateway -> Server B. If the gateway's request to Server B fails, we want the gateway to retry and then connect to Server C. This requires the gateway to detect an error status code from Server B and switch to Server C automatically on failure.
Code analysis and improvement solutions
Let's first look at the original sseHttp method, which handles gateway requests to Server B or Server C:
    Flux<Response> responseFlux = WebClient.create(url)
        .post()
        .headers(httpHeaders -> setHeaders(httpHeaders, headers))
        .contentType(MediaType.APPLICATION_JSON)
        .bodyValue(jsonBody)
        .retrieve()
        .onStatus(status -> status != HttpStatus.OK, response -> {
            // Error handling logic
        })
        // ...Other logic...
To implement the retry strategy, we need to detect the error status code from Server B and switch to Server C when an error occurs. Previous attempts had two problems: a plain try-catch cannot catch errors raised inside a Flux, because they are delivered asynchronously as error signals rather than thrown at the call site; and the subscribe method is non-blocking, so it returns before the response arrives and error handling written after it does not run when expected.
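Why try-catch misses the error can be shown without WebFlux. The sketch below uses the JDK's CompletableFuture as a stand-in for Flux (an assumption made so the example runs with no dependencies); the class and method names are hypothetical. The failure is raised later on another thread, so the surrounding try block has already exited, and only a handler attached to the pipeline itself sees it.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical demo: CompletableFuture stands in for a Flux here.
class AsyncErrorDemo {

    // Simulates a call to Server B that fails asynchronously.
    static CompletableFuture<String> callServerB() {
        return CompletableFuture.supplyAsync(() -> {
            throw new IllegalStateException("Server B unavailable");
        });
    }

    // try-catch around the call: the failure is stored in the future,
    // not thrown at the call site, so the catch block never runs.
    static boolean caughtByTryCatch() {
        try {
            callServerB(); // returns immediately; like subscribe(), it does not block
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    // An error handler attached to the pipeline does receive the failure.
    static String handledInPipeline() {
        return callServerB()
                .exceptionally(ex -> "fallback to Server C")
                .join();
    }

    public static void main(String[] args) {
        System.out.println("caught by try-catch: " + caughtByTryCatch());
        System.out.println("handled in pipeline: " + handledInPipeline());
    }
}
```

In Reactor the role of exceptionally is played by operators such as onErrorResume, which are likewise attached to the pipeline rather than wrapped around it.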
Best Practice: Utilize retryWhen and onErrorResume
To solve these problems, we should use the retryWhen and onErrorResume operators provided by Project Reactor, the reactive library underlying Spring WebFlux.
First, modify the sseHttp method and add retry logic:
    Flux<Response> sseHttp(String url) {
        return WebClient.create(url)
            .post()
            .headers(httpHeaders -> setHeaders(httpHeaders, headers))
            .contentType(MediaType.APPLICATION_JSON)
            .bodyValue(jsonBody)
            .retrieve()
            // Turn any 4xx/5xx status into a WebClientResponseException without blocking.
            // On Spring Framework 6, use HttpStatusCode::isError here.
            .onStatus(HttpStatus::isError, clientResponse -> {
                // Record the error to facilitate debugging
                log.warn("{} returned error status {}", url, clientResponse.statusCode());
                return clientResponse.createException();
            })
            .bodyToFlux(typeRef)
            .retryWhen(Retry.backoff(3, Duration.ofSeconds(1))
                .filter(throwable -> throwable instanceof WebClientResponseException)
                .onRetryExhaustedThrow((spec, signal) ->
                    new GatewayException("Failed to reach " + url + " after " + signal.totalRetries() + " retries.")));
    }
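This code uses onStatus to turn HTTP error status codes into exceptions and retries with retryWhen, up to 3 times. Retry.backoff uses exponential backoff, so the delay starts at 1 second and grows with each retry rather than staying fixed at 1 second. The filter ensures that only exceptions of type WebClientResponseException are retried. Note that this filter matches only HTTP error responses; a connection-level failure (for example a refused connection, reported as WebClientRequestException in Spring Framework 5.3 and later) is neither retried nor converted to GatewayException unless the filter is widened. If the retries are exhausted, GatewayException is thrown.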
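To make the retry timing concrete, here is a minimal sketch of the nominal delay schedule, assuming the delay doubles on every retry starting from the minimum backoff. BackoffSchedule and nominalDelays are hypothetical names, not Reactor API, and the model ignores the random jitter that Reactor adds by default, so real delays will differ somewhat.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: models the nominal (pre-jitter) delays of an exponential backoff.
class BackoffSchedule {

    // Delay before retry n (0-based) is minBackoff * 2^n.
    static List<Duration> nominalDelays(int maxRetries, Duration minBackoff) {
        List<Duration> delays = new ArrayList<>();
        for (int retry = 0; retry < maxRetries; retry++) {
            delays.add(minBackoff.multipliedBy(1L << retry));
        }
        return delays;
    }

    public static void main(String[] args) {
        // Same parameters as Retry.backoff(3, Duration.ofSeconds(1)) in sseHttp.
        System.out.println(nominalDelays(3, Duration.ofSeconds(1)));
    }
}
```

With these parameters the nominal waits are 1, 2 and 4 seconds, so a server that keeps failing holds the request for roughly 7 seconds plus the time of the four calls before the error propagates.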
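Then, where sseHttp is called, use onErrorResume to handle the failure of Server B and switch to Server C:

    Mono<Response> responseMono = sseHttp(serverBUrl)
        // sseHttp signals GatewayException once its retries are exhausted
        .onErrorResume(GatewayException.class, ex -> {
            log.warn("Failed to connect to Server B: {}", ex.getMessage());
            return sseHttp(serverCUrl);
        })
        .next();

This code first tries Server B. If its retries are exhausted, sseHttp signals GatewayException, and onErrorResume then subscribes to sseHttp for Server C. Matching on WebClientResponseException here would never trigger the fallback, because the retry operator replaces that exception with GatewayException once the retries run out. The next() method takes only the first element of the Flux and cancels the rest, so use it only when a single response item is expected; for a streamed SSE response, keep the Flux.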
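The order of calls this produces can be modeled in plain, blocking Java. The sketch below is not reactive code and omits the backoff delays; FailoverModel, callWithFailover and the simulated backend are hypothetical and exist only to show that Server B receives one initial attempt plus the retries before Server C is tried at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, blocking model of "retry Server B, then fall back to Server C".
class FailoverModel {

    // Tries each server in order; each gets 1 initial attempt plus maxRetries retries.
    // Returns the first successful response, or throws if every server keeps failing.
    static String callWithFailover(List<String> servers, int maxRetries,
                                   Function<String, String> call, List<String> attemptLog) {
        RuntimeException last = null;
        for (String server : servers) {
            for (int attempt = 0; attempt <= maxRetries; attempt++) {
                attemptLog.add(server);
                try {
                    return call.apply(server);
                } catch (RuntimeException e) {
                    last = e; // remember the failure, then retry or move to the next server
                }
            }
        }
        throw new IllegalStateException("All servers failed", last);
    }

    public static void main(String[] args) {
        List<String> attempts = new ArrayList<>();
        // Simulated backend: Server B always fails, Server C answers.
        Function<String, String> call = server -> {
            if (server.equals("serverB")) {
                throw new IllegalStateException("503 from " + server);
            }
            return "response from " + server;
        };
        String result = callWithFailover(List.of("serverB", "serverC"), 3, call, attempts);
        System.out.println(result + " after attempts " + attempts);
    }
}
```

In this model Server B is called four times before the single successful call to Server C, which is worth keeping in mind when choosing the retry count for a latency-sensitive gateway.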
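Handle multiple successful responses
With the onErrorResume fallback above, Server C is contacted only after Server B has failed, so the two are not queried at the same time. If both servers can nevertheless return data (for example, if they are called in parallel), we need to make sure that only one response is processed. An AtomicBoolean variable can track whether a response has already been processed; create it per client request, since a flag shared across requests would let only the very first request through:

    // One flag per client request, not a shared singleton field
    AtomicBoolean success = new AtomicBoolean(false);

    Flux<Response> sseHttp(String url) {
        // ... (previous code) ...
            .doOnNext(response -> {
                if (success.compareAndSet(false, true)) {
                    // Process the successful response
                }
            })
        // ... (rest of the code) ...
    }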
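The guard itself can be tried in isolation. This is a minimal sketch with a hypothetical FirstResponseGuard class; it only demonstrates that compareAndSet(false, true) succeeds for exactly one caller, and it assumes one guard instance is created per client request.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical per-request guard: lets exactly one response through.
class FirstResponseGuard {
    private final AtomicBoolean handled = new AtomicBoolean(false);

    // Returns true only for the first caller; later callers get false.
    boolean tryAccept() {
        return handled.compareAndSet(false, true);
    }

    public static void main(String[] args) {
        FirstResponseGuard guard = new FirstResponseGuard(); // one guard per client request
        System.out.println("Server B response accepted: " + guard.tryAccept());
        System.out.println("Server C response accepted: " + guard.tryAccept());
    }
}
```

Because compareAndSet is atomic, the same holds when the two responses arrive on different threads: one call flips the flag and the other sees it already set.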
With these changes, the gateway has a more robust retry mechanism that handles communication failures between services and improves the availability of the LLM gateway. Remember to add sufficient logging to make troubleshooting easier.