Friday, March 6, 2026

OmniHai goes online

OmniHai 1.3 is out. After 1.1 gave the library ears to transcribe audio and 1.2 gave it a voice to speak, 1.3 lets it step outside and look around. Web search is now a first-class citizen in the API, alongside token usage tracking, an AIServiceWrapper, and a handful of internal improvements.

<dependency>
    <groupId>org.omnifaces</groupId>
    <artifactId>omnihai</artifactId>
    <version>1.3</version>
</dependency>

Web Search

AI models are great at reasoning over things they already know, but their knowledge has a cutoff date. Web search bridges this gap by allowing the model to retrieve up-to-date information from the internet before formulating its answer. OmniHai 1.3 adds a dedicated webSearch() method to AIService for exactly this purpose.

The simplest form is a single method call:

String answer = service.webSearch("What is the current stock price of Nvidia?");

That's it. The model searches the web, sources its answer, and returns a response based on current information rather than whatever it was trained on.

If you need results scoped to a specific geographic location, e.g. local news, weather forecasts, available restaurants, or store prices, then pass a Location:

Location miami = new Location("US", "Florida", "Miami");
String weather = service.webSearch("What is the weather like today?", miami);

Location takes a country code, region, and city, all optional. You can pass Location.GLOBAL if you want web search enabled without any geographical restriction; this is also the default when you call webSearch() without a location argument.
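To make the "all optional" part concrete, here is a hypothetical sketch of what such a Location record could look like; the names mirror the description above, but the real class ships with OmniHai and may differ:

```java
// Hypothetical sketch only; the actual OmniHai Location class may differ.
public record Location(String countryCode, String region, String city) {

    /** No geographical restriction at all. */
    public static final Location GLOBAL = new Location(null, null, null);
}
```

With every component optional, a country-only scope such as new Location("NL", null, null) is perfectly valid.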

Structured Web Search Output

Like the regular chat() methods, webSearch() also supports typed responses. Define a record that represents the shape of the data you want, and pass its class as argument:

public record StockPrice(String ticker, BigDecimal price, String currencyCode) {}

StockPrice tsla = service.webSearch("What is the current stock price of Tesla?", StockPrice.class);

Here's another example:

public record Link(String title, String url) {}
public record Links(List<Link> items) {}

Links results = service.webSearch("Latest 5 news headlines about Jakarta EE", Links.class);
results.items().forEach(link -> System.out.println(link));

Example output:

Link[title=Java News Roundup: Jakarta EE 12, Spring Shell, Open Liberty ..., url=https://www.infoq.com/news/2026/02/java-news-roundup-jan26-2026]
Link[title=Jakarta EE 12 - The Eclipse Foundation, url=https://jakarta.ee/zh/release/12]
Link[title=Jakarta EE 12 M2 — Welcome to the Data Age, url=https://www.linkedin.com/pulse/jakarta-ee-12-m2-welcome-data-age-otavio-santana-zkgie]
Link[title=Jakarta EE 2025: a year of growth, innovation, and global engagement, url=https://blogs.eclipse.org/post/tatjana-obradovic/jakarta-ee-2025-year-growth-innovation-and-global-engagement]
Link[title=The Eclipse Foundation Releases the 2025 Jakarta EE Developer Survey Report, url=https://newsroom.eclipse.org/news/announcements/eclipse-foundation-releases-2025-jakarta-ee-developer-survey-report]

Of course, webSearch() also has an async counterpart: webSearchAsync().
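The return type of webSearchAsync() is not spelled out here; assuming it is a CompletableFuture<String> like typical async Java APIs, it composes like any other future. The snippet below uses a stand-in supplier instead of a real service, just to illustrate the shape:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncSearchSketch {

    // Stand-in for service.webSearchAsync(prompt); any CompletableFuture<String>
    // composes the same way.
    static CompletableFuture<String> webSearchAsync(String prompt) {
        return CompletableFuture.supplyAsync(() -> "answer to: " + prompt);
    }

    public static void main(String[] args) {
        String answer = webSearchAsync("What is the current stock price of Nvidia?")
            .thenApply(String::toUpperCase) // post-process without blocking
            .join(); // block only at the outer edge
        System.out.println(answer);
    }
}
```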

Web Search inside Chat

Sometimes you want web search as part of a larger chat flow rather than as a standalone query. For those cases, use the ChatOptions.Builder.webSearch() and webSearch(Location) methods, or the withWebSearch(location) copy method. This lets you, for example, mix live web data into a memory-enabled conversation without leaving the chat API:

ChatOptions options = ChatOptions.newBuilder()
    .systemPrompt("You are a helpful travel assistant.")
    .withMemory()
    .webSearch()
    .build();

String response = service.chat("What are the current visa requirements for Dutch citizens visiting Japan?", options);
String followUp = service.chat("And what about travelling to South Korea from there?", options);

You can also derive a web-search-enabled or -disabled copy from an existing options instance without rebuilding from scratch:

ChatOptions withGlobalSearch = options.withWebSearch(Location.GLOBAL);
ChatOptions withLocalSearch = options.withWebSearch(new Location("CW", null, "Willemstad"));
ChatOptions withoutSearch = options.withWebSearch(null); // disables web search

Provider Support

Web search is supported on OpenAI (via the Responses API), Google AI, Anthropic (Claude 4 and later), xAI, Azure AI, and OpenRouter. OpenRouter handles this slightly differently from the rest: rather than a dedicated tool call, it activates web search by appending :online to the model name (e.g. openai/gpt-4o:online). OmniHai handles that detail internally with the help of the new AIServiceWrapper decorator (more on this later); you just call webSearch() and it works. xAI always had web search implicitly enabled whenever the model (Grok) detects that the caller is asking for real-time information (e.g. "current weather" or "current stock price"), so not much changes there, except that you can now force a web search for less obvious prompts. Mistral supports web search only via a separate Agents API rather than the Chat Completions API, so OmniHai cannot offer it there. If the underlying provider or model does not support web search, an UnsupportedOperationException is thrown, consistent with how other unsupported capabilities are handled in the library.
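For illustration, the OpenRouter model-name convention that the library handles for you boils down to something like this (a hypothetical helper, not actual OmniHai code):

```java
public class OpenRouterNames {

    // OpenRouter enables web search by suffixing the model name with ":online";
    // applying it twice must not double the suffix.
    static String withWebSearch(String model) {
        return model.endsWith(":online") ? model : model + ":online";
    }
}
```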

Token Usage Tracking

Every AI call costs tokens, and until now OmniHai gave you no visibility into how many. That changes in 1.3 with the introduction of ChatUsage, a record that reports the input, output, and reasoning token counts for each call:

ChatOptions options = ChatOptions.newBuilder()
    .systemPrompt("You are a helpful assistant.")
    .build();

String response = service.chat("Explain the visitor pattern in one paragraph.", options);
ChatUsage usage = options.getLastUsage();
System.out.printf("Tokens in: %d, out: %d, total: %d%n", usage.inputTokens(), usage.outputTokens(), usage.totalTokens());

reasoningTokens() is also available for providers and models that report internal thinking separately (such as OpenAI o-series, Anthropic extended thinking models, and Grok reasoning models). It is always a subset of outputTokens(), so totalTokens() does not add it separately to avoid double-counting. A value of -1 on any field means the AI provider did not report that number.
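Putting those rules together, the accounting could be sketched as follows (a hypothetical record mirroring the described shape, not the actual OmniHai source):

```java
public record ChatUsage(long inputTokens, long outputTokens, long reasoningTokens) {

    /**
     * Input plus output; reasoningTokens is already a subset of outputTokens,
     * so it is intentionally not added again.
     */
    public long totalTokens() {
        return inputTokens + outputTokens;
    }
}
```

So a call reported as 100 input, 40 output, and 25 reasoning tokens yields 140 total tokens, not 165.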

ChatUsage is stored on the ChatOptions instance itself, which brings up an important subtlety. The three shared constants ChatOptions.DEFAULT, ChatOptions.CREATIVE, and ChatOptions.DETERMINISTIC are immutable, so they never record usage. To track usage with one of them, call copy() first to get a mutable instance with the same settings:

ChatOptions options = ChatOptions.DEFAULT.copy();
service.chat("Hello!", options);
ChatUsage usage = options.getLastUsage();

Any ChatOptions instance you build yourself via newBuilder() is always mutable and can track usage directly.

AIServiceWrapper

The new AIServiceWrapper is an abstract decorator base class that makes it straightforward to wrap any AIService implementation and intercept specific methods. All methods delegate to the wrapped service by default, so you only override what you actually care about.

A practical example is a provider fallback: try the primary service, and if it responds with a rate limit or is temporarily unavailable, then transparently retry on a backup provider instead of propagating the exception to the caller.

public class FallbackAIService extends AIServiceWrapper {

    private final AIService fallback;

    public FallbackAIService(AIService primary, AIService fallback) {
        super(primary);
        this.fallback = fallback;
    }

    @Override
    public CompletableFuture<String> chatAsync(ChatInput input, ChatOptions options) throws AIException {
        return super.chatAsync(input, options).exceptionallyCompose(completionException -> {
            var cause = completionException.getCause();

            if (cause instanceof AIRateLimitExceededException || cause instanceof AIServiceUnavailableException) {
                return fallback.chatAsync(input, options);
            }

            return CompletableFuture.failedFuture(completionException);
        });
    }
}

Wire it up by injecting two services and composing them:

@Inject @AI(apiKey = "#{keys.openai}")
private AIService gpt;

@Inject @AI(provider = ANTHROPIC, apiKey = "#{keys.anthropic}")
private AIService claude;

AIService resilient = new FallbackAIService(gpt, claude);
String response = resilient.chat("Explain the Jakarta EE security model.");

From the caller's perspective it is just an AIService. The fallback logic is entirely self-contained in the wrapper. You can add overrides for transcribe, generateImage, or any other method you want covered, or leave them delegating to the primary and let the caller handle those exceptions as normal.
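Stripped of the OmniHai types, the fallback mechanic in the wrapper above is plain CompletableFuture composition. A self-contained sketch of the same pattern:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class FallbackSketch {

    // Try the primary future; for recoverable failures switch to the backup,
    // otherwise propagate the original exception unchanged.
    static CompletableFuture<String> withFallback(CompletableFuture<String> primary,
            Supplier<CompletableFuture<String>> backup, Predicate<Throwable> recoverable) {
        return primary.exceptionallyCompose(exception -> {
            Throwable cause = exception.getCause() != null ? exception.getCause() : exception;
            return recoverable.test(cause) ? backup.get() : CompletableFuture.failedFuture(exception);
        });
    }

    public static void main(String[] args) {
        var rateLimited = CompletableFuture.<String>failedFuture(new IllegalStateException("rate limited"));
        String answer = withFallback(rateLimited,
                () -> CompletableFuture.completedFuture("from backup"),
                t -> t instanceof IllegalStateException).join();
        System.out.println(answer); // prints "from backup"
    }
}
```

Note that exceptionallyCompose() requires Java 12 or newer; on older JDKs the same effect takes a handle() plus thenCompose() combination.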

Give It a Try

As always, feedback and contributions are welcome on GitHub: if you run into problems, open an issue, and pull requests are appreciated too.
