Thursday, February 12, 2026

OmniHai grows ears

OmniHai 1.1 is here! This release brings audio transcription, smarter conversation memory, automatic file cleanup, gzip compression, and a pile of hardening across the board.

If you missed the earlier posts: OmniHai is a lightweight Java utility library that provides a unified API across multiple AI providers for Jakarta EE and MicroProfile applications. Check out the introduction, streaming & custom handlers, and 1.0 release posts for the full backstory.

Here are the Maven coordinates:

<dependency>
    <groupId>org.omnifaces</groupId>
    <artifactId>omnihai</artifactId>
    <version>1.1</version>
</dependency>

Audio Transcription

OmniHai now transcribes audio. Just pass in the bytes:

byte[] audio = Files.readAllBytes(Path.of("meeting.wav"));
String transcription = service.transcribe(audio);

That's it. Supported formats include WAV, MP3, MP4, FLAC, AAC, AIFF, OGG, and WebM. As with all other methods in AIService, an async variant, transcribeAsync(), is also available.
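To show how the async variant composes, here's a minimal sketch. The stand-in interface below is hypothetical and exists only to make the example self-contained; OmniHai's real AIService has more methods and may declare transcribeAsync() differently:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for illustration only; not OmniHai's actual interface.
interface AIService {
    String transcribe(byte[] audio);

    // Async variant: runs the blocking transcription on a worker thread.
    default CompletableFuture<String> transcribeAsync(byte[] audio) {
        return CompletableFuture.supplyAsync(() -> transcribe(audio));
    }
}

class TranscribeDemo {
    public static void main(String[] args) {
        // Fake provider standing in for a real transcription backend.
        AIService service = audio -> "transcribed " + audio.length + " bytes";
        service.transcribeAsync(new byte[42])
               .thenAccept(System.out::println) // handle the result when it arrives
               .join();
    }
}
```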

Providers with a native transcription API (OpenAI, Mistral, Hugging Face) use it directly for best accuracy. All other providers fall back to a chat-based approach where the audio is attached to a carefully crafted transcription prompt. This means transcription works everywhere, even on providers that don't have a dedicated endpoint for it. Integration tests are also caught up, and they all pass.

A new AIAudioHandler interface joins the existing AITextHandler and AIImageHandler for customization. The default handler produces a verbatim plain-text transcription, but you might want something different. For example: a medical or legal transcription handler that includes domain-specific terminology hints in the prompt, a handler that adds speaker labels and timestamps, or one that outputs SRT/VTT subtitle format instead of plain text. You can plug in your own via @AI(audioHandler = MyAudioHandler.class) or programmatically through AIStrategy. Speaking of which, AIStrategy now has convenient factory methods:

AIStrategy textOnly = AIStrategy.of(MyTextHandler.class);
AIStrategy fullyCustom = AIStrategy.of(MyTextHandler.class, MyImageHandler.class, MyAudioHandler.class);

Smarter Conversation Memory

As a reminder: OmniHai's conversation memory is fully caller-owned. There's no server-side session state, no database, no memory leaks, no lifecycle management to worry about. History lives in your ChatOptions instance, not in the service. You control it, you scope it, you discard it. This remains one of OmniHai's key design advantages.

In 1.0, memory kept everything. That's fine for short conversations, but eventually you'll hit the provider's context window. In 1.1, history is maintained as a sliding window with a default of 20 messages (10 conversational turns). Oldest messages are automatically evicted when the limit is exceeded:

ChatOptions options = ChatOptions.newBuilder()
    .withMemory(50) // Keep up to 50 messages (25 turns)
    .build();
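Conceptually, the sliding window behaves like a bounded deque that drops from the head once the cap is exceeded. Here's a minimal stdlib sketch of that eviction policy (illustrative only, not OmniHai's actual code):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative only: a bounded message history that evicts the oldest
// entries once the configured limit is exceeded.
class SlidingWindowMemory {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    SlidingWindowMemory(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst(); // oldest message slides out of the window
        }
    }

    List<String> messages() {
        return List.copyOf(messages);
    }
}
```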

File attachments are now tracked in history too. When you upload files in a memory-enabled chat, their references are preserved across turns so the AI can continue referencing previously uploaded documents:

ChatOptions options = ChatOptions.newBuilder()
    .withMemory()
    .build();

ChatInput input = ChatInput.newBuilder()
    .message("Analyze this PDF")
    .attach(Files.readAllBytes(Path.of("report.pdf")))
    .build();

String analysis = service.chat(input, options);
String followUp = service.chat("What's on page 2?", options); // AI still has access to the PDF

When messages slide out of the window, their associated file references are evicted as well. File tracking in history requires the AI provider to support a files API, which is currently the case for OpenAI(-compatible) providers, Anthropic, and Google AI.

Automatic File Cleanup

This one's a behind-the-scenes improvement that you don't have to think about, and that's the point ;) When you upload files via the chat API, they end up on the provider's servers. Some providers clean these up automatically after a day or two, or support expiration metadata, but others support neither expiration nor automatic cleanup. OmniHai now handles this: uploaded files are automatically cleaned up in the background after 2 days via a fire-and-forget task. Only files uploaded by OmniHai are touched. No configuration needed.
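The fire-and-forget pattern itself is plain JDK territory. A hedged sketch of how such a delayed background cleanup can be scheduled (illustrative only, not OmniHai's internals; the deleteRemoteFile callback is a hypothetical placeholder for a provider's delete call):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Illustrative only: schedule a one-shot deletion of an uploaded file
// after a fixed delay, on a daemon thread so it never blocks shutdown.
class FileCleanup {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
        Thread thread = new Thread(runnable, "file-cleanup");
        thread.setDaemon(true);
        return thread;
    });

    void scheduleDeletion(String fileId, Consumer<String> deleteRemoteFile, long delay, TimeUnit unit) {
        scheduler.schedule(() -> deleteRemoteFile.accept(fileId), delay, unit); // fire and forget
    }
}
```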

Gzip Compression

All HTTP responses from AI providers are now transparently decompressed when gzip-encoded. OmniHai sends Accept-Encoding: gzip on every request and handles the decompression automatically. This reduces bandwidth usage, which is particularly nice for those verbose JSON responses that AI providers love to return.
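The decompression step itself is standard JDK fare; roughly like this sketch (not OmniHai's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

// Illustrative only: inflate a response body when the server answered
// with Content-Encoding: gzip, otherwise pass it through unchanged.
class GzipSupport {
    static byte[] decompressIfGzip(byte[] body, String contentEncoding) throws IOException {
        if (!"gzip".equalsIgnoreCase(contentEncoding)) {
            return body;
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            return in.readAllBytes();
        }
    }
}
```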

Under the Hood

Beyond the headline features, 1.1 includes a bunch of improvements:

  • ChatOptions#withSystemPrompt() creates a copy of existing options with a different system prompt, useful for reusing a base configuration across different use cases.
  • Improved exception messages now include the caller's stack trace, making it easier to pinpoint where things went wrong in async flows.
  • Hardened file upload handling across providers, especially for Mistral compatibility.
  • The OpenRouterAITextHandler was dropped entirely as improved file upload handling in the base OpenAITextHandler made it redundant.
  • Updated the default OpenAI model version.
  • Various javadoc fixes and additional unit/integration tests.

Give It a Try

As always, feedback and contributions are welcome on GitHub. If you run into a problem, open an issue; pull requests are appreciated too.
