Thursday, April 23, 2026

OmniFaces 5.3 released!

OmniFaces 5.3 has been released!

This is a relatively small feature release on top of 5.2. One new component has been added, and the rest of the changes are under the hood for better long-term maintenance: automated code formatting, a simpler and faster JavaScript build, and reorganized TypeScript sources. You can find the complete list of additions, changes and fixes at What's new in OmniFaces 5.3? in the showcase.

New: <o:lazyPanel>

Ever had a page with an expensive region below the fold which the user may never scroll to, but which gets built on every page load anyway? The traditional workaround is to wire up an IntersectionObserver in custom JavaScript and fire an ajax request yourself, which is a lot of boilerplate for something so common.

The new <o:lazyPanel> defers rendering of its children until the panel has scrolled into view:

<o:lazyPanel>
    <h:dataTable value="#{bean.expensiveList}" var="row">
        ...
    </h:dataTable>
</o:lazyPanel>

On initial render, the component writes a wrapper element with an optional placeholder and schedules a viewport intersection listener on it via OmniFaces.js, which uses IntersectionObserver when available and falls back to scroll/resize/orientationchange listeners otherwise. As soon as the wrapper intersects the viewport, a single faces.ajax.request targeting its own client id is fired. The component then flips its loaded flag, optionally invokes a listener bean method with a LazyPanelEvent, and renders its children in place of the placeholder.

The loaded attribute is a server-side escape hatch: when true, the children are rendered immediately without any client-side observer. This is useful for print views, SEO crawlers, or tests.

<o:lazyPanel loaded="#{bean.printPreview}">
    ...
</o:lazyPanel>

Nested <f:param> or <o:param> children are sent along with the lazy panel ajax request, so that a single listener can serve multiple panels by distinguishing on an entity id, filter key, or page number:

<o:lazyPanel listener="#{bean.preload}">
    <f:param name="productId" value="#{product.id}" />
    ...
</o:lazyPanel>

The listener can read them via Faces#getRequestParameter(). Parameter values are evaluated at initial render (snapshot semantics), consistent with UIParameter usage elsewhere in Faces.
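Putting the two together, a listener bean serving the snippet above could look roughly like this. This is a sketch: the bean name, the "productId" parameter, and the loading logic are the placeholders from the example, not a prescribed API beyond OmniFaces' own Faces#getRequestParameter() utility.

```java
import jakarta.enterprise.context.RequestScoped;
import jakarta.inject.Named;
import org.omnifaces.util.Faces;

@Named
@RequestScoped
public class Bean {

    public void preload() {
        // Reads the <f:param name="productId"> sent along with the lazy ajax request.
        String productId = Faces.getRequestParameter("productId");
        // ... preload the expensive data for this specific panel ...
    }
}
```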

The closest equivalent in PrimeFaces is <p:outputPanel deferred="true" deferredMode="visible">, which also loads its contents once the panel is scrolled into view. Under the hood, however, it uses jQuery scroll handlers on the window combined with $.offset() and window height math, which fire on every scroll event and scale poorly when you have multiple deferred panels on the same page. <o:lazyPanel> uses IntersectionObserver, which is browser-native, more efficient, and only observes the panel itself; it falls back to scroll/resize/orientationchange listeners only when IntersectionObserver is unavailable. <o:lazyPanel> also has no jQuery or PrimeFaces runtime dependency; the request is just a standard faces.ajax.request, so it works in vanilla Faces applications without PrimeFaces. On top of that, <o:lazyPanel> supports <f:param>/<o:param> for passing context to the listener, which <p:outputPanel> does not natively offer.

The homegrown alternative is to wire up an IntersectionObserver yourself which then calls a <h:commandScript> or <p:remoteCommand> from the intersection callback, and to manually swap placeholder markup on response. This works, but it's imperative JavaScript scattered across the view, and you'll have to repeat it for every lazy region. <o:lazyPanel> is the declarative equivalent: one tag, no JavaScript, and the placeholder and listener are just regular Faces markup.

Under the hood

Quite a few things have been cleaned up in the build and source tree. These have no impact on runtime behavior, but they do make the project easier to maintain and contribute to:

  • Automated code formatting via Spotless and Stylistic; all Java, XML, XHTML and TypeScript sources are now formatted consistently on every build. This avoids inconsistently formatted source code coming in with pull requests.
  • The TypeScript sources have been reorganized into their own src/main/ts subfolder. This keeps the contents of src/main/webapp clean.
  • The JavaScript build has been improved: browserify and closure-compiler-maven-plugin have been replaced by esbuild for performance and simplicity.
  • Vdlgen now also runs during Eclipse incremental builds, so workspace resolution into sandbox projects continues to work.

Fixes

MultiViews welcome file resolution failed on Windows-based servers due to wrong parent path handling. This has been fixed (#949).

<o:validateBean> did not collect nested properties of @Valid-annotated beans, so validation could silently skip nested constraints. This has been fixed (#951).

@ViewScoped unload threw a NullPointerException during pending view state removal in the specific combination of Spring WebFlow with MyFaces. This has been fixed (#952).

Installation

Non-Maven users: download OmniFaces 5.3.2 JAR and drop it in /WEB-INF/lib the usual way, replacing the older version if any.

Maven users: add below entry to pom.xml, replacing the older version if any.

<dependency>
    <groupId>org.omnifaces</groupId>
    <artifactId>omnifaces</artifactId>
    <version>5.3.2</version>
</dependency>

The 5.3.2 fixes have also been backported to 4.x and 3.x, so OmniFaces 4.7.8 and OmniFaces 3.14.19 have been released as well.

Monday, April 20, 2026

OmniHai counts the cost

OmniHai 1.4 is out! After 1.1 gave the library ears, 1.2 a voice, and 1.3 the ability to step outside and browse the web, 1.4 teaches it to count. Token usage becomes actual money, runaway spend can be capped, reasoning effort is now dial-able across providers, and ChatOptions knows how to serialize itself to portable JSON.

<dependency>
    <groupId>org.omnifaces</groupId>
    <artifactId>omnihai</artifactId>
    <version>1.4</version>
</dependency>

Cost Calculation

1.3 introduced ChatUsage so you could see how many tokens a call consumed. Useful, but tokens are not what the invoice at the end of the month is denominated in. 1.4 closes that gap with ChatPricing and ChatCost.

Attach a pricing to your ChatOptions, make a call, read back the cost:

ChatPricing pricing = new ChatPricing(
    new BigDecimal("3.00"),       // input price per 1M tokens
    new BigDecimal("0.30"),       // cached-input price per 1M tokens (optional)
    new BigDecimal("15.00"),      // output price per 1M tokens (includes reasoning)
    Currency.getInstance("USD")); // optional; purely for presentation.

ChatOptions options = ChatOptions.newBuilder()
    .pricing(pricing)
    .build();

String response = service.chat("Explain quantum computing", options);

ChatCost cost = options.getLastCost();
System.out.println("Input cost:        " + cost.inputCost());
System.out.println("Cached input cost: " + cost.cachedInputCost());
System.out.println("Output cost:       " + cost.outputCost());
System.out.println("Total cost:        " + cost.totalCost() + " " + cost.currency());

Prices are expressed per one million tokens to match how providers publish their rate sheets. There are deliberately no built-in rate presets; provider rates drift and differ per model tier, so you look up the current numbers for your chosen model and pass them in. The optional currency is passed through to ChatCost for display; it does not affect any arithmetic, so use whatever unit you supplied the prices in.

The cachedInputTokenPrice is optional. When null, cached tokens are billed at the regular input rate. Set it explicitly to reflect the provider's cache-read discount (Anthropic charges roughly 10% of the input rate for cache reads, OpenAI and Google roughly 25%). Reasoning tokens are always billed at the output rate, consistent with how providers invoice them.
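Given those rules, the arithmetic presumably boils down to per-million division with the cached subset split out of the input tokens. A self-contained sketch, using the rates from the ChatPricing example above (the class and method names here are illustrative, not the OmniHai API):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CostSketch {

    static final BigDecimal MILLION = BigDecimal.valueOf(1_000_000);

    // Prices are published per 1M tokens, so cost = tokens / 1M * price.
    static BigDecimal cost(long tokens, BigDecimal pricePerMillion) {
        return pricePerMillion.multiply(BigDecimal.valueOf(tokens))
            .divide(MILLION, 6, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        long inputTokens = 100_000, cachedInputTokens = 40_000, outputTokens = 2_000;
        BigDecimal inputPrice = new BigDecimal("3.00");   // per 1M tokens
        BigDecimal cachedPrice = new BigDecimal("0.30");  // per 1M tokens
        BigDecimal outputPrice = new BigDecimal("15.00"); // per 1M tokens

        // Cached tokens are a subset of input tokens: bill the non-cached
        // remainder at the input rate and the cached part at the cached rate.
        BigDecimal inputCost = cost(inputTokens - cachedInputTokens, inputPrice);
        BigDecimal cachedCost = cost(cachedInputTokens, cachedPrice);
        BigDecimal outputCost = cost(outputTokens, outputPrice);

        System.out.println(inputCost.add(cachedCost).add(outputCost)); // 0.222000
    }
}
```

With 60k non-cached input tokens at $3.00/1M, 40k cached tokens at $0.30/1M, and 2k output tokens at $15.00/1M, that works out to 0.18 + 0.012 + 0.03, so the cache read saved you $0.108 on this single call.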

If the full positional constructor feels too ceremonial, there are two factory methods:

ChatPricing simple = ChatPricing.of(new BigDecimal("3.00"), new BigDecimal("15.00"));
ChatPricing withCache = ChatPricing.of(new BigDecimal("3.00"), new BigDecimal("0.30"), new BigDecimal("15.00"));

And if you have a ChatUsage in hand and want the cost ad-hoc without configuring options at all:

ChatCost cost = usage.calculateCost(pricing);

One caveat worth mentioning up front: this is a simplified three-tier scheme (base input, cached input, output) that covers the common case. Provider-specific billing axes like Anthropic's 5-minute and 1-hour cache-write premiums are not modeled and may cause under-counting for workloads that rely heavily on explicit prompt caching. For strict accuracy, reconcile against the provider's own billing API. For "roughly what did that call cost me" it is good enough.

Budget Cap

Cost visibility is nice. Cost protection is nicer. 1.4 also lets you attach a cumulative-cost ceiling alongside the pricing so runaway spend on a given ChatOptions instance gets stopped rather than logged after the fact:

ChatOptions options = ChatOptions.newBuilder()
    .pricing(pricing, new BigDecimal("1.00")) // hard stop at $1.00
    .build();

while (hasMoreWork()) {
    try {
        service.chat(next(), options);
    } catch (AIBudgetExceededException e) {
        log.warn("Spent {} of {} {} — stopping", e.getTotalCost(), e.getMaxTotalCost(), e.getCurrency());
        break;
    }
}

The cap is checked before each call using the accumulated ChatOptions.getTotalCost(). It is a soft ceiling: the call that pushes the running total to or over the cap still completes and is billed; the next call is refused with AIBudgetExceededException. That keeps the behavior predictable; the alternative of estimating an upcoming call's cost before dispatching it would require knowing the output token count in advance, which of course you don't.
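The soft-ceiling semantics are easy to model in isolation. A self-contained sketch (the names and the IllegalStateException stand-in are illustrative; OmniHai throws AIBudgetExceededException from within the service call):

```java
import java.math.BigDecimal;

public class BudgetSketch {

    private BigDecimal totalCost = BigDecimal.ZERO;
    private final BigDecimal maxTotalCost;

    BudgetSketch(BigDecimal maxTotalCost) {
        this.maxTotalCost = maxTotalCost;
    }

    // The check runs BEFORE the call: the call that reaches the cap still
    // runs and is billed, only the next one is refused.
    void chat(BigDecimal costOfThisCall) {
        if (totalCost.compareTo(maxTotalCost) >= 0) {
            throw new IllegalStateException("budget exceeded: " + totalCost);
        }
        totalCost = totalCost.add(costOfThisCall); // call completes, cost accumulates
    }

    BigDecimal totalCost() {
        return totalCost;
    }
}
```

With a cap of 1.00 and calls costing 0.40 each, three calls complete (the third starts at a running total of 0.80, still under the cap) and the total ends at 1.20; only the fourth call is refused. So budget a small overshoot margin of roughly one call's worth.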

After you have caught the exception, you can call options.resetBudget() to zero the counter and start a fresh window on the same instance, or switch to a different ChatOptions instance, or even fail over to a different AIService (e.g. a cheaper model) to continue processing.

Cached Input Tokens

While we are on the subject of prompt caches, ChatUsage has gained a fourth field: cachedInputTokens().

ChatUsage usage = options.getLastUsage();
System.out.println("Input tokens:         " + usage.inputTokens());
System.out.println("Cached input tokens:  " + usage.cachedInputTokens()); // subset of inputTokens
System.out.println("Output tokens:        " + usage.outputTokens());
System.out.println("Reasoning tokens:     " + usage.reasoningTokens());   // subset of outputTokens
System.out.println("Total tokens:         " + usage.totalTokens());

It reports the subset of input tokens that was served from the provider's prompt cache. This is the number that drives the cheaper cachedInputCost on ChatCost, and it is useful on its own too; a low cache-hit ratio on a workload that should mostly be reused content is a good signal that your system prompts are drifting or the provider's cache TTL has elapsed. As with the other fields, a value of -1 means the provider did not report it.
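That cache-hit ratio is just the cached share of the input tokens, with a guard for the "not reported" sentinel. A minimal sketch (the method name is illustrative, not an OmniHai API):

```java
public class CacheRatioSketch {

    // -1 means the provider did not report cached tokens.
    static double cacheHitRatio(long inputTokens, long cachedInputTokens) {
        if (inputTokens <= 0 || cachedInputTokens < 0) {
            return Double.NaN; // nothing to measure, or not reported
        }
        return (double) cachedInputTokens / inputTokens;
    }
}
```

A workload that replays a large shared system prompt should hover near 1.0; a ratio near 0.0 on such a workload suggests drifting prompts or an elapsed cache TTL.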

Reasoning Effort

Modern frontier models (GPT-5, Claude extended thinking, Gemini thinking, Grok reasoning) all let you tune how many tokens they should spend on internal reasoning before answering. The knobs are called different things across providers; in OmniHai they live behind a single enum:

ChatOptions options = ChatOptions.newBuilder()
    .reasoningEffort(ReasoningEffort.HIGH)
    .build();

String answer = service.chat("Prove the Pythagorean theorem.", options);

The available levels are AUTO (the default, defers to the provider's own default), NONE (actively disables reasoning where supported, for minimum cost and latency), LOW (~20% of budget), MEDIUM (~50% of budget), HIGH (~80% of budget), and XHIGH (~95% of budget). Levels that a given provider does not support map to the closest equivalent, so you can leave the same ChatOptions in place while switching the underlying provider.
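To picture what those percentages mean in tokens, here is a hypothetical mapping of the fixed levels onto a provider-side reasoning-token budget. This only illustrates the documented fractions; it is not OmniHai's actual enum, and AUTO is omitted because it defers to the provider's default rather than a fraction:

```java
// Hypothetical illustration of the documented budget fractions.
enum EffortSketch {
    NONE(0.00), LOW(0.20), MEDIUM(0.50), HIGH(0.80), XHIGH(0.95);

    private final double fraction;

    EffortSketch(double fraction) {
        this.fraction = fraction;
    }

    // Reasoning tokens this level would allow out of a provider's maximum budget.
    long budgetTokens(long maxBudgetTokens) {
        return Math.round(maxBudgetTokens * fraction);
    }
}
```

For a provider whose maximum reasoning budget is, say, 10,000 tokens, HIGH would allow roughly 8,000 of them, which is also why a correspondingly higher maxTokens matters at the upper levels.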

Higher levels typically improve answer quality on hard problems (math, multi-step planning, non-trivial code) at the cost of more tokens and latency. On trivial prompts they just spend money without any measurable upside, so do not set HIGH or XHIGH as the default for all your calls :) Keep in mind that a higher effort may also require a correspondingly higher maxTokens to avoid truncated responses.

Portable JSON for ChatOptions

ChatOptions has been Serializable since day one, which is enough to stash it in an HTTP session. For portable storage, REST payloads, JSON columns, audit logs, or cross-service transport, Java serialization is not what you want. 1.4 adds an explicit JSON form:

String json = options.toJson();
ChatOptions restored = ChatOptions.fromJson(json);

All user-facing settings are included: system prompt, JSON schema, temperature, maxTokens, reasoning effort, topP, web search location, pricing, maxTotalCost, maxHistory, and the full conversation history (including any recorded uploaded file references). Null or unset fields are omitted for a compact payload. Runtime state (the last usage and the cumulative total cost) is deliberately not serialized; a restored instance starts with a fresh zero total cost counter.

Round-tripping a shared default constant (DEFAULT, CREATIVE, DETERMINISTIC) yields a mutable copy, equivalent to calling copy(). That way you never accidentally end up with a restored instance that still rejects mutations because it was derived from an immutable template.

Default Models

Under the hood, default model identifiers per provider have been refreshed to match the current state of technology. The exact identifiers are documented on the GitHub README. If you were relying on the provider default, you get the newer model automatically on upgrade; if you were pinning a specific model, nothing changes for you.

Getting 1.4

Non-Maven users: download the OmniHai 1.4 JAR and drop it in /WEB-INF/lib the usual way, replacing the older version if any.

Maven users:

<dependency>
    <groupId>org.omnifaces</groupId>
    <artifactId>omnihai</artifactId>
    <version>1.4</version>
</dependency>

Give It a Try

As always, feedback and contributions are welcome on GitHub: report any problems via the issue tracker, and pull requests are appreciated too.