Showing posts with label Performance. Show all posts

Wednesday, October 14, 2015

Logging duration of createView, buildView and renderView

Sometimes you'd like to measure how long JSF is taking to create, build and render the view. You can achieve this with a custom ViewDeclarationLanguage wrapper like below:

package com.example;

import java.io.IOException;
import java.util.logging.Logger;

import javax.faces.component.UIViewRoot;
import javax.faces.context.FacesContext;
import javax.faces.view.ViewDeclarationLanguage;
import javax.faces.view.ViewDeclarationLanguageWrapper;

public class VdlLogger extends ViewDeclarationLanguageWrapper {

    private static final Logger logger = Logger.getLogger(VdlLogger.class.getName());

    private ViewDeclarationLanguage wrapped;

    public VdlLogger(ViewDeclarationLanguage wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public UIViewRoot createView(FacesContext context, String viewId) {
        long start = System.nanoTime();
        UIViewRoot view = super.createView(context, viewId);
        long end = System.nanoTime();
        logger.info(String.format("create %s: %.6fms", viewId, (end - start) / 1e6));
        return view;
    }

    @Override
    public void buildView(FacesContext context, UIViewRoot view) throws IOException {
        long start = System.nanoTime();
        super.buildView(context, view);
        long end = System.nanoTime();
        logger.info(String.format("build %s: %.6fms", view.getViewId(), (end - start) / 1e6));
    }

    @Override
    public void renderView(FacesContext context, UIViewRoot view) throws IOException {
        long start = System.nanoTime();
        super.renderView(context, view);
        long end = System.nanoTime();
        logger.info(String.format("render %s: %.6fms", view.getViewId(), (end - start) / 1e6));
    }

    @Override
    public ViewDeclarationLanguage getWrapped() {
        return wrapped;
    }

}

In order to get it to run, create the below factory:

package com.example;

import javax.faces.view.ViewDeclarationLanguage;
import javax.faces.view.ViewDeclarationLanguageFactory;

public class VdlLoggerFactory extends ViewDeclarationLanguageFactory {

    private ViewDeclarationLanguageFactory wrapped;

    public VdlLoggerFactory(ViewDeclarationLanguageFactory wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public ViewDeclarationLanguage getViewDeclarationLanguage(String viewId) {
        return new VdlLogger(wrapped.getViewDeclarationLanguage(viewId));
    }

    @Override
    public ViewDeclarationLanguageFactory getWrapped() {
        return wrapped;
    }

}

And register it as below in faces-config.xml:

<factory>
    <view-declaration-language-factory>com.example.VdlLoggerFactory</view-declaration-language-factory>
</factory>

The createView() is the step of creating the concrete UIViewRoot instance based on the <f:view> and <f:metadata> tags present in the view. When using Facelets (XHTML) as view, all associated XHTML files are parsed ("compiled") by the SAX parser during this step and cached for the period defined in the context parameter javax.faces.FACELETS_REFRESH_PERIOD. So this step may be relatively slow one time and blazing fast the next. Use a value of -1 to cache them indefinitely. When using Mojarra 2.2.11 or newer with the context parameter javax.faces.PROJECT_STAGE left at its default value of Production, the refresh period already defaults to -1.

The buildView() is the step of populating the JSF component tree (the getChildren() of UIViewRoot) based on the view composition. During this step, all taghandlers (JSTL and friends) are executed and all EL expressions in those taghandlers and in the components' id and binding attributes are evaluated (for details, see also JSTL in JSF2 Facelets... makes sense?). So if backing beans are constructed for the first time during view build time and run some expensive business logic in e.g. @PostConstruct, then this step may be time consuming.

The renderView() is the step of generating the HTML output based on the JSF component tree and the model, starting with UIViewRoot#encodeAll(). So if backing beans are constructed for the first time during view render time and run some expensive business logic in e.g. @PostConstruct, then this step may be time consuming.
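The three overrides in the wrapper repeat the same measure-and-log pattern. As a minimal sketch, the pattern can be isolated in a plain Java helper that runs outside any container. The `Timings` class and its method names are hypothetical and not part of JSF:

```java
import java.util.Locale;
import java.util.function.Supplier;

// Hypothetical helper, not part of JSF: isolates the nanoTime-based timing pattern.
final class Timings {

    // Holds the result of the timed call together with the elapsed nanoseconds.
    static final class Timed<T> {
        final T value;
        final long nanos;

        Timed(T value, long nanos) {
            this.value = value;
            this.nanos = nanos;
        }
    }

    // Runs the given step and measures its duration with System.nanoTime().
    static <T> Timed<T> time(Supplier<T> step) {
        long start = System.nanoTime();
        T value = step.get();
        return new Timed<>(value, System.nanoTime() - start);
    }

    // Formats like the wrapper's log lines; Locale.ROOT keeps the decimal separator a dot.
    static String format(String phase, String viewId, long nanos) {
        return String.format(Locale.ROOT, "%s %s: %.6fms", phase, viewId, nanos / 1e6);
    }

    public static void main(String[] args) {
        Timed<String> timed = time(() -> "/test.xhtml");
        System.out.println(format("create", timed.value, timed.nanos));
    }
}
```

System.nanoTime() is used rather than System.currentTimeMillis() because it is meant for measuring elapsed time and is not affected by wall-clock adjustments.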

If a JSF page loads slowly in the browser even though the above measurements are in the range of milliseconds, then chances are that the generated HTML DOM tree is simply big/bloated, and/or that the web browser is incapable of dealing with big HTML DOM trees, and/or that some JavaScript is inefficient on big HTML DOM trees. You'd then best profile the performance on the client side instead. For example, Internet Explorer in particular is slow with tables in big HTML DOM trees, and jQuery is slow with comma-separated selectors on big HTML DOM trees. Solutions would then be introducing filtering/pagination, and splitting into multiple selectors and passing each through a (callback) function, respectively.

Monday, June 2, 2014

OmniFaces 1.8.3 released!

OmniFaces 1.8.3 has finally been released!

Also this release had some unscheduled delay for various reasons. Great programmers are also just humans with a "life" next to all the development work. I personally had after all a lot more time needed to acclimatize myself to the Netherlands after having lived in Curaçao for almost 6 years (still don't really feel home here outside the working hours, still want to go back once the kids grow out the house). Arjan had among others also some unforeseen issues with his new home.

As usual, in the What's new page of the showcase site you can find an overview of everything that's been added/changed/fixed for 1.8. The three most useful additions are the <o:deferredScript>, <o:massAttribute> and @Eager.

Installation

Non-Maven users: download OmniFaces 1.8.3 JAR and drop it in /WEB-INF/lib the usual way, replacing the older version if any.

Maven users: use <version>1.8.3</version>.

<dependency>
    <groupId>org.omnifaces</groupId>
    <artifactId>omnifaces</artifactId>
    <version>1.8.3</version>
</dependency>

Defer loading and parsing of JavaScript files

If you've ever analyzed the performance of your website using a tool like Google PageSpeed, then you'll probably recognize the recommendation to defer loading and parsing of JavaScript files. Basically, the recommendation is to load JavaScript files only when the browser has finished rendering the page. This is to be achieved by dynamically creating a <script> element via document.createElement() during window.onload. Please note that this is not the same as just moving the scripts to the bottom of the page using <h:outputScript target="body">! That would speed up downloading of other resources, but still block the rendering of the HTML in most browsers (read: everything except IE).

OmniFaces now comes with a <o:deferredScript> component for this very purpose which works just like <h:outputScript> with a library and name attribute.

<h:head>
    ...
    <o:deferredScript library="libraryname" name="resourcename.js" />
</h:head>

You can also use it on, for example, PrimeFaces scripts, but some additional work needs to be done. For details, refer to this Stack Overflow question and answer: Defer loading and parsing of PrimeFaces JavaScript files. This approach has been in production at zeef.com since March 20 (thus more than 2 months already) and has decreased the time to "DOM content loaded" from ~3s to ~1s on a modern client machine.

Set a common attribute on multiple components

The new <o:massAttribute> taghandler allows you to set a common attribute on multiple components. So, instead of for example:

<h:inputText ... disabled="#{someBean.disabled}" />
<h:inputText ... disabled="#{someBean.disabled}" />
<h:inputText ... disabled="#{someBean.disabled}" />
<h:inputText ... disabled="#{someBean.disabled}" />
<h:inputText ... disabled="#{someBean.disabled}" />

You can just do:

<o:massAttribute name="disabled" value="#{someBean.disabled}">
    <h:inputText ... />
    <h:inputText ... />
    <h:inputText ... />
    <h:inputText ... />
    <h:inputText ... />
</o:massAttribute>

The advantage speaks for itself.

Eagerly instantiate a CDI managed bean

When using the standard JSF managed bean facility via @ManagedBean, which has been semi-officially deprecated since JSF 2.2 (not documented as such, but the JSF team is clearly pushing in that direction given the total absence of new features around the JSF managed bean facility; instead they all go into the CDI managed bean facility), it was possible to declare an application scoped JSF managed bean to be eagerly instantiated during application startup like so:

import javax.faces.bean.ApplicationScoped;
import javax.faces.bean.ManagedBean;

@ManagedBean(eager=true)
@ApplicationScoped
public class Bean {
    // ...
}

However, this isn't possible with standard CDI, not even with the one as available in Java EE 7. So OmniFaces has added the @Eager and @Startup annotations for the very purpose. The @Startup is just a stereotype for @Eager @ApplicationScoped.

So, both beans below are equivalent:

import javax.enterprise.context.ApplicationScoped;
import javax.inject.Named;
import org.omnifaces.cdi.Eager;

@Named
@Eager
@ApplicationScoped
public class Bean {
    // ...
}

import javax.inject.Named;
import org.omnifaces.cdi.Startup;

@Named
@Startup
public class Bean {
    // ...
}

An additional bonus of OmniFaces @Eager is that it not only works on application scoped CDI managed beans, but also on request and session scoped CDI managed beans and on beans annotated with OmniFaces @ViewScoped (thus not the JSF 2.2 one yet, that will come in the upcoming OmniFaces 2.0 which will target JSF 2.2, OmniFaces 1.x is namely JSF 2.0 targeted).

Having @Eager on a request scoped bean may look somewhat strange, but it makes an interesting use case possible: eagerly and asynchronously fetching some data from a DB at the very beginning of the request, long before the FacesServlet is invoked. It even runs before the servlet filters are hit (it's initiated via a ServletRequestListener). Depending on the server hardware used, the available server resources, and all code running between the invocation of the first servlet filter and entering the JSF render response, this may give you a time window of 10ms ~ 500ms (or perhaps more if you've some inefficient code in the pipeline ;) ) to fetch some data from the DB in a different thread parallel with the HTTP request, and thus a speed improvement equivalent to the time the DB needs to fetch the data. Below is an example of what such an approach can look like:

The asynchronous service (this silly example fetches the entire table; just do whatever DB or any other relatively long-lasting service task you want to do as long as the method is annotated with @Asynchronous and you return an AsyncResult as Future; the container will all by itself worry about managing the threads):

package com.example;

import java.util.List;
import java.util.concurrent.Future;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class MyEntityService {

    @PersistenceContext
    private EntityManager em;

    @Asynchronous
    public Future<List<MyEntity>> asyncList() {
        List<MyEntity> entities = em
            .createQuery("SELECT e FROM MyEntity e", MyEntity.class)
            .getResultList();
        return new AsyncResult<>(entities);
    }

}

The @Eager request scoped bean (note the requestURI attribute, this must exactly match the context-relative request URI without any path fragments and query strings, this example assumes a /test.xhtml page (with a FacesServlet mapping of *.xhtml); wildcards like * are not supported yet, this may come in the future if there's demand)

package com.example;

import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import javax.annotation.PostConstruct;
import javax.enterprise.context.RequestScoped;
import javax.faces.FacesException;
import javax.inject.Inject;
import javax.inject.Named;
import org.omnifaces.cdi.Eager;

@Named
@Eager(requestURI="/test.xhtml")
@RequestScoped
public class MyEagerRequestBean {

    private Future<List<MyEntity>> entities;

    @Inject
    private MyEntityService service;

    @PostConstruct
    public void init() {
        entities = service.asyncList();
    }

    public List<MyEntity> getEntities() {
        try {
            return entities.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new FacesException(e);
        } catch (ExecutionException e) {
            throw new FacesException(e);
        }
    }

}

This way, when you request /test.xhtml with something like this:

<h:dataTable value="#{myEagerRequestBean.entities}" var="entity">
    <h:column>#{entity.property}</h:column>
</h:dataTable>

... then the above bean will be constructed and initialized far before the FacesServlet is invoked. Note that this thus also means that the FacesContext is not available inside the @PostConstruct! From that point on, both JSF and JPA will do their jobs simultaneously in separate threads until JSF calls the getter for the first time. The JSF thread (the HTTP request thread) will then block until JPA has returned the result, or perhaps it's already returned at that moment and then JSF can just advance immediately without waiting for JPA.
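Outside a Java EE container, the same "start early, block late" idea can be tried with a plain ExecutorService. This is only a sketch under assumptions: `EagerFetchDemo`, `slowQuery()` and the 50 ms delay are made up for illustration, and the executor merely stands in for the container-managed threads behind @Asynchronous.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; plain Java SE stand-in for the @Eager + @Asynchronous combination.
final class EagerFetchDemo {

    // Stand-in for a slow DB query (made up; sleeps 50 ms).
    static List<String> slowQuery() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Arrays.asList("one", "two", "three");
    }

    // Starts the query first, leaves room for other work, and blocks only when the result is needed.
    static List<String> handleRequest() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<List<String>> query = EagerFetchDemo::slowQuery;
            Future<List<String>> entities = executor.submit(query); // Comparable to init().
            // ... servlet filters, FacesServlet and earlier JSF phases would run here ...
            return entities.get(); // Comparable to the getter: waits only if still running.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest());
    }
}
```

The gain is the overlap: whatever the request thread does between submit() and get() no longer adds to the time spent waiting for the query.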

An overview of all additions/changes/bugfixes in OmniFaces 1.8

Taken over from the What's new? page on showcase:

Added in OmniFaces 1.8

  • WebXml#getFormErrorPage() to get web.xml configured location of the FORM authentication error page
  • <o:deferredScript> which is capable of deferring JavaScript resources to window.onload
  • Faces#addResponseCookie() got 2 new overloaded methods whereby domain and path defaults to current request domain and current path
  • Components#isRendered() which also checks the rendered attribute of all parents of the given component
  • <o:massAttribute> which sets the given attribute on all nested components
  • FacesMessageExceptionHandler which sets any caught exception as a global FATAL faces message
  • <o:cache> has new disabled attribute to temporarily disable the cache and pass-through children directly
  • @Eager annotation to eagerly instantiate request-, view-, session- and application scoped beans
  • <o:viewParam> skips converter for null model values so that query string doesn't get polluted with an empty string
  • Small amount of utility methods and classes, e.g. method to check CDI annotations recursively in stereotypes, shortcut method to obtain VDL, etc

Changed in OmniFaces 1.8

  • CombinedResourceHandler now also recognizes and combines <o:deferredScript>
  • UnmappedResourceHandler now also recognizes PrimeFaces dynamic resources using StreamedContent

Fixed in OmniFaces 1.8

  • Assume RuntimeException in BeanManager#init() as CDI not available (fixes deployment error on WAS 8.5 without CDI enabled)
  • Use "-" (hyphen) instead of null as default option value to workaround noSelectionOption fail with GenericEnumConverter
  • <o:param> shouldn't silently convert the value to String (fixes e.g. java.util.Date formatting fail in <o:outputFormat>)
  • Fixed javax.enterprise.inject.AmbiguousResolutionException in subclassed @FacesConverter and @FacesValidator
  • <o:messages> failed to find the for component when it's not in the same parent
  • <o:conditionalComment> shouldn't XML-escape the if value, enabling usage of & character
  • UnmappedResourceHandler broke state saving when partial state saving is turned off
  • CombinedResourceHandler didn't properly deal with MyFaces-managed resources

Maven download stats

Here are the Maven download stats:

  • January 2014: 3537
  • February 2014: 3580
  • March 2014: 3892
  • April 2014: 3572
  • May 2014: 3971

Below is the version pie of May 2014:

Last but not least

In case you missed it: the OmniFaces repo, wiki and issue tracking (basically: everything) have moved from Google Code to GitHub, along with a "brand new" homepage in GitHub style at omnifaces.org. The downloads will from now on just point directly to Maven via links at the homepage.

Wednesday, July 31, 2013

Serving multiple images from database as a CSS sprite

Introduction

In the first public beta version of ZEEF, which was somewhat thrown together (first get the minimum working using standard techniques, then review, refactor and improve it), all favicons were served individually. Although they were set to be aggressively cached (1 year, whereby a reload is forced when necessary by the timestamp-in-query-string trick with the last-modified timestamp of the link), on an empty cache this resulted in a ridiculous amount of HTTP requests on a subject page with relatively many links, such as Curaçao by Bauke Scholtz:

Yes, 209 image requests of which 10 are not for favicons, which leaves 199 favicon requests. Yes, that many links are currently on the Curaçao subject. The average modern web browser has only 6~8 simultaneous connections available on a specific domain. That's thus a huge queue. You can see it in the screenshot: on an empty cache it took nearly 5 seconds to get them all (on a primed cache, it's less than 1 second).

If you look closer, you'll see that there's another problem with this approach: links which don't have a favicon re-request the very same default favicon again and again with a different last-modified timestamp of the link itself, ending up as copies of exactly the same image in the browser cache. Also, links from the same domain which share the same favicon have their favicons duplicated this way. In spite of the aggressive caching, this was simply too inefficient.

Converting images to common format and size

The most straightforward solution would be to serve all those favicons as a single CSS sprite and make use of CSS background-position to reference the right favicon in the sprite. This however requires that all favicons are first parsed and converted to a common format and size which allows easy manipulation by standard Java 2D API (ImageIO and friends) and easy generation of the CSS sprite image. PNG was chosen as format as that's the most efficient and lossless format. 16x16 was chosen as default size.

As first step, a favicon parser was created which verifies and parses the scraped favicon file and saves every found image as PNG (the ICO format can store multiple images, usually each with a different dimension, e.g. 16x16, 32x32, 64x64, etc). For this, Image4J (a mavenized fork with bugfix) has been of great help. The original Image4J had only a minor bug: it ran into an infinite loop on favicons with broken metadata, such as this one. This was fixed by vijedi/image4j. However, when an ICO file contained multiple images, this fix discarded all images, instead of only the broken one. So, another bugfix was done on top of that (which by the way just leniently returned the "broken" image; in fact, only the metadata was broken, not the image content itself). Every single favicon will now be parsed by ICODecoder and BMPDecoder of Image4J and then ImageIO#read() of the standard Java SE API, in this sequence. Whichever first returns non-null BufferedImage(s) without exceptions will be used. This step also enabled us to completely bypass the content-type check which we initially had, because we discovered that a lot of websites were doing a bad job at this; some favicons were even served as text/html, which caused false negatives.
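The "try each decoder in sequence, first usable result wins" logic could be sketched roughly as below. The `FaviconParser` class and its `Decoder` interface are hypothetical names, not the actual ZEEF code; only the ImageIO#read() fallback is wired in, and the Image4J decoders would be plugged in as further `Decoder` instances placed ahead of it.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.List;
import javax.imageio.ImageIO;

// Hypothetical sketch of a decoder fallback chain.
final class FaviconParser {

    // One decoding attempt; may throw, or return null/empty when it can't handle the bytes.
    interface Decoder {
        List<BufferedImage> decode(byte[] content) throws IOException;
    }

    // Last resort: plain ImageIO#read() of standard Java SE, which returns null for unknown formats.
    static final Decoder IMAGE_IO = content -> {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(content));
        return image == null ? null : Collections.singletonList(image);
    };

    // Tries each decoder in order; the first non-empty result without exception is used.
    static List<BufferedImage> parse(byte[] content, List<Decoder> decoders) {
        for (Decoder decoder : decoders) {
            try {
                List<BufferedImage> images = decoder.decode(content);
                if (images != null && !images.isEmpty()) {
                    return images;
                }
            } catch (IOException | RuntimeException e) {
                // Broken or unsupported for this decoder; fall through to the next one.
            }
        }
        return Collections.emptyList(); // Nothing could parse it; caller falls back to the default favicon.
    }
}
```

Because the result depends only on what the decoders can actually read, no content-type check is needed up front.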

As second step, if the parsing of a favicon resulted in at least one BufferedImage, but none of them was 16x16, then a 16x16 version is created from the next available dimension, resized to 16x16 with the help of thebuzzmedia/imgscalr, which yielded high-quality results.
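For readers without imgscalr at hand, the resize step can be approximated with the standard Java 2D API alone. This is a swapped-in technique, not the imgscalr call the site used, and the `FaviconResizer` name is hypothetical; quality on large downscales will likely be lower than what imgscalr produces.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical sketch: plain Java 2D resize as a stand-in for imgscalr.
final class FaviconResizer {

    static final int DEFAULT_SIZE = 16;

    // Scales the source to a square of the given size, keeping the alpha channel.
    static BufferedImage resize(BufferedImage source, int size) {
        BufferedImage target = new BufferedImage(size, size, BufferedImage.TYPE_INT_ARGB);
        Graphics2D graphics = target.createGraphics();
        try {
            graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            graphics.drawImage(source, 0, 0, size, size, null);
        } finally {
            graphics.dispose(); // Release the native resources held by the graphics context.
        }
        return target;
    }
}
```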

Finally all formats are converted to PNG and saved in the DB (and cached in the local disk file system).

Serving images as CSS sprite

For this, a simple servlet was used which basically does the following in doGet() (error/cache checking omitted for simplicity):


Long pageId = Long.valueOf(request.getPathInfo().substring(1));
Page page = pageService.getById(pageId);
long lastModified = page.getLastModified().getTime();
byte[] content = faviconService.getSpriteById(pageId, lastModified);

if (content != null) { // Found same version in disk file system cache.
    response.getOutputStream().write(content);
    return;
}

Set<Long> faviconIds = new TreeSet<>();
faviconIds.add(0L); // Default favicon, appears as 1st image of sprite.
faviconIds.addAll(page.getFaviconIds());

int width = Favicon.DEFAULT_SIZE; // 16px.
int height = width * faviconIds.size();

BufferedImage sprite = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
Graphics2D graphics = sprite.createGraphics();
graphics.setBackground(new Color(0xff, 0xff, 0xff, 0)); // Transparent.
graphics.clearRect(0, 0, width, height); // clearRect() paints the background color; fillRect() would paint the foreground color.

int i = 0;

for (Long faviconId : faviconIds) {
    Favicon favicon = faviconService.getById(faviconId); // Loads from disk file system cache.
    byte[] faviconContent = favicon.getContent(); // Separate name; "content" is already declared above.
    BufferedImage image = ImageIO.read(new ByteArrayInputStream(faviconContent));
    graphics.drawImage(image, 0, width * i++, null);
}

ByteArrayOutputStream output = new ByteArrayOutputStream();
ImageIO.write(sprite, "png", output);
content = output.toByteArray();
faviconService.saveSprite(pageId, lastModified, content); // Store in disk file system cache.
response.getOutputStream().write(content);

To see it in action, you can get all favicons of the page Curaçao by Bauke Scholtz (which has page ID 18) as CSS sprite on the following URL: https://zeef.com/favicons/page/18.
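The core of the sprite generation, stripped of the servlet and the service layer, boils down to stacking equally sized icons vertically. Below is a self-contained sketch of just that part; the `SpriteBuilder` name is hypothetical and the icons are assumed to be already normalized to the same square size.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.List;

// Hypothetical sketch: vertical sprite composition without servlet or DB code.
final class SpriteBuilder {

    // Stacks the icons top to bottom; icon i starts at y = size * i.
    static BufferedImage build(List<BufferedImage> icons, int size) {
        if (icons.isEmpty()) {
            throw new IllegalArgumentException("At least one icon is required");
        }

        // A new TYPE_INT_ARGB image starts out fully transparent.
        BufferedImage sprite = new BufferedImage(size, size * icons.size(), BufferedImage.TYPE_INT_ARGB);
        Graphics2D graphics = sprite.createGraphics();

        try {
            int i = 0;

            for (BufferedImage icon : icons) {
                graphics.drawImage(icon, 0, size * i++, null);
            }
        } finally {
            graphics.dispose();
        }

        return sprite;
    }
}
```

The iteration order matters: the stylesheet must compute its background-position offsets from the same ordering, which is why a sorted set of favicon IDs is a convenient choice.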

Serving the CSS file containing sprite-image-specific selectors

In order to present the CSS sprite images at the right places, we should also have a simple servlet which generates the desired CSS stylesheet file containing sprite-image-specific selectors with the right background-position. The servlet should basically ultimately do the following in doGet() (error/cache checking omitted to keep it simple):


Long pageId = Long.valueOf(request.getPathInfo().substring(1));
Page page = pageService.getById(pageId);

Set<Long> faviconIds = new TreeSet<>();
faviconIds.add(0L); // Default favicon, appears as 1st image of sprite.
faviconIds.addAll(page.getFaviconIds());

long lastModified = page.getLastModified().getTime();
int height = Favicon.DEFAULT_SIZE; // 16px.

PrintWriter writer = response.getWriter();
writer.printf("[class^='favicon-']{background-image:url('../page/%d?%d')!important}", 
    pageId, lastModified);
int i = 0;

for (Long faviconId : faviconIds) {
    writer.printf(".favicon-%s{background-position:0 -%spx}", faviconId, height * i++);
}

To see it in action, you can get the CSS file of the page Curaçao by Bauke Scholtz (which has page ID 18) on the following URL: https://zeef.com/favicons/css/18.
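To make the generated output concrete, here is the same CSS-building logic as a pure function that returns a string instead of writing to the response. The `FaviconCss` name and the method signature are hypothetical; the selectors and URL pattern follow the servlet snippet.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: builds the sprite stylesheet as a string.
final class FaviconCss {

    static String build(long pageId, long lastModified, Set<Long> faviconIds, int size) {
        Set<Long> sorted = new TreeSet<>(faviconIds); // Same ordering as used for the sprite.
        sorted.add(0L); // Default favicon, appears as 1st image of sprite.

        StringBuilder css = new StringBuilder();
        css.append(String.format(Locale.ROOT,
            "[class^='favicon-']{background-image:url('../page/%d?%d')!important}",
            pageId, lastModified));

        int i = 0;

        for (Long faviconId : sorted) {
            // Negative offset shifts the sprite up so that icon i becomes visible.
            css.append(String.format(Locale.ROOT,
                ".favicon-%d{background-position:0 -%dpx}", faviconId, size * i++));
        }

        return css.toString();
    }
}
```

For a page with ID 18, a last-modified value of 1000 and favicon IDs 7 and 42, this yields one rule for the shared background image followed by `.favicon-0`, `.favicon-7` and `.favicon-42` at offsets 0, 16 and 32 pixels.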

Note that the background-image URL has the page's last modified timestamp in the query string which should force a browser reload of the sprite whenever a link has been added/removed in the page. The CSS file itself has also such a query string as you can see in HTML source code of the ZEEF page, which is basically generated as follows:


<link id="favicons" rel="stylesheet" 
    href="//zeef.com/favicons/css/#{zeef.page.id}?#{zeef.page.lastModified.time}" />

Also note that the !important is there to overrule the default favicon for the case the serving of the CSS sprite failed somehow. The default favicon is specified in general layout CSS file layout.css as follows:


#blocks .link.block li .favicon,
#blocks .link.block li [class^='favicon-'] {
    position: absolute;
    left: -7px;
    top: 4px;
    width: 16px;
    height: 16px;
}

#blocks .link.block li [class^='favicon-'] {
    background-image: url("#{resource['zeef:images/default_favicon.png']}");
}

Referencing images in HTML

It's rather simple, the links were just generated in a loop whereby the favicon image is represented via a plain HTML <span> element basically as follows:


<a id="link_#{linkPosition.id}" href="#{link.targetURL}" title="#{link.defaultTitle}">
    <span class="favicon-#{link.faviconId}" />
    <span class="text">#{linkPosition.displayTitle}</span>
</a>

The HTTP requests on image files have been reduced from 209 to 12 (note that 10 non-favicon requests have increased to 11 non-favicon requests due to changes in social buttons, but that's not further related to the matter):

On an empty cache, it took on average only half a second to download the CSS file and another half a second to download the CSS sprite. On balance, that's thus 5 times faster with 197 fewer connections! On a primed cache it's not even requested at all. It should be noted that I'm behind a relatively slow network here and that the current ZEEF production server on a 3rd party host isn't using "state of the art" hardware yet. The hardware will be handpicked later on once we grow.

Reloading CSS sprite by JavaScript whenever necessary

When you're logged in as page owner, you can edit the page by adding/removing/drag'n'drop links and blocks. This all takes place by ajax without a full page reload. Whenever necessary, the CSS sprite can during ajax oncomplete be forced to be reloaded by the following script which references the <link id="favicons">:


function reloadFavicons() {
    var $favicons = $("#favicons");
    $favicons.attr("href", $favicons.attr("href").replace(/\?.*/, "?" + new Date().getTime()));
}

Basically, it just updates the timestamp in the query string of the <link href> which in turn forces the webbrowser to request it straight from the server instead of from the cache.

Note that in case of newly added links which do not exist in the system yet, favicons are resolved asynchronously in the background and pushed back via Server-Sent Events. In this case, the new favicon is still downloaded individually and explicitly set as CSS background image. You can find it in the global-push.js file:


function updateLink(data) {
    var $link = $("#link_" + data.id);
    $link.attr("title", data.title);
    $link.find(".text").text(data.text);
    $link.find("[class^='favicon-']").attr("class", "favicon")
        .css("background-image", "url(/favicons/link/" + data.icon + "?" + new Date().getTime() + ")");
    highlight($link);
}

But once the HTML DOM representation of the link or block is later ajax-updated after an edit or drag'n'drop, then it will re-reference the CSS sprite again.

The individual favicon request is also done in the "Edit link" dialog. The servlet code for that is not exciting, but in case you're interested, the URL is like https://zeef.com/favicons/link/354 and all the servlet basically does is (error/cache checking omitted for brevity):


Long linkId = Long.valueOf(request.getPathInfo().substring(1));
Link link = linkService.getById(linkId);
Favicon favicon = faviconService.getById(link.getFaviconId());
byte[] content = favicon.getContent();
response.getOutputStream().write(content); // Binary content, so use the output stream instead of the writer.

Note that individual favicons are not downloaded by their own ID, but instead by the link ID, because a link doesn't necessarily have any favicon. This way the default favicon can easily be returned.

Thursday, September 10, 2009

Webapplication performance tips and tricks

Introduction

Yahoo has a great performance analysis tool in flavor of a Firefox addon: YSlow (yes, you need to install the -also great- Firebug addon first). The YSlow site has already explained all of the best practices in detail here.

Yahoo's explanations are in general clear enough for the average Java EE web application developer, but when the YSlow's Server category comes into the picture, Yahoo unfortunately only gives examples based on Apache HTTP server and PHP and in a few cases also IIS. In this article I'll "translate" the relevant subcategories into the Java EE approach based on Apache Tomcat 6.0. As a bonus, a few more best practices are added and explained in detail.

Use a Content Delivery Network

This is the first rule of the YSlow's Server category. Well, the idea is nice, but this is in my opinion not a "must". Having a secondary domain (no, not a subdomain) for pure static content is a more general practice to gain performance in serving static content. A web browser is namely restricted to a certain maximum number of simultaneous open connections on a single domain. In the older browser versions this is usually limited to 2 and ranges nowadays around 10-15 connections. This can also be changed using a simple regedit (MSIE) or by editing about:config (Firefox). Those kinds of tweaks are usually only done by the more advanced users with an above average knowledge of the software they use.

So, to give a broader range of visitors a better performance experience, it may be better to have a secondary domain for pure static content only. E.g. onedomain.com for JSP files and anotherdomain.com for CSS/JS/Flash/etc files. Or of course such a CDN as suggested by Yahoo, but again, a CDN for private static data is in my opinion a bit nonsensical. After all, if you respect the performance rules for static content the correct way, then the static content will actually only be requested whenever really needed, so this makes a secondary domain or CDN more superfluous. Or you must have a web application which needs to serve a lot of non-layout-related images, such as photography.

For 3rd party public static content it's however definitely worth the effort to link it to a CDN which is provided by themselves, if any. For example jQuery offers several CDN hosts. It's a win-win situation for both your server and the client.

Add an Expires or a Cache-Control Header

This is the second rule of the YSlow's Server category. A very good point. The Expires header prevents the browser from re-requesting the same static content (JS/CSS/images/etc) every time, which is only a waste of the available time, connections and bandwidth. When you're serving static content from public webcontent in Tomcat, then the DefaultServlet is responsible for serving the content. It unfortunately does nothing with the Expires header. Although it supports the Last-Modified header, this effectively costs a conditional GET request, which is already one connection and request too many when the content has not changed after all. You can however override the DefaultServlet with an own implementation as outlined here. How to do it effectively is already covered by the earlier FileServlet article at this blog. This servlet is a well suited solution for the second, third as well as the fourth rule of the YSlow's Server category.
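As a rough illustration of what a static-content servlet would have to emit, the sketch below computes the two header values for a given time-to-live. The `CacheHeaders` class is hypothetical, it uses the java.time API (Java 8 or newer, so not available on the Java version current for Tomcat 6.0), and the `public` directive is an assumption that only suits content which may be shared by proxies.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch: header values for far-future caching of static content.
final class CacheHeaders {

    // HTTP date format with two-digit day, always expressed in GMT.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
        .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
        .withZone(ZoneOffset.UTC);

    // Value for the Expires header: the moment the cached copy becomes stale.
    static String expires(Instant now, Duration timeToLive) {
        return HTTP_DATE.format(now.plus(timeToLive));
    }

    // Value for the Cache-Control header: the same lifetime in seconds.
    static String cacheControl(Duration timeToLive) {
        return "public, max-age=" + timeToLive.getSeconds();
    }

    public static void main(String[] args) {
        Duration oneWeek = Duration.ofDays(7);
        System.out.println("Expires: " + expires(Instant.now(), oneWeek));
        System.out.println("Cache-Control: " + cacheControl(oneWeek));
    }
}
```

Inside a servlet the date formatting is usually not needed, since HttpServletResponse#setDateHeader() accepts the expiry time in epoch milliseconds and formats it itself.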

About the Cache-Control header for dynamic content, the general practice is that we just want to avoid caching of dynamic content, especially the pages containing forms or the pages in a restricted area. You can do that by adding the following response headers to the base controller Servlet or Filter of your web application:


    ...

    response.setHeader("Cache-Control", "no-cache, no-store, must-revalidate"); // HTTP 1.1.
    response.setHeader("Pragma", "no-cache"); // HTTP 1.0.
    response.setDateHeader("Expires", 0); // Proxies.

    ...

There is a little story behind the no-store and must-revalidate directives of the Cache-Control header: some web browsers (including Firefox) still cache the page when those directives are omitted! According to the HTTP specification, no-cache alone should have been sufficient. But OK, now we at least have the 'magic' three headers which should work for all decent web browsers and proxies.

Use Query String with a timestamp to force re-request

The Expires header is useful, but with a (too) far-future Expires header the client won't check for updates of the static resource until the expiry date has passed, or you clear the browser cache, or you do a hard refresh (CTRL+F5)! A common practice is then to append a unique query string to the URL of the static content, denoting a timestamp of the last file modification or the server startup time, so that the browser is forced to re-request it whenever the query string changes.

Determining the last modification time on every request is more expensive than determining the server startup time only once in the application's lifetime, and the startup time is generally sufficient. Whenever the server restarts, the query string changes and the browser will request the resources again. Assuming that your server doesn't restart every minute or so, this doesn't hurt much. Here's an example of how to do it using a ServletContextListener:

package mypackage;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

/**
 * Configure the webapplication context. This is to be placed in the application scope.
 * So far this example only sets the startup time.
 * @author BalusC
 * @see http://balusc.blogspot.com/2009/09/webapplication-performance-tips-and.html
 */
public class Config implements ServletContextListener {

    // Constants ----------------------------------------------------------------------------------

    private static final String CONFIG_ATTRIBUTE_NAME = "config";

    // Properties ---------------------------------------------------------------------------------

    private long startupTime;

    // Actions ------------------------------------------------------------------------------------

    /**
     * Obtain startup time and put Config itself in the application scope.
     * @see ServletContextListener#contextInitialized(ServletContextEvent)
     */
    public void contextInitialized(ServletContextEvent event) {
        this.startupTime = System.currentTimeMillis() / 1000;
        event.getServletContext().setAttribute(CONFIG_ATTRIBUTE_NAME, this);
    }

    /**
     * @see ServletContextListener#contextDestroyed(ServletContextEvent)
     */
    public void contextDestroyed(ServletContextEvent event) {
        // Nothing to do here.
    }

    // Getters ------------------------------------------------------------------------------------

    /**
     * Returns the startup time associated with this configuration.
     * @return The startup time associated with this configuration.
     */
    public long getStartupTime() {
        return this.startupTime;
    }

}

Just add it as a listener to the web.xml the usual way:


    ...

    <listener>
        <listener-class>mypackage.Config</listener-class>
    </listener>

    ...

Here is an example of how to use it in JSP:


        ...

        <link rel="stylesheet" type="text/css" href="/static/style.css?${config.startupTime}">
        <script type="text/javascript" src="/static/script.js?${config.startupTime}"></script>

        ...
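If you'd rather build such URLs in Java code (for example in a custom tag or EL function), the idea can be sketched as below. This is only a minimal sketch: the class name, the method name and the "v" parameter name are made up for illustration and do not appear elsewhere in this article. Unlike the bare "?timestamp" in the JSP above, it also copes with URLs which already have a query string.

```java
final class VersionedUrlSketch {

    /**
     * Appends the given version (e.g. the server startup time in seconds) as a query parameter,
     * so that the URL changes whenever the version changes.
     */
    static String withVersion(String url, long version) {
        // Hypothetical parameter name "v"; the JSP example above uses a bare "?timestamp".
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + "v=" + version;
    }

    public static void main(String[] args) {
        long startupTime = System.currentTimeMillis() / 1000; // Seconds, as in Config.
        System.out.println(withVersion("/static/style.css", startupTime));
        System.out.println(withVersion("/static/script.js?lang=en", startupTime));
    }
}
```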

As a side note, if you're using the aforementioned FileServlet as well, then you can in theory extend its default expire time further, for example to 1 year (365 days):


    ...

    private static final long DEFAULT_EXPIRE_TIME = 31536000000L; // ..ms = 365 days.

    ...

Add LastModified timestamp to CSS background images

Appending a query string with a timestamp to static CSS files is nice, but this doesn't cover the CSS background images! Each of those counts as a separate request. If you don't append a timestamp query string to them, then they won't be checked for updates. How to handle this may differ per environment, so I'll only describe my general approach to give the idea. You might need to fine-tune it further to suit your environment. I myself use a batch job using YUI Compressor (yes, it's a Java API!) to minify all CSS and JS files before deploy. After getting the minified result, a regular expression is used to find all background images in the CSS source, File#lastModified() is used to get the last modification timestamp of each image, and finally the original URLs are replaced. Here's a basic example of the Minifier. Keep in mind that it may need to be modified to suit your environment:

package mypackage;

import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.yahoo.platform.yui.compressor.CssCompressor;

/**
 * The Minifier.
 * @author BalusC
 * @see http://balusc.blogspot.com/2009/09/webapplication-performance-tips-and.html
 */
public class Minifier {

    // Actions ------------------------------------------------------------------------------------

    /**
     * Minify all CSS files on given basePath + cssPath to the given basePath + minPath and append
     * lastmodified timestamps to CSS background images relative to the given basePath.
     * @param basePath The base path of static content.
     * @param cssPath The path of all CSS files, relative to the given basePath.
     * @param minPath The path of all minified CSS files, relative to the given basePath.
     * @throws IOException If something fails at I/O level.
     */
    public static void minifyCss(String basePath, String cssPath, String minPath) throws IOException {
        for (File cssFile : new File(basePath + cssPath).listFiles()) {
            if (cssFile.isFile()) {
                File minFile = new File(basePath + minPath, cssFile.getName());
                minifyCss(basePath, cssFile, minFile);
            }
        }
    }

    /**
     * Minify given cssFile to the given minFile and append lastmodified timestamps to CSS
     * background images relative to the given basePath.
     * @param basePath The base path of static content.
     * @param cssFile The CSS file to be minified.
     * @param minFile The minified CSS file.
     * @throws IOException If something fails at I/O level.
     */
    public static void minifyCss(String basePath, File cssFile, File minFile) throws IOException {
        Reader reader = null;
        Writer writer = null;

        try {
            // Read original CSS file.
            reader = new InputStreamReader(new FileInputStream(cssFile), "UTF-8");

            // Minify original CSS file.
            StringWriter stringWriter = new StringWriter();
            new CssCompressor(reader).compress(stringWriter, -1);
            String line = stringWriter.toString();

            // Find all CSS background images.
            Matcher matcher = Pattern.compile("url\\([\'\"]?([/\\w\\.]*)[\'\"]?\\)").matcher(line);
            Set<String> imagePaths = new HashSet<String>();
            while (matcher.find()) {
                imagePaths.add(matcher.group(1));
            }

            // Append lastmodified timestamps to CSS background images and replace originals.
            for (String imagePath : imagePaths) {
                long lastModified = new File(basePath + imagePath).lastModified() / 1000;
                line = line.replace(imagePath, imagePath + "?" + lastModified);
            }

            // Write minified CSS file.
            writer = new OutputStreamWriter(new FileOutputStream(minFile), "UTF-8");
            writer.write(line);
        } finally {
            close(writer);
            close(reader);
        }

        // Dumb sysout, replace by Logger if needed ;)
        System.out.println("Minifying " + cssFile + " to " + minFile + " succeeded!");
    }

    // Helpers ------------------------------------------------------------------------------------

    /**
     * Silently close given resource. Any IOException will be printed to stderr.
     * This global method can easily be extracted to your "IOUtil" class, if it doesn't exist yet.
     * @param resource Resource to be closed.
     */
    private static void close(Closeable resource) {
        if (resource != null) {
            try {
                resource.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    // Main method --------------------------------------------------------------------------------
    
    /**
     * Just to demonstrate how your batch job thing should use the Minifier.
     */
    public static void main(String... args) throws Exception {
        String basePath = "C:/Workspace/YourProject/WebContent/WEB-INF";
        String cssPath = "/static/css";
        String minPath = cssPath + "/min";
        Minifier.minifyCss(basePath, cssPath, minPath);
    }
    
}
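
One thing to watch out for with the plain String#replace() approach: when one image path is a substring of another (say "bg.png" and "/img/bg.png"), the longer one can get stamped twice. A possible variation, sketched below, rewrites each url(...) match in place with Matcher#appendReplacement(). The class name and the lookup function are hypothetical and only there to keep the sketch independent of the file system; in the Minifier the lookup would be the File#lastModified() call. Note that this sketch drops the optional quotes, which is fine for the simple paths the pattern matches.

```java
import java.util.function.ToLongFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CssUrlStampSketch {

    // Same pattern as in the Minifier: url(...) with optional quotes around a simple path.
    private static final Pattern CSS_URL = Pattern.compile("url\\(['\"]?([/\\w\\.]*)['\"]?\\)");

    /**
     * Appends "?timestamp" to every url(...) in the given CSS. The timestamp of each path is
     * obtained from the given lookup function (hypothetical, e.g. backed by File#lastModified()).
     */
    static String stamp(String css, ToLongFunction<String> lastModifiedSeconds) {
        Matcher matcher = CSS_URL.matcher(css);
        StringBuffer result = new StringBuffer();

        while (matcher.find()) {
            String path = matcher.group(1);
            if (path.isEmpty()) {
                continue; // Leave an empty url() untouched; it is copied along later.
            }
            String stamped = "url(" + path + "?" + lastModifiedSeconds.applyAsLong(path) + ")";
            // Each match is replaced exactly once, at its own position.
            matcher.appendReplacement(result, Matcher.quoteReplacement(stamped));
        }

        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String css = "a{background:url(/img/bg.png)}b{background:url('bg.png')}";
        System.out.println(stamp(css, path -> 42L));
    }
}
```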

Gzip Components

This is the third rule of YSlow's Server category. Yes, that's also a very good point. Gzip is relatively fast and can save up to 70% of the network bandwidth. For static text content you can just use the FileServlet from the aforementioned article on this blog. For dynamic text content you'll need to configure the application server so that it uses GZIP compression. This is usually explained in the documentation of the application server in question. In case of Apache Tomcat 6.0 you can find it here. You need to extend the <Connector> element in Tomcat/conf/server.xml with a compression attribute set to "on". Here's a basic example (note the last attribute):


    ...

    <Connector
        protocol="HTTP/1.1"
        port="80"
        redirectPort="8443"
        connectionTimeout="20000"
        compression="on" />

    ...

That's all! Restart Tomcat and dynamic text responses will be gzipped. And no, this does not affect the aforementioned FileServlet for static content; you can just keep it as is.
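
To get a feeling for the savings on your own content, you could compress a sample in memory and compare the sizes. The below is only a rough sketch with a made-up class name and made-up sample markup; the actual ratio depends entirely on the content, so don't read the printed numbers as a benchmark.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class GzipSavingsSketch {

    /** Compresses the given bytes with GZIP, entirely in memory. */
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray(); // The stream is closed (finished) at this point.
    }

    public static void main(String[] args) {
        // Made-up, repetitive sample markup; typical HTML is repetitive as well.
        StringBuilder html = new StringBuilder("<ul>");
        for (int i = 0; i < 500; i++) {
            html.append("<li class=\"item\">Item ").append(i).append("</li>");
        }
        html.append("</ul>");

        byte[] raw = html.toString().getBytes(StandardCharsets.UTF_8);
        byte[] zipped = gzip(raw);
        System.out.println(raw.length + " bytes raw, " + zipped.length + " bytes gzipped");
    }
}
```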

Configure ETags

This is the fourth rule of YSlow's Server category. Again a good point, and again one covered by the aforementioned FileServlet article on this blog. ETags are not needed for dynamic content, as it is usually not meant to be cached.

Flush the Buffer Early

This is the fifth rule of YSlow's Server category. Well, that's also a good point: flush the response between </head> and <body>. This is one of the 0.01% of cases wherein you can't easily get around a (cough) scriptlet, so its use is more or less forgivable.


        ...

    </head>
    <% response.flushBuffer(); %>
    <body>

        ...

However, in case of Apache Tomcat 6.0 the HTTP connector uses a buffer size of 2KB (2048 bytes) by default, which is configurable using the bufferSize attribute. This is generally more than good enough. The average HTML head with the "default" minimum tags (doctype, html, head, meta content type, meta description, base, favicon, CSS file, JS file and title) already accounts for 1 to 1.5KB in size. Either way, in one of my last webapps I used a slightly modified WhitespaceFilter which removes all whitespace inside the <body> and instantly pre-flushes the stream before the <body>.

Use NIO

When your web application needs to handle more than around 1,000 concurrent connections, or when your web server is also used for purposes other than serving the web, then it's generally better to use non-blocking IO streams instead of blocking IO streams. It scales much better because you no longer need one implicitly opened thread per opened IO resource; instead, basically all resources are managed by a single thread. This saves the server from a lot of threads, the overhead of controlling them, and the steeply growing performance drop when the number of concurrent threads (HTTP connections) gets high. Performance then no longer depends on the number of available threads, but rather on the amount of available heap memory. It can go up to around 20,000 concurrent connections on a single thread instead of around 5,000 concurrent connections on as many threads.

Most decent servers support NIO, as does Apache Tomcat 6.0 in the HTTP connector. Basically all you need to do is to replace the default protocol attribute of "HTTP/1.1" with "org.apache.coyote.http11.Http11NioProtocol". In some full-fledged Java EE application servers like Sun GlassFish, whose NIO connector implementation is known as "Grizzly", this is turned on by default.


    ...

    <Connector
        protocol="org.apache.coyote.http11.Http11NioProtocol"
        port="80"
        redirectPort="8443"
        connectionTimeout="20000"
        compression="on" />

    ...

That's basically all! Restart Tomcat and now it will use NIO to handle HTTP connections. Only ensure that you give it enough memory (also in the IDE when developing with it). You can start with 512MB, but 1024MB is better.
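
If you're wondering what "all resources managed by a single thread" looks like in code, below is a minimal sketch of the readiness-based multiplexing which java.nio offers. It is not how Tomcat's connector is implemented; it merely illustrates the principle. To keep it runnable without any network, it uses in-memory Pipe channels instead of sockets, and the class name, method name and "hello" messages are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SelectorSketch {

    /**
     * Opens the given number of in-memory pipes, writes one message into each of them and reads
     * all messages back on the calling thread using a single Selector.
     */
    static Map<Integer, String> multiplex(int channels) {
        Map<Integer, String> received = new TreeMap<>();
        List<Pipe> pipes = new ArrayList<>();

        try (Selector selector = Selector.open()) {
            for (int id = 0; id < channels; id++) {
                Pipe pipe = Pipe.open();
                pipes.add(pipe);
                pipe.source().configureBlocking(false); // Required before registering.
                pipe.source().register(selector, SelectionKey.OP_READ, id); // Attach id to key.
                byte[] message = ("hello " + id).getBytes(StandardCharsets.UTF_8);
                pipe.sink().write(ByteBuffer.wrap(message));
            }

            ByteBuffer buffer = ByteBuffer.allocate(64);
            while (received.size() < channels) {
                selector.select(); // Blocks until at least one channel is readable.
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    buffer.clear();
                    int read = ((Pipe.SourceChannel) key.channel()).read(buffer);
                    if (read > 0) {
                        String text = new String(buffer.array(), 0, read, StandardCharsets.UTF_8);
                        received.put((Integer) key.attachment(), text);
                        key.cancel(); // One message per pipe is all this sketch expects.
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (Pipe pipe : pipes) {
                try {
                    pipe.sink().close();
                    pipe.source().close();
                } catch (IOException ignore) {
                    // Nothing useful to do about it in a sketch.
                }
            }
        }

        return received;
    }

    public static void main(String[] args) {
        System.out.println(multiplex(3));
    }
}
```

The point is the single select() loop: no thread sits blocked on any individual channel, the one thread is only woken up for channels which actually have something to read.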

Copyright - No text of this article may be taken over without explicit authorisation. Only the code is free of copyright. You can copy, change and distribute the code freely. Just mentioning this site should be fair.

(C) September 2009, BalusC

Wednesday, February 18, 2009

FileServlet supporting resume and caching and GZIP

WARNING - OUTDATED CONTENT!

Since OmniFaces 2.2, the below file servlet has been reworked, modernized and refactored into a highly reusable abstract org.omnifaces.servlet.FileServlet class in JSF utility library OmniFaces.

Introduction

In the almost 2 year old FileServlet and ImageServlet articles you can find basic examples of a download servlet and an image servlet. They in fact do nothing more than obtaining an InputStream of the desired resource/file and writing it to the OutputStream of the HTTP response along with a set of important response headers. They do not support resumes or effective caching of client side data.

If one downloaded a big file and ran into network problems at 99% of the download, one wouldn't be happy to discover the need to download the complete file again once the network is back. If a browser decides to check the cached images for changes, it would send either a HEAD request to determine the unique file identifier and timestamp of each, or a conditional GET request to determine the response status. If the image hasn't changed according to the server response, the client won't request the image again, saving network bandwidth and other effort.

You could delegate the task to the default servlet of the webcontainer/appserver you're using, but most of them don't implement all of the performance enhancements. For example, Tomcat's DefaultServlet does not support the Expires header.

Resume downloads

To enable download resumes, the server has to send at least the Accept-Ranges, ETag and Last-Modified response headers to the client along with the file.

The Accept-Ranges response header with the value "bytes" informs the client that the server supports byte-range requests. With this the client can request a specific byte range using the Range request header.

The ETag response header should contain a value which represents a unique identifier of the file in question so that both the server and the client can identify the file. You can use a combination of the file name, file size and file modification timestamp for this. Some servers run this combination through an MD5 function to get a 32-character hexadecimal string. But that is not necessarily unique, because two different strings could generate the same MD5 hash, so we won't use it here. The client can send the obtained ETag back to the server for validation using the If-Match or If-Range request headers.
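
As a small illustration, the name/size/timestamp combination could be sketched as below. The class and method names are made up for this example; the format itself is the one used by the servlet further down. Note that the HTTP specification defines entity tags as quoted strings, while this sketch follows the servlet's unquoted format.

```java
import java.io.File;

final class ETagSketch {

    /** Builds the identifier from file name, size in bytes and last modification time in ms. */
    static String eTag(String fileName, long length, long lastModified) {
        return fileName + "_" + length + "_" + lastModified;
    }

    /** Convenience overload which takes the three values from an actual file. */
    static String eTag(File file) {
        return eTag(file.getName(), file.length(), file.lastModified());
    }

    public static void main(String[] args) {
        // Whenever the size or the timestamp changes, the identifier changes along.
        System.out.println(eTag("logo.png", 1024L, 1234567890000L));
        System.out.println(eTag("logo.png", 2048L, 1234567899000L));
    }
}
```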

The Last-Modified response header should contain a date which represents the last modification timestamp of the file as it is at the server side. The client can send the obtained timestamp back to the server for validation using the If-Unmodified-Since or If-Range request headers. Important note: keep in mind that the timestamp accuracy in server side Java is in milliseconds while the accuracy of the Last-Modified header is in seconds. In Java code you should add 1 second (1000ms) to the value of the If-* request headers to bridge this difference before validation.
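
The effect of this precision difference can be sketched as follows. The class and method names are hypothetical; the comparison itself mirrors the one used for If-Modified-Since in the servlet further down, with -1 meaning that the header is absent.

```java
final class HttpDatePrecisionSketch {

    /** What the client effectively echoes back: the timestamp truncated to whole seconds. */
    static long truncateToSeconds(long millis) {
        return millis / 1000 * 1000;
    }

    /**
     * Returns true if the file is to be considered unchanged since the given header value.
     * Both values are in epoch milliseconds; -1 means that the header is absent.
     */
    static boolean notModifiedSince(long ifModifiedSince, long lastModified) {
        // Add one second of tolerance because the header has lost its milliseconds.
        return ifModifiedSince != -1 && ifModifiedSince + 1000 > lastModified;
    }

    public static void main(String[] args) {
        long lastModified = 1234567890123L; // As returned by File#lastModified().
        long header = truncateToSeconds(lastModified); // As sent back by the client.

        System.out.println(header >= lastModified); // The naive comparison fails.
        System.out.println(notModifiedSince(header, lastModified)); // The tolerant one passes.
    }
}
```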

Whenever the client sends a partial GET request with a Range request header to the server, the server should check the conditional GET request headers (all headers starting with If) and handle them accordingly. Whenever the If-Match or If-Unmodified-Since conditions are negative, the server should send a 412 "Precondition Failed" response back without any content. Whenever the If-Range condition is negative, the server should ignore the Range header and send the full file back. Whenever the Range header is in an invalid format, the server should send a 416 "Requested Range Not Satisfiable" response back without any content.

If a partial GET request with a valid Range header is sent by the client, then the server should send the specific byte range(s) back as a 206 "Partial Content" response.
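
To make the Range syntax a bit more tangible, here's a minimal standalone sketch of how such a header could be parsed. It follows the same rules as the servlet further down, but the class name, the null return value for the 416 case and the clamping of oversized suffix ranges are choices made for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeHeaderSketch {

    /**
     * Parses a Range header against a file of the given length. Returns inclusive [start, end]
     * pairs, or null if the header is invalid or unsatisfiable (the 416 case).
     */
    static List<long[]> parse(String header, long length) {
        if (header == null || !header.matches("^bytes=\\d*-\\d*(,\\d*-\\d*)*$")) {
            return null;
        }

        List<long[]> ranges = new ArrayList<>();

        for (String part : header.substring(6).split(",")) {
            int dash = part.indexOf('-');
            long start = number(part.substring(0, dash));
            long end = number(part.substring(dash + 1));

            if (start == -1 && end == -1) {
                return null; // A bare "-" says nothing at all.
            } else if (start == -1) {
                start = Math.max(0, length - end); // "-20": the last 20 bytes.
                end = length - 1;
            } else if (end == -1 || end > length - 1) {
                end = length - 1; // "40-": from byte 40 until the end.
            }

            if (start > end) {
                return null;
            }

            ranges.add(new long[] { start, end });
        }

        return ranges;
    }

    // Returns -1 for an empty value. Overly long digit strings are not guarded against here.
    private static long number(String digits) {
        return digits.isEmpty() ? -1 : Long.parseLong(digits);
    }

    public static void main(String[] args) {
        for (long[] range : parse("bytes=50-80,40-,-20", 100)) {
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```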

Client side caching

The principle is the same as with resume downloads, with the only difference that no Range request header is sent to the server. The server only has to check and validate any conditional GET request headers and respond accordingly. Usually those are the If-None-Match or If-Modified-Since request headers. The client could also send a HEAD request (to which the server should respond exactly like a GET, but completely without content) and examine the obtained ETag and Last-Modified response headers itself.

Whenever the If-None-Match or If-Modified-Since conditions are positive, the server should send a 304 "Not Modified" response back without any content. If this happens, then the client is allowed to use the content which is already available in the client side cache.
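
The decision boils down to something like the sketch below. It is a simplified standalone version with made-up names, assuming that an absent If-None-Match header is null and an absent If-Modified-Since header is -1 (which is what the Servlet API gives you), and that If-None-Match takes precedence when both are present.

```java
final class ConditionalGetSketch {

    static final int OK = 200;
    static final int NOT_MODIFIED = 304;

    /** Returns the status to respond with for the given request headers and file state. */
    static int status(String ifNoneMatch, long ifModifiedSince, String eTag, long lastModified) {
        if (ifNoneMatch != null) {
            // If-Modified-Since is ignored as soon as If-None-Match is present.
            return matches(ifNoneMatch, eTag) ? NOT_MODIFIED : OK;
        }

        if (ifModifiedSince != -1 && ifModifiedSince + 1000 > lastModified) {
            return NOT_MODIFIED;
        }

        return OK;
    }

    /** Returns true if the comma separated header value contains "*" or the given ETag. */
    static boolean matches(String header, String eTag) {
        for (String candidate : header.trim().split("\\s*,\\s*")) {
            if (candidate.equals("*") || candidate.equals(eTag)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(status("logo.png_1024_2000", -1, "logo.png_1024_2000", 2000));
        System.out.println(status(null, 1000, "logo.png_1024_2500", 2500));
    }
}
```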

Furthermore, you can use the Expires response header to inform the client how long it may keep the content in the client side cache without firing any request about it, not even a HEAD request.
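
In a servlet you'd normally just call response.setDateHeader("Expires", ...) and let the container do the formatting. Purely to show what ends up on the wire, here's a sketch using java.time (so Java 8 or newer, which is newer than what this article targets). The class and method names are made up, and the Cache-Control max-age value is shown as the equivalent expressed in seconds.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class ExpiresHeaderSketch {

    // HTTP dates are always in GMT, with a two-digit day and English day/month names.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
        .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
        .withZone(ZoneOffset.UTC);

    /** Returns the Expires header value for content which may be cached for the given lifetime. */
    static String expires(Instant now, Duration lifetime) {
        return HTTP_DATE.format(now.plus(lifetime));
    }

    /** Returns the equivalent Cache-Control header value. */
    static String cacheControl(Duration lifetime) {
        return "max-age=" + lifetime.getSeconds();
    }

    public static void main(String[] args) {
        Duration oneWeek = Duration.ofDays(7); // Same as DEFAULT_EXPIRE_TIME in the servlet.
        System.out.println("Expires: " + expires(Instant.now(), oneWeek));
        System.out.println("Cache-Control: " + cacheControl(oneWeek));
    }
}
```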

GZIP compression

To save more network bandwidth, we could compress text files (text/javascript, text/css, text/xml, text/csv, etcetera) with GZIP. Generally you can save up to 70% of network bandwidth by compressing text files with GZIP. We only need to check if the client accepts GZIP encoding by checking if the Accept-Encoding header contains "gzip". If this is true, and the client is requesting the full file, then the full text file will be compressed. Statistics show that about 90% of browsers support GZIP.

This may also be possible for files other than text, but as those are usually images and other kinds of (large) binary files, (de)compressing them may generate unnecessary overhead.
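
The Accept-Encoding check could look like the sketch below. Mind that this is not the helper used by the servlet in this article, but a slightly stricter hypothetical variant with made-up names: it also honors a "q=0" quality value, by which a client can explicitly refuse an encoding, and it treats "*" as a fallback.

```java
import java.util.Locale;

final class AcceptEncodingSketch {

    /** Returns true if the given Accept-Encoding header value allows a gzip encoded response. */
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }

        Boolean gzip = null;
        Boolean wildcard = null;

        for (String element : acceptEncoding.split(",")) {
            String[] parts = element.trim().split(";");
            String coding = parts[0].trim().toLowerCase(Locale.ROOT);
            boolean acceptable = quality(parts) > 0;

            if (coding.equals("gzip")) {
                gzip = acceptable;
            } else if (coding.equals("*")) {
                wildcard = acceptable;
            }
        }

        // An explicit gzip entry takes precedence over the wildcard.
        return gzip != null ? gzip : (wildcard != null && wildcard);
    }

    /** Returns the q parameter of the given header element, or 1.0 if it has none. */
    private static double quality(String[] parts) {
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim().toLowerCase(Locale.ROOT);
            if (param.startsWith("q=")) {
                try {
                    return Double.parseDouble(param.substring(2).trim());
                } catch (NumberFormatException e) {
                    return 0; // Malformed value: play safe and don't compress.
                }
            }
        }
        return 1.0;
    }

    public static void main(String[] args) {
        System.out.println(acceptsGzip("gzip, deflate"));
        System.out.println(acceptsGzip("gzip;q=0, deflate"));
    }
}
```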

The Code

OK, enough boring technical background blah, now on to the code!

This file servlet does everything it should do based on the request headers as described above. It also supports multipart byte requests (the client can send multiple comma-separated ranges in the Range header). The whole thing targets at least Java EE 5 and was developed and tested in Eclipse 3.4 with Tomcat 6. It is tested with different web browsers (Firefox 2/3, IE 6/7/8, Opera 8/9, Safari 2/3 and Chrome) and also with a plain vanilla Java application using URLConnection.

You can use it for any file types: binary files, text files, images, etcetera. When the requested file is a text file or an image or when its content type is covered by the Accept request header of the client, then it will be displayed inline, otherwise it will be sent as an attachment which will pop up a 'save as' dialogue.

It's almost 485 lines of code, of which nearly half are more or less rudimentary: comments (read them all though), line breaks in long code lines and blank lines. You can just copy'n'paste and run it. You're free to make changes whenever needed as long as it's not for commercial use.

/*
 * net/balusc/webapp/FileServlet.java
 *
 * Copyright (C) 2009 BalusC
 *
 * This program is free software: you can redistribute it and/or modify it under the terms of the
 * GNU Lesser General Public License as published by the Free Software Foundation, either version 3
 * of the License, or (at your option) any later version.
 * 
 * This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
 * even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 * 
 * You should have received a copy of the GNU Lesser General Public License along with this library.
 * If not, see <http://www.gnu.org/licenses/>.
 */

package net.balusc.webapp;

import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPOutputStream;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/**
 * A file servlet supporting resume of downloads and client-side caching and GZIP of text content.
 * This servlet can also be used for images, client-side caching would become more efficient.
 * This servlet can also be used for text files, GZIP would decrease network bandwidth.
 *
 * @author BalusC
 * @link https://balusc.omnifaces.org/2009/02/fileservlet-supporting-resume-and.html
 */
public class FileServlet extends HttpServlet {

    // Constants ----------------------------------------------------------------------------------

    private static final int DEFAULT_BUFFER_SIZE = 10240; // ..bytes = 10KB.
    private static final long DEFAULT_EXPIRE_TIME = 604800000L; // ..ms = 1 week.
    private static final String MULTIPART_BOUNDARY = "MULTIPART_BYTERANGES";

    // Properties ---------------------------------------------------------------------------------

    private String basePath;

    // Actions ------------------------------------------------------------------------------------

    /**
     * Initialize the servlet.
     * @see HttpServlet#init().
     */
    public void init() throws ServletException {

        // Get base path (path to get all resources from) as init parameter.
        this.basePath = getInitParameter("basePath");

        // Validate base path.
        if (this.basePath == null) {
            throw new ServletException("FileServlet init param 'basePath' is required.");
        } else {
            File path = new File(this.basePath);
            if (!path.exists()) {
                throw new ServletException("FileServlet init param 'basePath' value '"
                    + this.basePath + "' does actually not exist in file system.");
            } else if (!path.isDirectory()) {
                throw new ServletException("FileServlet init param 'basePath' value '"
                    + this.basePath + "' is actually not a directory in file system.");
            } else if (!path.canRead()) {
                throw new ServletException("FileServlet init param 'basePath' value '"
                    + this.basePath + "' is actually not readable in file system.");
            }
        }
    }

    /**
     * Process HEAD request. This returns the same headers as GET request, but without content.
     * @see HttpServlet#doHead(HttpServletRequest, HttpServletResponse).
     */
    protected void doHead(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
        // Process request without content.
        processRequest(request, response, false);
    }

    /**
     * Process GET request.
     * @see HttpServlet#doGet(HttpServletRequest, HttpServletResponse).
     */
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
        // Process request with content.
        processRequest(request, response, true);
    }

    /**
     * Process the actual request.
     * @param request The request to be processed.
     * @param response The response to be created.
     * @param content Whether the response body should be written (GET) or not (HEAD).
     * @throws IOException If something fails at I/O level.
     */
    private void processRequest
        (HttpServletRequest request, HttpServletResponse response, boolean content)
            throws IOException
    {
        // Validate the requested file ------------------------------------------------------------

        // Get requested file by path info.
        String requestedFile = request.getPathInfo();

        // Check if file is actually supplied to the request URL.
        if (requestedFile == null) {
            // Do your thing if the file is not supplied to the request URL.
            // Throw an exception, or send 404, or show default/warning page, or just ignore it.
            response.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }

        // URL-decode the file name (might contain spaces and on) and prepare file object.
        File file = new File(basePath, URLDecoder.decode(requestedFile, "UTF-8"));

        // Check if file actually exists in filesystem.
        if (!file.exists()) {
            // Do your thing if the file appears to be non-existing.
            // Throw an exception, or send 404, or show default/warning page, or just ignore it.
            response.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }

        // Prepare some variables. The ETag is an unique identifier of the file.
        String fileName = file.getName();
        long length = file.length();
        long lastModified = file.lastModified();
        String eTag = fileName + "_" + length + "_" + lastModified;
        long expires = System.currentTimeMillis() + DEFAULT_EXPIRE_TIME;


        // Validate request headers for caching ---------------------------------------------------

        // If-None-Match header should contain "*" or ETag. If so, then return 304.
        String ifNoneMatch = request.getHeader("If-None-Match");
        if (ifNoneMatch != null && matches(ifNoneMatch, eTag)) {
            response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            response.setHeader("ETag", eTag); // Required in 304.
            response.setDateHeader("Expires", expires); // Postpone cache with 1 week.
            return;
        }

        // If-Modified-Since header should be greater than LastModified. If so, then return 304.
        // This header is ignored if any If-None-Match header is specified.
        long ifModifiedSince = request.getDateHeader("If-Modified-Since");
        if (ifNoneMatch == null && ifModifiedSince != -1 && ifModifiedSince + 1000 > lastModified) {
            response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            response.setHeader("ETag", eTag); // Required in 304.
            response.setDateHeader("Expires", expires); // Postpone cache with 1 week.
            return;
        }


        // Validate request headers for resume ----------------------------------------------------

        // If-Match header should contain "*" or ETag. If not, then return 412.
        String ifMatch = request.getHeader("If-Match");
        if (ifMatch != null && !matches(ifMatch, eTag)) {
            response.sendError(HttpServletResponse.SC_PRECONDITION_FAILED);
            return;
        }

        // If-Unmodified-Since header should be greater than LastModified. If not, then return 412.
        long ifUnmodifiedSince = request.getDateHeader("If-Unmodified-Since");
        if (ifUnmodifiedSince != -1 && ifUnmodifiedSince + 1000 <= lastModified) {
            response.sendError(HttpServletResponse.SC_PRECONDITION_FAILED);
            return;
        }


        // Validate and process range -------------------------------------------------------------

        // Prepare some variables. The full Range represents the complete file.
        Range full = new Range(0, length - 1, length);
        List<Range> ranges = new ArrayList<Range>();

        // Validate and process Range and If-Range headers.
        String range = request.getHeader("Range");
        if (range != null) {

            // Range header should match format "bytes=n-n,n-n,n-n...". If not, then return 416.
            if (!range.matches("^bytes=\\d*-\\d*(,\\d*-\\d*)*$")) {
                response.setHeader("Content-Range", "bytes */" + length); // Required in 416.
                response.sendError(HttpServletResponse.SC_REQUESTED_RANGE_NOT_SATISFIABLE);
                return;
            }

            // If-Range header should either match ETag or be greater than LastModified. If not,
            // then return full file.
            String ifRange = request.getHeader("If-Range");
            if (ifRange != null && !ifRange.equals(eTag)) {
                try {
                    long ifRangeTime = request.getDateHeader("If-Range"); // Throws IAE if invalid.
                    if (ifRangeTime != -1 && ifRangeTime + 1000 < lastModified) {
                        ranges.add(full);
                    }
                } catch (IllegalArgumentException ignore) {
                    ranges.add(full);
                }
            }

            // If any valid If-Range header, then process each part of byte range.
            if (ranges.isEmpty()) {
                for (String part : range.substring(6).split(",")) {
                    // Assuming a file with length of 100, the following examples returns bytes at:
                    // 50-80 (50 to 80), 40- (40 to length=100), -20 (length-20=80 to length=100).
                    long start = sublong(part, 0, part.indexOf("-"));
                    long end = sublong(part, part.indexOf("-") + 1, part.length());

                    if (start == -1) {
                        // Suffix range: never start before byte 0, also if end exceeds length.
                        start = Math.max(0, length - end);
                        end = length - 1;
                    } else if (end == -1 || end > length - 1) {
                        end = length - 1;
                    }

                    // Check if Range is syntactically valid. If not, then return 416.
                    if (start > end) {
                        response.setHeader("Content-Range", "bytes */" + length); // Required in 416.
                        response.sendError(HttpServletResponse.SC_REQUESTED_RANGE_NOT_SATISFIABLE);
                        return;
                    }

                    // Add range.
                    ranges.add(new Range(start, end, length));
                }
            }
        }


        // Prepare and initialize response --------------------------------------------------------

        // Get content type by file name and set default GZIP support and content disposition.
        String contentType = getServletContext().getMimeType(fileName);
        boolean acceptsGzip = false;
        String disposition = "inline";

        // If content type is unknown, then set the default value.
        // For all content types, see: http://www.w3schools.com/media/media_mimeref.asp
        // To add new content types, add new mime-mapping entry in web.xml.
        if (contentType == null) {
            contentType = "application/octet-stream";
        }

        // If content type is text, then determine whether GZIP content encoding is supported by
        // the browser and expand content type with the one and right character encoding.
        if (contentType.startsWith("text")) {
            String acceptEncoding = request.getHeader("Accept-Encoding");
            acceptsGzip = acceptEncoding != null && accepts(acceptEncoding, "gzip");
            contentType += ";charset=UTF-8";
        } 

        // Else, except for images, determine content disposition. If content type is supported by
        // the browser, then set to inline, else attachment which will pop a 'save as' dialogue.
        else if (!contentType.startsWith("image")) {
            String accept = request.getHeader("Accept");
            disposition = accept != null && accepts(accept, contentType) ? "inline" : "attachment";
        }

        // Initialize response.
        response.reset();
        response.setBufferSize(DEFAULT_BUFFER_SIZE);
        response.setHeader("Content-Disposition", disposition + ";filename=\"" + fileName + "\"");
        response.setHeader("Accept-Ranges", "bytes");
        response.setHeader("ETag", eTag);
        response.setDateHeader("Last-Modified", lastModified);
        response.setDateHeader("Expires", expires);


        // Send requested file (part(s)) to client ------------------------------------------------

        // Prepare streams.
        RandomAccessFile input = null;
        OutputStream output = null;

        try {
            // Open streams.
            input = new RandomAccessFile(file, "r");
            output = response.getOutputStream();

            if (ranges.isEmpty() || ranges.get(0) == full) {

                // Return full file.
                Range r = full;
                response.setContentType(contentType);

                if (content) {
                    if (acceptsGzip) {
                        // The browser accepts GZIP, so GZIP the content.
                        response.setHeader("Content-Encoding", "gzip");
                        output = new GZIPOutputStream(output, DEFAULT_BUFFER_SIZE);
                    } else {
                        // Content length is not directly predictable in case of GZIP.
                        // So only set it when the response is not gzipped, else browser will hang.
                        response.setHeader("Content-Length", String.valueOf(r.length));
                    }

                    // Copy full range.
                    copy(input, output, r.start, r.length);
                }

            } else if (ranges.size() == 1) {

                // Return single part of file.
                Range r = ranges.get(0);
                response.setContentType(contentType);
                response.setHeader("Content-Range", "bytes " + r.start + "-" + r.end + "/" + r.total);
                response.setHeader("Content-Length", String.valueOf(r.length));
                response.setStatus(HttpServletResponse.SC_PARTIAL_CONTENT); // 206.

                if (content) {
                    // Copy single part range.
                    copy(input, output, r.start, r.length);
                }

            } else {

                // Return multiple parts of file.
                response.setContentType("multipart/byteranges; boundary=" + MULTIPART_BOUNDARY);
                response.setStatus(HttpServletResponse.SC_PARTIAL_CONTENT); // 206.

                if (content) {
                    // Cast back to ServletOutputStream to get the easy println methods.
                    ServletOutputStream sos = (ServletOutputStream) output;

                    // Copy multi part range.
                    for (Range r : ranges) {
                        // Add multipart boundary and header fields for every range.
                        sos.println();
                        sos.println("--" + MULTIPART_BOUNDARY);
                        sos.println("Content-Type: " + contentType);
                        sos.println("Content-Range: bytes " + r.start + "-" + r.end + "/" + r.total);

                        // An empty line must separate the part headers from the part body.
                        sos.println();

                        // Copy single part range of multi part range.
                        copy(input, output, r.start, r.length);
                    }

                    // End with multipart boundary.
                    sos.println();
                    sos.println("--" + MULTIPART_BOUNDARY + "--");
                }
            }
        } finally {
            // Gently close streams.
            close(output);
            close(input);
        }
    }

    // Helpers (can be refactored to public utility class) ----------------------------------------

    /**
     * Returns true if the given accept header accepts the given value.
     * @param acceptHeader The accept header.
     * @param toAccept The value to be accepted.
     * @return True if the given accept header accepts the given value.
     */
    private static boolean accepts(String acceptHeader, String toAccept) {
        String[] acceptValues = acceptHeader.split("\\s*(,|;)\\s*");
        Arrays.sort(acceptValues);
        return Arrays.binarySearch(acceptValues, toAccept) > -1
            || Arrays.binarySearch(acceptValues, toAccept.replaceAll("/.*$", "/*")) > -1
            || Arrays.binarySearch(acceptValues, "*/*") > -1;
    }

    /**
     * Returns true if the given match header matches the given value.
     * @param matchHeader The match header.
     * @param toMatch The value to be matched.
     * @return True if the given match header matches the given value.
     */
    private static boolean matches(String matchHeader, String toMatch) {
        String[] matchValues = matchHeader.split("\\s*,\\s*");
        Arrays.sort(matchValues);
        return Arrays.binarySearch(matchValues, toMatch) > -1
            || Arrays.binarySearch(matchValues, "*") > -1;
    }

    /**
     * Returns a substring of the given string value from the given begin index to the given end
     * index as a long. If the substring is empty, then -1 will be returned.
     * @param value The string value to return a substring as long for.
     * @param beginIndex The begin index of the substring to be returned as long.
     * @param endIndex The end index of the substring to be returned as long.
     * @return A substring of the given string value as long or -1 if substring is empty.
     */
    private static long sublong(String value, int beginIndex, int endIndex) {
        String substring = value.substring(beginIndex, endIndex);
        return (substring.length() > 0) ? Long.parseLong(substring) : -1;
    }

    /**
     * Copy the given byte range of the given input to the given output.
     * @param input The file to read the given byte range from.
     * @param output The stream to write the given byte range to.
     * @param start Start of the byte range.
     * @param length Length of the byte range.
     * @throws IOException If something fails at I/O level.
     */
    private static void copy(RandomAccessFile input, OutputStream output, long start, long length)
        throws IOException
    {
        byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
        int read;

        if (input.length() == length) {
            // Write full range. Rewind first, as an earlier range may have moved the file pointer.
            input.seek(0);
            while ((read = input.read(buffer)) > 0) {
                output.write(buffer, 0, read);
            }
        } else {
            // Write partial range.
            input.seek(start);
            long toRead = length;

            while ((read = input.read(buffer)) > 0) {
                if ((toRead -= read) > 0) {
                    output.write(buffer, 0, read);
                } else {
                    output.write(buffer, 0, (int) toRead + read);
                    break;
                }
            }
        }
    }

    /**
     * Close the given resource.
     * @param resource The resource to be closed.
     */
    private static void close(Closeable resource) {
        if (resource != null) {
            try {
                resource.close();
            } catch (IOException ignore) {
                // Ignore IOException. If you want to handle this anyway, it might be useful to know
                // that this will generally only be thrown when the client aborted the request.
            }
        }
    }

    // Inner classes ------------------------------------------------------------------------------

    /**
     * This class represents a byte range.
     */
    protected class Range {
        long start;
        long end;
        long length;
        long total;

        /**
         * Construct a byte range.
         * @param start Start of the byte range.
         * @param end End of the byte range.
         * @param total Total length of the byte source.
         */
        public Range(long start, long end, long total) {
            this.start = start;
            this.end = end;
            this.length = end - start + 1;
            this.total = total;
        }

    }

}
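
The range arithmetic of the servlet can be tried out in isolation. Below is a minimal sketch, assuming each part of the Range header has already been validated to look like digits, a dash, digits (either side may be empty). The RangeResolver class and its resolve method are hypothetical names introduced for illustration and are not part of the FileServlet. The sketch clamps a suffix length that exceeds the file length, so the start never goes negative.

```java
import java.util.Arrays;

/** Hypothetical helper mirroring the servlet's range arithmetic; not part of FileServlet. */
final class RangeResolver {

    /**
     * Resolves one "start-end" part of a Range header against the file length.
     * Returns {start, end} (both inclusive), or null if the range is unsatisfiable.
     * Assumes the part was already validated to match digits-dash-digits.
     */
    static long[] resolve(String part, long length) {
        int dash = part.indexOf('-');
        long start = sublong(part, 0, dash);
        long end = sublong(part, dash + 1, part.length());

        if (start == -1) {
            // Suffix range "-N": the last N bytes, clamped to the whole file.
            start = Math.max(0, length - end);
            end = length - 1;
        } else if (end == -1 || end > length - 1) {
            // Open-ended "N-" or overlong end: clamp to the last byte.
            end = length - 1;
        }

        // A start beyond the end cannot be satisfied; the servlet answers 416 here.
        return (start > end) ? null : new long[] { start, end };
    }

    /** Same contract as the servlet's sublong helper: -1 for an empty substring. */
    private static long sublong(String value, int beginIndex, int endIndex) {
        String substring = value.substring(beginIndex, endIndex);
        return substring.isEmpty() ? -1 : Long.parseLong(substring);
    }

    public static void main(String[] args) {
        long length = 1000;
        for (String part : new String[] { "0-499", "500-", "-200", "900-2000", "600-500" }) {
            System.out.println(part + " -> " + Arrays.toString(resolve(part, length)));
        }
    }
}
```

For a file of 1000 bytes, "-200" resolves to bytes 800 through 999, while "600-500" is rejected because the start lies beyond the end.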

In order to get the FileServlet to work, add the following entries to the Web Deployment Descriptor web.xml:

<servlet>
    <servlet-name>fileServlet</servlet-name>
    <servlet-class>net.balusc.webapp.FileServlet</servlet-class>
    <init-param>
        <param-name>basePath</param-name>
        <param-value>/path/to/files</param-value>
    </init-param>
</servlet>

<servlet-mapping>
    <servlet-name>fileServlet</servlet-name>
    <url-pattern>/files/*</url-pattern>
</servlet-mapping>

The basePath value must be the absolute path to the folder containing the files to be served. You can of course change the value of the basePath parameter and the url-pattern of the servlet-mapping to your taste.
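
To make the relation between the url-pattern and basePath more concrete, here is a minimal sketch of how the part of the URL after /files could be mapped to a file below basePath. It is an illustration under assumptions, not the servlet's own lookup code: the BasePathResolver class and its resolve method are hypothetical, and the extra check that the result stays inside basePath is an addition of this sketch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical helper; the names are illustrative and not part of FileServlet. */
final class BasePathResolver {

    /**
     * Maps the part of the URL after the servlet mapping (e.g. "/foo.exe") to a path
     * below basePath. Returns empty when the path is missing or would escape basePath.
     */
    static Optional<Path> resolve(String basePath, String pathInfo) {
        if (pathInfo == null) {
            return Optional.empty();
        }

        // Drop leading slashes so the path is resolved relative to the base folder.
        String relative = pathInfo.replaceFirst("^/+", "");
        if (relative.isEmpty()) {
            return Optional.empty();
        }

        Path base = Paths.get(basePath).toAbsolutePath().normalize();
        Path resolved = base.resolve(relative).normalize();

        // After normalizing, anything that climbed out via ".." no longer starts with base.
        return resolved.startsWith(base) ? Optional.of(resolved) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(resolve("/path/to/files", "/foo.exe"));
        System.out.println(resolve("/path/to/files", "/../secret.txt"));
    }
}
```

With the web.xml entries above, a request to files/foo.exe would thus end up at foo.exe inside /path/to/files, while a path that tries to climb out of the folder yields no file at all.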

Here are some basic usage examples:

<!-- XHTML or JSP -->
<a href="files/foo.exe">download foo.exe</a>
<a href="files/bar.zip">download bar.zip</a>

<img src="files/pic.jpg" />
<img src="files/logo.gif" />

<!-- JSF -->
<h:outputLink value="files/foo.exe">download foo.exe</h:outputLink>
<h:outputLink value="files/bar.zip">download bar.zip</h:outputLink>
<h:outputLink value="files/#{myBean.fileName}">
    <h:outputText value="download #{myBean.fileName}" />
</h:outputLink>

<h:graphicImage value="files/pic.jpg" />
<h:graphicImage value="files/logo.gif" />
<h:graphicImage value="files/#{myBean.imageFileName}" />

Important note: this servlet example does not take the requested file as a request parameter, but as part of the URL path, because a certain widely used browser developed by a team in Redmond would take the last part of the servlet URL path as the filename in the 'Save As' dialogue instead of the filename supplied in the headers. Using the filename as part of the URL path (and thus not as a request parameter) fixes this utterly stupid behaviour. As a bonus, the URLs look much nicer without query parameters.
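
The difference between the two URL styles can be shown with a small sketch that extracts the last segment of the URL path, which is what such a browser would fall back to according to the note above. The SaveAsNameDemo class, its method and the example URLs are made up for illustration.

```java
import java.net.URI;

/** Hypothetical illustration; not part of FileServlet. */
final class SaveAsNameDemo {

    /** Returns the last segment of the URL path, ignoring any query string. */
    static String lastPathSegment(String url) {
        String path = URI.create(url).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // File name as part of the path: the last segment is the file name itself.
        System.out.println(lastPathSegment("http://localhost:8080/app/files/foo.exe"));

        // File name as request parameter: the last segment is the servlet path.
        System.out.println(lastPathSegment("http://localhost:8080/app/download?file=foo.exe"));
    }
}
```

With the file name in the path, the last segment is foo.exe. With a request parameter it is just download, which is not a name you would want to see in a 'Save As' dialogue.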

Copyright - GNU Lesser General Public License

(C) February 2009, BalusC