Java 2 Ada

Experience feedback in running a SaaS application

By stephane.carrez

Create a Flexible Architecture

The application architecture can have long-term, critical impacts on performance and growth. It must be flexible enough to deploy components on dedicated servers when needed. But flexibility has a development cost and a performance cost, and it is not always necessary. Carefully identifying the components is the key. For each component, you must know how it is used and what its impact is on the performance of the overall application. If the architecture is not designed and studied correctly, it can be impossible to reorganize the deployment when issues arise.

Planzone uses a traditional multi-tier J2EE architecture. I organized the architecture in 5 web applications (WAR) that can be deployed on the same server or on different servers. The web applications have different roles: the core application, API access, batch processing, ... These web applications are deployed on every server and we can activate them easily when necessary.

Deploy Early

Deploying a new application or service should be done early, even when only a few users will use it. By going to production sooner rather than later, you get the opportunity to see problems while you still have little traffic. You can learn and watch how your users use the service. Last but not least, you are in a real situation and you are forced to identify and solve real problems immediately.

For our service, we launched the beta version of Planzone in December 2007 and opened it to our initial beta users (300 users). At this stage, we had no performance issues but we could collect good feedback on the product, identify missing features and get ideas to improve our service.

Monitor the application from the beginning

Monitoring is key when the user growth rate is unknown (and even after!). It must be put in place at the same time the service is deployed. A careful monitoring solution helps identify early whether the application has performance issues or whether the infrastructure has to be changed because user growth requires it.

We put in place a simple monitoring solution based on Cacti and Nagios. But this was not enough because these tools only provide a coarse view of the application. I added request monitoring within the application to identify bottlenecks early (I'll describe it in another post).
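
The idea of in-application request monitoring can be sketched as a small per-page statistics collector. This is a hypothetical, minimal sketch (the `RequestStats` class and its methods are my own names, not the Planzone implementation, which is described in a later post):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: aggregate per-page call counts and cumulative
// processing time so that the most used and slowest pages show up
// directly from live production traffic.
public class RequestStats {
    static final class Entry {
        long count;
        long totalMillis;
    }

    private final Map<String, Entry> stats = new ConcurrentHashMap<>();

    // Record one request: the page URI and its processing time.
    public synchronized void record(String uri, long elapsedMillis) {
        Entry e = stats.computeIfAbsent(uri, k -> new Entry());
        e.count++;
        e.totalMillis += elapsedMillis;
    }

    // Average processing time for a page, in milliseconds (0 if unseen).
    public synchronized long averageMillis(String uri) {
        Entry e = stats.get(uri);
        return e == null ? 0 : e.totalMillis / e.count;
    }
}
```

In a servlet container, such a collector would typically be fed from a servlet filter that takes a timestamp before and after the filter chain invocation and calls `record` with the difference.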

Optimize when the monitoring says so

The Pareto principle states that 80% of effects come from 20% of the causes. For software optimization, this 80-20 rule means that by optimizing 20% of the code we solve 80% of the performance problems. The monitoring solution must be used to identify the 20% of pages, database requests and so on which are the most used and potentially causing a bottleneck. Because the system is in production, the monitoring data is real, not simulated. Therefore you know what to do.

As far as Planzone is concerned, I decided to optimize only one or two pages (out of more than 200) and two or three database queries (out of more than 180). Which page to optimize, and when, was determined by the monitoring results. With the team, we kept an eye on the monitoring data and fixed performance issues when they appeared (once or twice every 6 months).

Update as soon as possible

Optimization solves the problems detected by the monitoring. As soon as a solution is found and functionally validated, update the production. Do not wait! Waiting at this stage can aggravate the situation because more users may join the platform and the database will keep growing anyway.

With Planzone, we decided to update the service on a regular basis: every two months in 2008 and 2009, and every month since the beginning of 2010 (without service interruption!). This helped us a lot in keeping a good quality of service, both on the performance side and on the functional side. Each update contains small new features, bug fixes and the performance improvements that are necessary (and no more).

Plan for load spikes

Careful monitoring of the application lets you know the infrastructure usage in terms of CPU, memory and disk load. Most of the time you will see that the infrastructure is not used at its maximum capacity. Users don't all use the service at the same time, but since you don't control them you may observe intensive use during some periods. If the infrastructure is already used at its maximum during normal usage, you have no headroom for these intensive periods.

For Planzone, we have seen that we often get a load spike every Tuesday and at different hours during the week. Indeed, the load spikes correspond to users who need the service during their business hours. Even during these spikes, the service remains very responsive for users. The load stays below 20% in these cases, which gives us room for growth.

Conclusion

From a technical point of view, the architecture, the early deployment, the monitoring, the late optimization and the continuous service updates were key to Planzone's success.

At the beginning of the project we also put in place an internal benchlab infrastructure for stress and performance measurements. It turned out that production monitoring results were more interesting and valuable than simulated high loads. Our benchlab is now used only for functional validation.


Fault tolerant EJB interceptor: a solution to optimistic locking errors and other transient faults

By stephane.carrez

Fault tolerance is often necessary in application servers. The J2EE standard defines an interceptor mechanism that can be used to implement a first level of fault tolerance. The pattern I present in this article is the solution I implemented for the Planzone service; it has been used with success for the last two years.

Identify the faults to recover

The first step is to distinguish the faults that can be recovered from those that cannot. Our application uses MySQL and Hibernate, and we identified the following three transient (recoverable) faults.

StaleObjectStateException (Optimistic Locking)

Optimistic locking is a pattern used to optimize database transactions. Instead of locking the database tables and rows when values are updated, other transactions are allowed to access these values. Concurrent writes are therefore possible and must be detected. For this, optimistic locking uses a version counter, a timestamp or a state comparison to detect concurrent writes.

When a concurrent write is detected, Hibernate raises a StaleObjectStateException. When such an exception occurs, the state of the objects associated with the current Hibernate session is unknown (see Transactions and Concurrency).
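
The version-counter mechanism can be illustrated outside Hibernate with a minimal in-memory sketch (the `OptimisticStore` class and its methods are hypothetical names for illustration; Hibernate performs the equivalent check in its UPDATE statement with a WHERE clause on the version column):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal in-memory illustration of optimistic locking: an update only
// succeeds if the caller holds the current version; otherwise a concurrent
// write happened first and the update is rejected (the situation in which
// Hibernate raises StaleObjectStateException).
public class OptimisticStore {
    static final class Row {
        String value;
        int version;
        Row(String value, int version) { this.value = value; this.version = version; }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    public void insert(long id, String value) {
        rows.put(id, new Row(value, 0));
    }

    public int readVersion(long id) {
        return rows.get(id).version;
    }

    // Mimics: UPDATE t SET value = ?, version = version + 1
    //         WHERE id = ? AND version = ?
    public synchronized boolean update(long id, String value, int expectedVersion) {
        Row r = rows.get(id);
        if (r == null || r.version != expectedVersion) {
            return false;  // Stale: another transaction updated the row first.
        }
        r.value = value;
        r.version++;
        return true;
    }
}
```

Two transactions that both read version N can both attempt a write, but only the first one succeeds; the second sees the version mismatch and must retry with fresh data.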

As far as Planzone is concerned, we get 3 exceptions per 10000 calls.

LockAcquisitionException (Database deadlocks)

On the database side, the server can detect a deadlock situation and report an error. When a deadlock is detected between two clients, the server generates an error for one client and the second one can proceed. When such an error is reported, the client can retry the operation (see InnoDB Lock Modes).

As far as Planzone is concerned, we get 1 or 2 exceptions per 10000 calls.

JDBCConnectionException (Connection failure)

Sometimes the connection to the database is lost, either because the database server crashed or because it was restarted for maintenance reasons. A server crash is rare but it can occur. For Planzone, we had 3 crashes during the last 2 years (one crash every 240 days). During the same period we also had to stop and restart the server 2 times for a server upgrade.

Restarting the call after a database connection failure is a little bit more complex: it is necessary to sleep some time before retrying.

EJB Interceptor

To create our fault tolerance mechanism we use an EJB interceptor which is invoked for each EJB method call. For this, the interceptor defines a method marked with the @AroundInvoke annotation. Its role is to catch the transient faults and retry the call. The example below retries the call at most 10 times.

The EJB interceptor method receives an InvocationContext parameter which gives access to the target object, the parameters and the method to invoke. The proceed method transfers control to the next interceptor and then to the EJB method. The real implementation is a little bit more complex due to logging, but the overall idea is here.

 import javax.interceptor.AroundInvoke;
 import javax.interceptor.InvocationContext;
 import org.hibernate.StaleObjectStateException;
 import org.hibernate.exception.JDBCConnectionException;
 import org.hibernate.exception.LockAcquisitionException;

 public class RetryInterceptor {
   @AroundInvoke
   public Object retry(InvocationContext context) throws Exception {
     for (int retry = 0; ; retry++) {
       try {
         return context.proceed();

       } catch (LockAcquisitionException ex) {
         // Database deadlock: retry immediately.
         if (retry > 10) {
           throw ex;
         }

       } catch (StaleObjectStateException ex) {
         // Optimistic locking conflict: retry immediately.
         if (retry > 10) {
           throw ex;
         }

       } catch (JDBCConnectionException ex) {
         // Connection failure: wait a little before retrying.
         if (retry > 10) {
           throw ex;
         }
         Thread.sleep(500L + retry * 1000L);
       }
     }
   }
 }

EJB Interface

For the purpose of this article, the EJB interface is declared as follows. Our choice was to define an ILocal and an IRemote interface to allow the creation of local and remote services.

public interface Service {
    ...
    @Local
    interface ILocal extends Service {
    }

    @Remote
    interface IRemote extends Service {
    }
}

EJB Declaration

The interceptor is associated with the EJB implementation class by using the @Interceptors annotation. The same interceptor class can be associated with several EJBs.

@Stateless(name = "Service")
@Interceptors(RetryInterceptor.class)
public class ServiceBean
  implements Service.ILocal, Service.IRemote {
  ...
}

Testing

To test the solution, I recommend writing a unit test. The unit test I wrote does the following:

  • A first thread executes the EJB method call.
  • The transaction commit operation is overridden by the unit test.
  • When the commit is called, a second thread is activated to simulate the concurrent call before committing.
  • The second thread performs the EJB method call in such a way that it triggers the StaleObjectStateException when the first thread resumes.
  • When the second thread has finished, the first thread performs the real commit and the StaleObjectStateException is raised by Hibernate because the object was modified.
  • The interceptor catches the exception and retries the call which will succeed.

The full design of such a test is outside the scope of this article. It is also specific to each application.
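
The retry behavior itself can also be exercised without a container by extracting the loop behind a plain callable and counting invocations. This is a hypothetical standalone sketch, not the Planzone test; IllegalStateException stands in for Hibernate's StaleObjectStateException so the example stays self-contained:

```java
import java.util.concurrent.Callable;

// Container-free sketch of the interceptor's retry loop: the call is
// retried on a transient fault, up to maxRetries times, then rethrown.
public class RetryHelper {
    public static <T> T retry(Callable<T> call, int maxRetries) throws Exception {
        for (int retry = 0; ; retry++) {
            try {
                return call.call();
            } catch (IllegalStateException ex) {
                // Stands in for StaleObjectStateException in this sketch.
                if (retry >= maxRetries) {
                    throw ex;
                }
            }
        }
    }
}
```

A test can then stub the callable to fail a fixed number of times and assert both the final result and the number of attempts, which is the essence of what the multi-threaded test above verifies through the real EJB stack.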


Apache and JBoss integration with mod_rewrite and mod_proxy

By stephane.carrez

The problem I wanted to solve was to use the Apache URL rewriting before forwarding requests to a JBoss server. Apache and JBoss were already integrated with mod_jk (Deploying a J2EE application behind an Apache server). The URL rewriting rules do not work in that case (at least I was not able to make them work), so I investigated the Apache mod_rewrite module and its proxy configuration.

First, in your Apache host configuration, you have to enable the Apache rewrite module with the RewriteEngine directive. To make sure the server name is propagated to JBoss, you also have to use the ProxyPreserveHost directive. If you don't, JBoss will receive 'localhost' as the server name (i.e., the servlet request getServerName() method will not return what you expect). You then define your rewrite rule and use the proxy forwarding mode indicated by the [P] flag.

  <VirtualHost *:80>
    RewriteEngine On
    ProxyPreserveHost On
    RewriteRule ^/some-path/(.*)$  \
           http://localhost:8080/web-app-deploy-name/$1 [P]
    ...
  </VirtualHost>

You have to make sure the required Apache modules are enabled; for this, execute this command:

  sudo a2enmod proxy proxy_http rewrite

Once your configuration files are ready, reload Apache configuration:

  sudo /etc/init.d/apache2 reload

That's it!

This mod_rewrite and mod_proxy configuration is very powerful and easier to set up than mod_jk.


Inserting a file in a JSF view with a custom component

By stephane.carrez

Putting all this in a database is of course possible, but it means that some administration interface becomes necessary when we change the blocks. Since the blocks are not intended to change too often, I preferred a simple solution and used plain text files for storage.

The goal is now to make a JSF component that reads the file and inserts it in the output page.

Step1: Write the UI Component class

The JSF component must inherit from the UIComponentBase class. The class must implement the getFamily method to tell which component family the component belongs to.

 package example;
 public class UIFile extends UIComponentBase {
    public String getFamily() {
        return "UIFile";
    }
   ...
 }

The component only needs a file name as parameter. To keep things simple, the file attribute cannot be bound to a value expression (i.e., a #{xxx}). Define the file member and its accessor methods:

  public class UIFile extends UIComponentBase {
    protected String file;
    public String getFile() {
        return file;
    }
    public void setFile(String file) {
        this.file = file;
    }
  }

Now the JSF component is ready and we need a renderer. To make it simple, you can have the component provide and implement the renderer itself. This is easier but you lose some flexibility. In our case, implementing the rendering in the JSF encodeBegin method is enough. To keep the example small, the file path must be absolute (in reality I'm using a servlet context parameter to configure the directory; this is left as an exercise :-)). To copy the file, I'm using the famous Apache Commons IO library.

   import java.io.File;
   import java.io.FileInputStream;
   import java.io.IOException;
   import javax.faces.context.FacesContext;
   import javax.faces.context.ResponseWriter;
   import org.apache.commons.io.IOUtils;

   public class UIFile extends UIComponentBase {
     private static final char[] STOP = new char[0];

     public void encodeBegin(final FacesContext context)
       throws IOException {
         final String name = getFile();
         if (name == null || name.length() == 0) {
             return;
         }
         final File f = new File(name);
         if (!f.exists()) {
             return;
         }
         final ResponseWriter rw = context.getResponseWriter();
         // Close any pending HTML element before writing raw content.
         rw.writeText(STOP, 0, 0);
         FileInputStream is = null;
         try {
             is = new FileInputStream(f);
             final char[] data = IOUtils.toCharArray(is, "UTF-8");
             rw.write(data);
         } catch (IOException ex) {
             // Ignore read errors: the block is simply not inserted.
         } finally {
             if (is != null) {
                 try {
                     is.close();
                 } catch (IOException ex) {
                     // Ignore close errors.
                 }
             }
         }
     }
   }

For all this to work, there is a trick: before writing the file content, we have to close any opened HTML element. This is done by calling writeText with an empty array. After that, the file data is written as is, unescaped.

Step2: Register the component in faces-config.xml

Once your UI component class is written, you have to register it in your faces-config.xml file:

    <component>
        <component-type>example.UIFile</component-type>
        <component-class>example.UIFile</component-class>
    </component>

Step3: Create the taglib

Then, write a taglib file. This is an XML file that tells Facelets the name of the tag to use in XHTML files and how to bind that name to the new component.

 <facelet-taglib>
    <namespace>http://www.example.com/tool</namespace>
    <tag>
        <tag-name>insert</tag-name>
        <component>
            <component-type>example.UIFile</component-type>
        </component>
    </tag>
  </facelet-taglib>

Step4: Use the component

The last step is to use our new component in the XHTML file, this is as simple as this:

  <tool:insert file="myfile.html"/>