Sunday, November 15, 2009

Exciting new feature in WebSphere Process Server 7

A couple of years ago, I worked on a project where we used MQ Workflow, which is one of the ancestors of WebSphere Process Server. Some of the business processes that we implemented were long running, and by long I mean really long, that is up to several weeks. It happened to us that one of the processes that we deployed in production had a minor bug that had not been discovered during testing. If I remember correctly, it was just a missing data connector between two activities. Even though that was a minor bug, it completely corrupted the state of the process when that particular transition was triggered.
Based on the number of running process instances with the incorrect template, and the probability that the particular transition is triggered in the process, we were able to estimate the number of instances that would terminate abnormally. Fortunately the number (both the estimated and the actual) turned out to be small (around 10), so that the business impact was quite low. Nevertheless this was a very frustrating experience because you can only sit and wait until another process instance becomes corrupted. It was not possible to proactively fix the running instances because MQ Workflow didn't allow you to migrate running process instances to a new version of the process template (and recreating these instances using some custom built ad hoc tool was far too risky).
Ever since that incident, whenever I meet somebody who happens to be (or pretends to be) a specialist in BPM, I always ask the question of how to address this type of issue. I even asked that during an IBM training session on WebSphere Process Server. I never got a satisfactory answer. It was only when I worked for Accenture that I discovered that some smart guys in their labs had studied that issue and come up with a pattern to solve it. If I remember correctly, the pattern suggested implementing a single business process using three different BPEL processes that would then interact with each other. Even if the overall process is long running, one of these BPELs would only be short running so that it could be replaced by a new version at any time. Obviously this type of pattern is far from optimal since it is expensive to implement and tends to further increase the gap between the process designed by the business analyst and the BPEL executing this process.
Recently IBM announced the release of WPS 7 and the announcement mentions the following new feature: "Deliver migration of running processes to new process model versions". If this is really what I think it is (and if the IBM people are able to deliver what they promise, which of course they have always been ;-), then this will be a major step forward.

EJB and Web Services: getting the best of both worlds

If you have ever worked in a project where both EJBs and Web Services are used, it is very likely that you have gotten into discussions about whether a given component should be implemented as an EJB or a Web Service. You might also wonder what is the best way to bridge between these two technologies. For example, you might have been in a situation where you wanted to reuse an existing EJB in a BPEL process. In this post I will demonstrate how you can avoid these questions by making your service implementations independent of the protocol used to invoke them. While this type of protocol independence can also be achieved using SCA, in this post I will focus on EJB 3.0 and JAX-WS 2.1 because these standards are part of JEE 5 and are in wider use than SCA.
Before describing the pattern, it might be useful to explain why EJB is still relevant as an integration and remote invocation protocol:
  • EJB relies on Java serialization and binary protocols such as IIOP which are more efficient than SOAP/HTTP.
  • Propagation of the transaction and security context is built into EJB from the ground up.
  • Most EJB containers have support for load balancing and failover with a degree of reliability that is not as easy to achieve with Web Services.
The major drawbacks of using EJBs in a Service Oriented Architecture are also clear:
  • Since EJBs are not (necessarily) described by a WSDL interface, it is not easy to reuse them, e.g. in a BPEL process.
  • EJB is specific to the Java platform and interoperability with other platforms (or e.g. XML appliances) is limited.
Obviously it is possible to expose a component both as a traditional EJB and a Web service, e.g. by wrapping the EJB in a Web service interface. This however doesn't achieve true protocol independence because switching between EJB remote invocation and SOAP is not transparent to the consumer of the service. It also requires additional effort when reusing an EJB that has not yet been wrapped as a Web service. What I will show in this post is that by leveraging the new features introduced in EJB 3.0 and by carefully designing the EJBs it is possible to provide a protocol independent client view with minimal effort and without making any concessions in terms of best practices in Web service design.

Starting with version 3.0, the EJB specification allows stateless session beans to be exposed as JAX-WS style Web services. More precisely, a stateless session bean may now have up to three different types of client views: remote, local and Web service. The basic idea of the pattern proposed here is to make the choice between EJB remote invocation and SOAP transparent to the client by using the same Java interface for the remote and Web service views. Taking into account best practices in Web service design, the procedure can be summarized as follows:
  1. Design the service contract using WSDL and XML schema.
  2. Use JAX-WS (wsimport) to generate a corresponding Java interface. Since this artifact will be used by the bean implementation and the (Java) clients, it is strongly recommended to make extensive use of JAX-WS and JAXB bindings to customize the code generation so that the end result is a convenient and easy to use API. It is also recommended to package the generated artifacts in a separate JAR that can be referenced by the bean implementation as well as the client.
  3. Create a stateless session bean implementing the interface and declare this interface as both the remote and Web service view of the bean. Note that the interface that the bean needs to implement is the one generated from the portType in the WSDL, i.e. the one annotated with @WebService.
While this looks simple and straightforward, there are however some additional points that need to be taken into account. The first is that in order to be a valid remote view, the interface must conform to RMI rules. The good news is that starting with EJB 3.0, the methods of a remote interface are no longer required to declare RemoteException. However, the restriction that all method arguments must be serializable is of course still applicable. This is not a fundamental issue since the classes generated by JAXB are simple POJOs that only refer to primitive types, serializable types such as String, Date, etc., collections and other generated POJOs. Therefore they can be made serializable by adding the Serializable interface. Fortunately, a simple JAXB customization (xjc:serializable) is sufficient to do this. It should also be noted that while it is allowed to use a single interface for the remote and Web service views, it is not possible to use the same interface as local and remote view. This problem is easy to overcome with the usual pattern of creating two interfaces that extend the interface generated by JAX-WS and declare them as local and remote respectively.
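The serializability requirement can be checked outside the container. The following is only a minimal sketch: the Employee class is a hypothetical stand-in for a JAXB-generated class after xjc:serializable has been applied, and the plain Java serialization round trip merely approximates what the container does when it marshals arguments of a remote call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

final class SerializableTransferObjectDemo {

    /** Hypothetical stand-in for a JAXB-generated POJO that implements Serializable. */
    static final class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        Date hireDate;
        List<String> roles = new ArrayList<>();
    }

    /** Copies a value through Java serialization; fails if any referenced type is not serializable. */
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Value is not serializable", e);
        }
    }

    public static void main(String[] args) {
        Employee employee = new Employee();
        employee.name = "Alice";
        employee.hireDate = new Date();
        employee.roles.add("manager");
        Employee copy = roundTrip(employee);
        System.out.println(copy.name + " " + copy.roles);
    }
}
```

A check of this kind can be run as a unit test against the generated classes, so that a missing customization is detected before the bean is deployed.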
One should be aware that customizing the JAX-WS code generation is a task that requires effort that should not be underestimated, at least if one wants a clean Java interface. A complete discussion of the best practices in this area is beyond the scope of this article, but I would like to draw attention to a feature of JAX-WS which is not very well known but which is important to make sure that JAX-WS generates a Java interface with convenient method signatures for operations using the document/literal style. This feature is called "Wrapper Style" and allows JAX-WS to unwrap the request object. A more detailed description of this feature, as well as the criteria that the WSDL must meet, can be found in the JAX-WS 2.1 specification.
A second point that I would like to mention is that when customizing the code generation, one has to choose between declaring the JAX-WS and JAXB bindings inline in the WSDL and the schemas or using a separate binding file. Very often it is argued that these bindings are specific to the implementation of the service provider and/or consumer and should therefore be separated from the WSDL. This is certainly true in cases where the service is always invoked using SOAP. On the other hand, if the service can also be invoked as an EJB, one can argue that since the JAX-WS and JAXB bindings together with the WSDL completely describe the EJB interface, they are an integral part of the service contract and should be added to the WSDL. From this point of view, the JAX-WS/JAXB bindings are similar to the wsdl:binding elements mapping the abstract interface to the SOAP/HTTP protocol. Note however that while the abstract interface (i.e. the portType) and the SOAP/HTTP bindings can be separated into two WSDL files, this is not possible for the JAX-WS/JAXB bindings.

Assuming that all necessary customizations have been done and that a local interface is not required, the declaration of the stateless session bean would look as follows:
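import employee.EmployeeService;

@Stateless
@Remote(EmployeeService.class)
@WebService(endpointInterface = "employee.EmployeeService")
public class EmployeeServiceBean implements EmployeeService {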
Note that the methods of the bean don't need any special annotations since they are all present on the interface. Except for container specific procedures (e.g. running endptEnabler on WebSphere), no further action is required to expose the bean as a Web service. It can now be invoked as an EJB or a Web service, and if the client is implemented in Java, the same Java interface can be used for both invocation styles.
Let's look more closely at the latter aspect, i.e. the invocation from a Java client. If the client should invoke the service as an EJB, we can use the @EJB annotation to let the container inject a reference:
@EJB
private EmployeeService employeeService;
On the other hand, if the client should use SOAP, then we can use the @WebServiceRef annotation. Note that this annotation can be used to inject either the interface annotated with @WebService (corresponding to the portType in the WSDL) or the class annotated with @WebServiceClient (corresponding to the service element in the WSDL). Since the latter is not meaningful when invoking the service as an EJB, we use the first approach in order to achieve protocol independence:
@WebServiceRef(EmployeeServiceClient.class)
private EmployeeService employeeService;
In this sample, EmployeeServiceClient is the @WebServiceClient annotated class (a JAX-WS binding has been used to assign this class name). As you can see, switching between EJB and SOAP is just a matter of changing the annotation. Note that in a JEE 5 compliant container, @WebServiceRef can be used wherever @EJB is recognized, in particular in session beans and servlets. Of course these references can alternatively be declared in the deployment descriptor and looked up using JNDI.
Making a @WebServiceRef work properly is actually a bit more tricky than @EJB. Two conditions must be met:
  • The WSDL file must be available. Ideally it should be included (together with all dependency artifacts such as imported WSDLs and schemas) as a resource in the JAR that contains the JAX-WS generated artifacts.
  • The wsdlLocation attribute of the @WebServiceClient annotation must be set correctly. This must either be an absolute URL (if the WSDL is not included in the JAR) or specify the location relative to the root of the module (see section 4.2.2 of JSR109 v1.2). Since the @WebServiceClient annotated class is part of the code generated by JAX-WS, this can only be done by correctly configuring wsimport, namely using the -wsdlLocation option (or the wsdlLocation configuration element when using jaxws-maven-plugin).
Another important point is that the endpoint URI used by the container to invoke the service defaults to the one specified in the soap:address element in the WSDL. Since the endpoint URI is environment specific, it needs to be overridden at deployment time. This step is container specific. E.g. in WebSphere 7, endpoint URIs can be changed in the settings of the EJB or Web module in the admin console.

To conclude the discussion we should also address the question of where this pattern should be used. It would certainly be wrong to use this pattern for all services and it would also be wrong to use it for all EJBs. It really depends on the granularity of the service. On one end of the spectrum, we have coarse-grained composite services that may potentially be implemented using BPEL or as a mediation flow. Here the pattern is not meaningful and the services should be implemented as Web services. On the other end of the spectrum, we have EJBs representing very fine-grained services or components that should not be invoked directly by consumers in a different business domain. These services are best implemented as pure EJBs. The pattern is actually the most useful in the middle of the spectrum, where we find services that in a traditional J2EE architecture would be designed following the session facade pattern. Implementing session facades using the pattern described here makes them highly reusable and allows you to get the best of both worlds. E.g. a Web application co-located with the EJBs may use the local interfaces for maximum efficiency, while the same service is reused in a business process by invoking it as a Web service through a partner link in BPEL.
Interestingly, applying the pattern systematically actually has the additional benefit of enforcing proper usage of the good old Session Facade pattern. To see this, let's recall the goals of using session facades:
  • Provide a simpler interface to the clients by hiding all the complex interactions between business components.
  • Reduce the number of business objects that are exposed to the client across the service layer over the network.
  • Hide from the client the underlying interactions and interdependencies between business components. This provides better manageability, centralization of interactions (responsibility), greater flexibility, and greater ability to cope with changes.
  • Provide a uniform coarse-grained service layer to separate business object implementation from business service abstraction.
  • Avoid exposing the underlying business objects directly to the client to keep tight coupling between the two tiers to a minimum.
Usually the Session Facade pattern is combined with the Transfer Object pattern, i.e. the facade doesn't use entity objects (entity beans in EJB 1.x and 2.x; JPA entity classes in EJB 3.0) directly, but value objects specifically designed for that facade. It is easy to see how these patterns can be enforced using the pattern described in this article:
  • The WSDL-first approach forces the designer of the service to make the contract independent of the underlying business components and to use the right level of abstraction.
  • The POJOs generated by JAXB are in general not suitable for use as entity classes in JPA. While at first glance this may seem to be a drawback of the pattern, it actually enforces the Transfer Object pattern and avoids exposing the underlying business objects directly to the client. At the same time, the code generation step relieves the developer from the task of manually creating the classes used in the Transfer Object pattern.
  • Keeping the volume of interactions over the network at the right level is a common goal for EJBs and Web services. This concern is probably easier to address in a WSDL-first approach.

Saturday, November 14, 2009

Creating a test data source in WebSphere 7

If you need to quickly set up a test database and a corresponding JDBC data source in WebSphere 7, you can use the preconfigured Derby JDBC provider for that purpose. Here is the procedure:
  • Choose a directory to store the database files. Make sure that the user ID running the server process has write access to the parent directory. Don't create the directory yet. It will be created automatically by Derby.
  • In the admin console, create a new data source with the following properties:
    • JDBC Provider: Derby JDBC Provider (existing)
    • Database name: the path of the directory chosen above
    • Authentication aliases: none
  • Go to the "Custom properties" page for the data source and change the value of the "createDatabase" property to "create".
  • Save the changes to the master configuration.
  • In the "Data sources" overview page, select the newly created data source and click "Test connection". This should create and start the database (you can verify this by looking at the configured file system directory).
When you no longer need the database, just remove the data source and delete the database directory.

Note that the following restrictions apply to data sources created using this procedure:
  • The database will be empty. Thus, the approach works best for applications able to create the database schema themselves (with JPA, use the "openjpa.jdbc.SynchronizeMappings" property in persistence.xml).
  • The database can't be used in a cluster.
  • The data source doesn't support XA.

Thursday, November 12, 2009

Euphemism of the day: restoring backward compatibility

Today somebody from IBM did the following commit on the Axiom project:
-        } else if ("".equals(symbolicName)) {
+        } else if ("IBM".equals(symbolicName)) {
I gently pointed out that the change looks strange and is probably a mistake (symbolicName and vendor are attributes extracted from an OSGi bundle manifest):
Shouldn't this be "IBM".equals(vendor) instead of "IBM".equals(symbolicName)???
Shortly afterwards, a new commit:
-        } else if ("IBM".equals(symbolicName)) {
+        } else if ("IBM".equals(vendor) ||
+                "".equals(symbolicName)) {
Guess what was the commit comment?
Need to insure that the dialect detector remains backwards compatible
So, if you don't want to say "I fixed a bug that I introduced", just say "I restored backward compatibility"...

PS: That reminds me of the story where IBM tried to hide the fact that the first version of their StAX parser didn't conform to the StAX specifications. Maybe I will blog about this story some day.

Wednesday, November 4, 2009

Quote of the day

By means of ever more effective methods of mind-manipulation, the democracies will change their nature; the quaint old forms -- elections, parliaments, Supreme Courts and all the rest -- will remain. The underlying substance will be a new kind of non-violent totalitarianism. All the traditional names, all the hallowed slogans will remain exactly what they were in the good old days. Democracy and freedom will be the theme of every broadcast and editorial [...]. Meanwhile the ruling oligarchy and its highly trained elite of soldiers, policemen, thought-manufacturers and mind-manipulators will quietly run the show as they see fit.
Aldous Huxley, Brave New World Revisited, 1958

Sunday, November 1, 2009

Understanding StAX: how to correctly use XMLStreamWriter

Note: This is a slightly edited version of a text that I wrote for the Axiom documentation. Some of the content is based on a reply posted by Tatu Saloranta on the Axiom mailing list. Tatu is the main developer of the Woodstox project.

Semantics of the setPrefix and setDefaultNamespace methods

The meaning and precise semantics of the setPrefix and setDefaultNamespace methods defined by XMLStreamWriter are probably among the most obscure aspects of the StAX specifications. As we will see later, even the people who wrote the first version of IBM's StAX parser (called XLXP-J) failed to implement these two methods correctly. In order to understand how these methods are supposed to work, it is necessary to look at different parts of the specification (for simplicity we will concentrate on setPrefix):

  • The Javadoc of the setPrefix method.
  • The table shown in the Javadoc of the XMLStreamWriter class in Java 6.
  • Section 5.2.2, “Binding Prefixes” of the StAX specification.
  • The example shown in section 5.3.2, “XMLStreamWriter” of the StAX specification.

In addition, it is important to note the following facts:

  • The terms defaulting prefixes used in section 5.2.2 of the specification and namespace repairing used in the Javadocs of XMLStreamWriter are synonyms.
  • The methods writing namespace qualified information items, i.e. writeStartElement, writeEmptyElement and writeAttribute all come in two variants: one that takes a namespace URI and a prefix as arguments and one that only takes a namespace URI, but no prefix.

The purpose of the setPrefix method is simply to define the prefixes that will be used by the variants of the writeStartElement, writeEmptyElement and writeAttribute methods that only take a namespace URI (and the local name). This becomes clear by looking at the table in the XMLStreamWriter Javadoc. Note that a call to setPrefix doesn't cause any output and it is still necessary to use writeNamespace to actually write the namespace declarations. Otherwise the produced document will not be well formed with respect to namespaces.
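As a minimal sketch of this behavior (the class and method names are illustrative assumptions, and the writer is whatever implementation XMLOutputFactory.newInstance() returns, in its default non-repairing mode), the following code captures the output right after setPrefix and again after the complete element has been written:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class SetPrefixDemo {

    /** Returns { output after setPrefix alone, output of the complete document }. */
    static String[] write() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

            // Only records the binding in the writer's namespace context; nothing is written.
            writer.setPrefix("p", "urn:ns1");
            writer.flush();
            String afterSetPrefix = out.toString();

            // Variant without prefix: the writer looks up the prefix bound above.
            writer.writeStartElement("urn:ns1", "element1");
            // The namespace declaration still has to be written explicitly.
            writer.writeNamespace("p", "urn:ns1");
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return new String[] { afterSetPrefix, out.toString() };
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = write();
        System.out.println("after setPrefix: [" + result[0] + "]");
        System.out.println("document: " + result[1]);
    }
}
```

Here setPrefix is called before the first element, so the binding lives in the root scope; omitting the writeNamespace call would leave the prefix p undeclared in the output.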

The Javadoc of the setPrefix method also clearly defines the scope of the prefix bindings defined using that method: a prefix bound using setPrefix remains valid till the invocation of writeEndElement corresponding to the last invocation of writeStartElement. While not explicitly mentioned in the specifications, it is clear that a prefix binding may be masked by another binding for the same prefix defined in a nested element. (Interestingly enough, BEA's reference implementation didn't get this aspect entirely right.)

An aspect that may cause confusion is the fact that in the example shown in section 5.3.2 of the specifications, the calls to setPrefix (and setDefaultNamespace) all appear immediately before a call to writeStartElement or writeEmptyElement. This may lead people to incorrectly believe that a prefix binding defined using setPrefix applies to the next element written. This interpretation however is clearly in contradiction with the setPrefix Javadoc.

Note that early versions of IBM's XLXP-J were based on this incorrect interpretation of the specifications, but this has been corrected. Versions conforming to the specifications support a special property called, which always returns Boolean.FALSE. This makes it easy to distinguish the non-conforming versions from the newer ones. Note that in contrast to what the usage of the reserved prefix suggests, this is a vendor-specific property that is not supported by other implementations.

To avoid unexpected results and keep the code maintainable, it is in general advisable to keep the calls to setPrefix and writeNamespace aligned, i.e. to make sure that the scope (in XMLStreamWriter) of the prefix binding defined by setPrefix is compatible with the scope (in the produced document) of the namespace declaration written by the corresponding call to writeNamespace. This makes it necessary to write code like this:

writer.writeStartElement("p", "element1", "urn:ns1");
writer.setPrefix("p", "urn:ns1");
writer.writeNamespace("p", "urn:ns1");

As can be seen from this code snippet, keeping the two scopes in sync makes it necessary to use the writeStartElement variant which takes an explicit prefix. Note that this somewhat conflicts with the purpose of the setPrefix method; one may consider this as a flaw in the design of the StAX API.
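The scoping can be observed through getPrefix. The sketch below (class and method names are illustrative assumptions) wraps the snippet above in an unqualified root element and queries the writer for the prefix bound to urn:ns1, once inside element1 and once after it has been closed:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class PrefixScopeDemo {

    /** Returns { prefix bound to urn:ns1 inside element1, prefix bound after element1 was closed }. */
    static String[] prefixes() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

            writer.writeStartElement("root");
            writer.writeStartElement("p", "element1", "urn:ns1");
            // Binding and declaration are created together, so both are scoped to element1.
            writer.setPrefix("p", "urn:ns1");
            writer.writeNamespace("p", "urn:ns1");
            String inside = writer.getPrefix("urn:ns1");

            writer.writeEndElement(); // closes element1 and ends the scope of the binding
            String outside = writer.getPrefix("urn:ns1");

            writer.writeEndElement(); // closes root
            writer.flush();
            writer.close();
            return new String[] { inside, outside };
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = prefixes();
        System.out.println("inside element1: " + result[0]);
        System.out.println("after element1:  " + result[1]);
    }
}
```

With a conforming implementation the first lookup should yield p and the second one null, since the binding no longer exists once the matching writeEndElement has been called.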

The three XMLStreamWriter usage patterns

Drawing the conclusions from the previous section and taking into account that XMLStreamWriter also has a “namespace repairing” mode, one can see that there are in fact three different ways to use XMLStreamWriter. These usage patterns correspond to the three bullets in section 5.2.2 of the StAX specification:

  1. In the “namespace repairing” mode (enabled by the property), the writer takes care of all namespace bindings and declarations, with minimal help from the calling code. This will always produce output that is well-formed with respect to namespaces. On the other hand, this adds some overhead and the result may depend on the particular StAX implementation (though the result produced by different implementations will be equivalent).

    In repairing mode the calling code should avoid writing namespaces explicitly and leave that job to the writer. There is also no need to call setPrefix, except to suggest a preferred prefix for a namespace URI. All variants of writeStartElement, writeEmptyElement and writeAttribute may be used in this mode, but the implementation can choose whatever prefix mapping it wants, as long as the output results in proper URI mapping for elements and attributes.

  2. Only use the variants of the writer methods that take an explicit prefix together with the namespace URI. In this usage pattern, setPrefix is not used at all and it is the responsibility of the calling code to keep track of prefix bindings.

    Note that this approach is difficult to implement when different parts of the output document will be produced by different components (or even different libraries). Indeed, when passing the XMLStreamWriter from one method or component to the other, it will also be necessary to pass additional information about the prefix mappings in scope at that moment, unless it is acceptable to let the called method write (potentially redundant) namespace declarations for all namespaces it uses.

  3. Use setPrefix to keep track of prefix bindings and make sure that the bindings are in sync with the namespace declarations that have been written, i.e. always use setPrefix immediately before or immediately after each call to writeNamespace. Note that the code is still free to use all variants of writeStartElement, writeEmptyElement and writeAttribute; it only needs to make sure that the usage it makes of these methods is consistent with the prefix bindings in scope.

    The advantage of this approach is that it makes it possible to write modular code: when a method receives an XMLStreamWriter object (to write part of the document), it can use the namespace context of that writer (i.e. getPrefix and getNamespaceContext) to determine which namespace declarations are currently in scope in the output document and to avoid redundant or conflicting namespace declarations. Note that in order to do so, such code will have to check for an existing prefix binding before starting to use a namespace.
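The first and third usage patterns can be sketched as follows. This is only an illustration under stated assumptions: the class, the method names and the fixed fallback prefix p are made up for the example, and repairing mode is switched on with the standard XMLOutputFactory.IS_REPAIRING_NAMESPACES constant. Since the prefixes chosen in repairing mode are implementation specific, the sketch re-parses that output instead of relying on its exact text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

final class UsagePatternsDemo {
    static final String NS = "urn:ns1";

    /** Pattern 1: repairing mode; the caller supplies neither prefixes nor declarations. */
    static String writeRepairing() {
        try {
            XMLOutputFactory factory = XMLOutputFactory.newInstance();
            factory.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, Boolean.TRUE);
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = factory.createXMLStreamWriter(out);
            writer.writeStartElement(NS, "root");
            writer.writeStartElement("urn:ns2", "child");
            writer.writeEndElement();
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Pattern 3: writes an item in NS, declaring NS only if no prefix is bound to it yet. */
    static void writeItem(XMLStreamWriter writer) throws XMLStreamException {
        String prefix = writer.getPrefix(NS);
        boolean needsDeclaration = (prefix == null);
        if (needsDeclaration) {
            prefix = "p"; // hypothetical choice; real code should avoid masking a prefix in use
        }
        writer.writeStartElement(prefix, "item", NS);
        if (needsDeclaration) {
            // Keep the binding and the written declaration in sync.
            writer.setPrefix(prefix, NS);
            writer.writeNamespace(prefix, NS);
        }
        writer.writeEndElement();
    }

    /** Writes a root element with two items; the root optionally declares NS itself. */
    static String writeModular(boolean declareOnRoot) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            if (declareOnRoot) {
                writer.writeStartElement("p", "root", NS);
                writer.setPrefix("p", NS);
                writer.writeNamespace("p", NS);
            } else {
                writer.writeStartElement("root");
            }
            writeItem(writer);
            writeItem(writer);
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Re-parses a document and returns the namespace URIs of its elements in document order. */
    static List<String> elementNamespaces(String xml) {
        try {
            List<String> result = new ArrayList<>();
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    result.add(reader.getNamespaceURI());
                }
            }
            return result;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts the occurrences of token in text. */
    static int count(String text, String token) {
        int n = 0;
        for (int i = text.indexOf(token); i >= 0; i = text.indexOf(token, i + 1)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(writeRepairing());
        System.out.println(writeModular(true));
        System.out.println(writeModular(false));
    }
}
```

When the root element declares the namespace, writeItem reuses the binding and the document contains a single declaration; when it doesn't, each item declares the namespace in its own scope.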

The Nazis and oil

Anyone who has ever wondered how Nazi Germany managed to wage a world war without significant oil reserves of its own should read this article on "einestages" (Spiegel). The following passage is particularly interesting:
As early as 1940 the British had an "oil plan" that called for attacking 17 German hydrogenation plants and shutting down production with a "fatal blow". But Churchill set other strategic priorities and pushed for area bombing. Up to May 1944, just one percent of the total Allied bomb tonnage had been dropped on oil targets.
So even before Hitler's invasion of the Soviet Union, the British knew how the German war machine could have been seriously weakened, yet they waited until shortly before the Normandy landings to act on it...