Monday, November 18, 2013

The precise meaning of I/O wait time in Linux

Some time ago I had a discussion with some systems guys about the exact meaning of the I/O wait time which is displayed by top as a percentage of total CPU time. Their answer was that it is the time spent by the CPU(s) while waiting for outstanding I/O operations to complete. Indeed, the man page for the top command defines this as the "time waiting for I/O completion".

However, this definition is obviously not correct (or at least not complete), because a CPU never spends clock cycles waiting for an I/O operation to complete. Instead, if a task running on a given CPU blocks on a synchronous I/O operation, the kernel will suspend that task and allow other tasks to be scheduled on that CPU.

So what is the exact definition then? There is an interesting Server Fault question that discussed this. Somebody came up with the following definition that describes I/O wait time as a sub-category of idle time:

iowait is time that the processor/processors are waiting (i.e. is in an idle state and does nothing), during which there in fact was outstanding disk I/O requests.

That makes perfect sense for uniprocessor systems, but there is still a problem with that definition when applied to multiprocessor systems. In fact, "idle" is a state of a CPU, while "waiting for I/O completion" is a state of a task. However, as pointed out earlier, a task waiting for outstanding I/O operations is not running on any CPU. So how can the I/O wait time be accounted for on a per-CPU basis?

For example, let's assume that on an otherwise idle system with 4 CPUs, a single, completely I/O bound task is running. Will the overall I/O wait time be 100% or 25%? I.e. will the I/O wait time be 100% on a single CPU (and 0% on the others), or on all 4 CPUs? This can be easily checked by doing a simple experiment. One can simulate an I/O bound process using the following command, which will simply read data from the hard disk as fast as it can:

dd if=/dev/sda of=/dev/null bs=1MB

Note that you need to execute this as root and if necessary change the input file to the appropriate block device for your hard disk.

Looking at the CPU stats in top (press 1 to get per-CPU statistics), you will see something like this:

%Cpu0  :  3,1 us, 10,7 sy,  0,0 ni,  3,5 id, 82,4 wa,  0,0 hi,  0,3 si,  0,0 st
%Cpu1  :  3,6 us,  2,0 sy,  0,0 ni, 90,7 id,  3,3 wa,  0,0 hi,  0,3 si,  0,0 st
%Cpu2  :  1,0 us,  0,3 sy,  0,0 ni, 96,3 id,  2,3 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu3  :  3,0 us,  0,3 sy,  0,0 ni, 96,3 id,  0,3 wa,  0,0 hi,  0,0 si,  0,0 st

This clearly indicates that a single I/O bound task only increases the I/O wait time on a single CPU. Note that you may occasionally see the task "switch" from one CPU to another: the Linux kernel tries to schedule a task on the CPU it last ran on (in order to improve CPU cache hit rates), but this is only a preference, not a guarantee. The taskset command can be used to "pin" a process to a given CPU so that the experiment becomes more reproducible (note that the first command line argument is not the CPU number, but a mask):

taskset 1 dd if=/dev/sda of=/dev/null bs=1MB
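Because the mask is a bit field, the value that selects CPU n is 1 shifted left by n. The following is a minimal sketch of that conversion; the cpu_mask helper is a name made up for illustration. Alternatively, taskset's -c option accepts a CPU list instead of a mask.

```shell
#!/bin/sh
# cpu_mask is a hypothetical helper: print the hexadecimal affinity mask
# that selects only the given CPU number (bit n is set for CPU n).
cpu_mask() {
  printf '0x%x\n' "$((1 << $1))"
}

cpu_mask 0   # mask for CPU 0
cpu_mask 3   # mask for CPU 3

# Usage (as root; adjust the device as described above):
#   taskset "$(cpu_mask 3)" dd if=/dev/sda of=/dev/null bs=1MB
# or, with a CPU list instead of a mask:
#   taskset -c 3 dd if=/dev/sda of=/dev/null bs=1MB
```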

Another interesting experiment is to run a CPU bound task at the same time on the same CPU:

taskset 1 sh -c "while true; do true; done"

The I/O wait time now drops to 0 on that CPU (and also remains 0 on the other CPUs), while user and system time account for 100% CPU usage:

%Cpu0  : 80,3 us, 15,5 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  4,3 si,  0,0 st
%Cpu1  :  4,7 us,  3,4 sy,  0,0 ni, 91,3 id,  0,0 wa,  0,0 hi,  0,7 si,  0,0 st
%Cpu2  :  2,3 us,  0,3 sy,  0,0 ni, 97,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu3  :  2,7 us,  4,3 sy,  0,0 ni, 93,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st

That is expected because I/O wait time is a sub-category of idle time, and the CPU to which we pinned both tasks is never idle.

These findings allow us to deduce the exact definition of I/O wait time:

For a given CPU, the I/O wait time is the time during which that CPU was idle (i.e. didn't execute any tasks) and there was at least one outstanding disk I/O operation requested by a task scheduled on that CPU (at the time it generated that I/O request).

This nuance is not just academic; it has practical consequences. For example, on a system with many CPUs, even if there is a problem with I/O performance, the observed overall I/O wait time may still be small if the problem only affects a single task. It also means that while it is generally correct to say that faster CPUs tend to increase I/O wait time (simply because a faster CPU tends to be idle more often), that statement is no longer true if one replaces "faster" by "more".
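The per-CPU accounting described above can also be observed without top: /proc/stat has one cpuN line per CPU, and the fifth numeric field of each line is the cumulative iowait time in clock ticks. The sketch below computes the iowait share between two samples of such a line. The iowait_pct helper is hypothetical, and the calculation is a simplified version of what monitoring tools do.

```shell
#!/bin/sh
# iowait_pct BEFORE AFTER: given two samples of the same "cpuN ..." line from
# /proc/stat, print the percentage of the elapsed time accounted as iowait.
# Fields after the label: user nice system idle iowait irq softirq steal ...
iowait_pct() {
  printf '%s\n%s\n' "$1" "$2" | LC_ALL=C awk '
    NR == 1 { for (i = 2; i <= 9; i++) total0 += $i; wait0 = $6 }
    NR == 2 { for (i = 2; i <= 9; i++) total1 += $i; wait1 = $6
              if (total1 > total0)
                printf "%.1f\n", 100 * (wait1 - wait0) / (total1 - total0)
              else
                print "0.0" }'
}

# Sample CPU 0 twice, one second apart (only where /proc/stat is available).
if [ -r /proc/stat ]; then
  before=$(grep '^cpu0 ' /proc/stat)
  sleep 1
  after=$(grep '^cpu0 ' /proc/stat)
  echo "cpu0 iowait: $(iowait_pct "$before" "$after")%"
fi
```

Running this while the pinned dd command is active should show a high value for the CPU it is pinned to and values near zero for the others.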

Saturday, November 16, 2013

WebSphere & ApacheDS quick setup guide

This article explains how to quickly configure WebSphere with Apache Directory Server (ApacheDS) for LDAP authentication. We will use the ApacheDS server that comes packaged with Apache Directory Studio. This has the advantage that we only need a single tool to set up the LDAP server and to populate the directory. Obviously the setup described here is not meant for production use; the goal is to rapidly create a working LDAP configuration for testing purposes. It is assumed that the reader is familiar with configuring security (and in particular standalone LDAP registries) in WebSphere. No prior experience with ApacheDS is required.

Start by setting up the LDAP server:

  1. Download, install and start Apache Directory Studio. The present article is based on version 2.0.0-M8, but the procedure should be similar for other versions.

  2. Using the "Servers" view, create a new ApacheDS server. There is no need to change the configuration; the default settings are appropriate for a test server. After the server has been created, start it.

  3. Create a connection to the server. To do this, right click on the server and choose "Create a Connection". The new connection should then appear in the "Connections" view. Double click on the connection to open it. You should see the following entries in the "LDAP Browser" view: dc=example,dc=com, ou=config, ou=schema and ou=system.

  4. Create two entries with RDNs ou=users and ou=groups under dc=example,dc=com, both with object class organizationalUnit.

  5. For each test user, create an entry with object class inetOrgPerson under ou=users. For the RDN, use uid=<username>. Then fill in the cn and sn attributes (cn is the common name which should be the given name plus surname; sn is the surname alone). Also add a userPassword attribute.

  6. Under ou=groups, create as many groups as needed. There should be at least one group that will be mapped to the administrator role in WebSphere. For the object class, one can use either groupOfNames or groupOfUniqueNames. They are more or less the same, but the former is easier to set up, because Directory Studio will allow you to select members by browsing the directory. For the RDN, use cn=<groupname>. When using groupOfNames, Directory Studio will automatically open a dialog to select the first member of the group. Additional members can be defined by adding more values to the member attribute.

  7. Also define a uid=admin user that will be used as the primary administrative user in the WebSphere configuration. Since this is not a person, but a technical account, you can use the object classes account and simpleSecurityObject to create this user. Note that the uid=admin user doesn't need to be a member of any group.
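If many test users are needed, entering them one by one in the GUI becomes tedious. As an alternative, one could generate LDIF and import it with Directory Studio. The sketch below is only an illustration: the user_ldif helper and the sample values are hypothetical, the object class chain of inetOrgPerson is listed explicitly, and the password is stored in clear text, which is acceptable only on a test server.

```shell
#!/bin/sh
# user_ldif UID CN SN PASSWORD: print an LDIF record for a test user below
# ou=users,dc=example,dc=com (hypothetical helper, test data only).
user_ldif() {
  cat <<EOF
dn: uid=$1,ou=users,dc=example,dc=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
uid: $1
cn: $2
sn: $3
userPassword: $4

EOF
}

# Example: two made-up users written to a file that can then be imported.
ldif_file=$(mktemp)
user_ldif alice "Alice Smith" Smith secret1 >  "$ldif_file"
user_ldif bob   "Bob Jones"   Jones secret2 >> "$ldif_file"
```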

The resulting LDAP tree should look as follows:

You can now configure the standalone LDAP registry in WebSphere. The settings are as follows:

  • Primary administrative user name: admin
  • Type of LDAP server: Custom
  • Host/port: localhost:10389 (if you kept the default configuration for ApacheDS, and the server is running on the same host)
  • Base distinguished name: dc=example,dc=com

You also need to specify the following properties in the advanced LDAP user registry settings:

  • User filter: (&(uid=%v)(|(objectclass=inetOrgPerson)(objectclass=account)))
  • Group filter: (&(cn=%v)(|(objectclass=groupOfNames)(objectclass=groupOfUniqueNames)))
  • User ID map: *:uid
  • Group ID map: *:cn
  • Group member ID map: groupOfNames:member;groupOfUniqueNames:uniqueMember
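Before testing the login in WebSphere, it can be useful to check the filters directly against ApacheDS. In the filters, %v stands for the name entered by the user. The sketch below substitutes a value for %v so that the result can be passed to an LDAP search tool; expand_filter is a hypothetical helper, and the ldapsearch invocation in the comment assumes that the OpenLDAP client tools are installed and that you fill in your own bind DN.

```shell
#!/bin/sh
# expand_filter FILTER VALUE: replace every %v in FILTER with VALUE.
# VALUE is assumed to be a plain name without characters special to sed.
expand_filter() {
  printf '%s\n' "$1" | sed "s/%v/$2/g"
}

user_filter='(&(uid=%v)(|(objectclass=inetOrgPerson)(objectclass=account)))'
expand_filter "$user_filter" alice

# Possible use with the OpenLDAP client (placeholder bind DN, prompts for the password):
#   ldapsearch -x -H ldap://localhost:10389 -b dc=example,dc=com \
#     -D "<bind DN>" -W "$(expand_filter "$user_filter" alice)"
```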

Monday, November 11, 2013

Deploying the WebSphere EJB thin client in ServiceMix

In a previous post I explained how to install the SIB thin client into ServiceMix and use it in a Camel route. In that post I used the JmsFactoryFactory API to create the JMS connection factory and the Queue object. However, it should also be possible to configure them in WebSphere and look them up using JNDI. Performing that JNDI lookup requires two additional libraries:

  • The EJB thin client: com.ibm.ws.ejb.thinclient_8.5.0.jar
  • The IBM ORB (Object Request Broker): com.ibm.ws.orb_8.5.0.jar

Both JARs can be found in the runtimes directory in the WebSphere installation. The latter is required only on non-IBM JREs. In the following I will make the assumption that ServiceMix is running on an Oracle JRE and that we need both JARs.

In a Java SE environment it is relatively straightforward to create a Camel configuration that uses the EJB thin client to look up the necessary JMS resources from WebSphere. Here is a sample configuration that is basically equivalent to the one used in the earlier post:

<beans xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:camel="http://camel.apache.org/schema/spring"
  xmlns:jee="http://www.springframework.org/schema/jee"
  xmlns:util="http://www.springframework.org/schema/util"
  xsi:schemaLocation="http://www.springframework.org/schema/jee
    http://www.springframework.org/schema/jee/spring-jee.xsd
    http://www.springframework.org/schema/beans
    http://www.springframework.org/schema/beans/spring-beans.xsd
    http://www.springframework.org/schema/util
    http://www.springframework.org/schema/util/spring-util.xsd
    http://camel.apache.org/schema/spring
    http://camel.apache.org/schema/spring/camel-spring.xsd">

  <camel:camelContext>
    <camel:route>
      <camel:from uri="file://test"/>
      <camel:to uri="sib:queue:jms/testQ"/>
    </camel:route>
  </camel:camelContext>
  
  <bean id="sib" class="org.apache.camel.component.jms.JmsComponent">
    <property name="connectionFactory">
      <jee:jndi-lookup resource-ref="false" jndi-name="jms/testQCF" environment-ref="env"
                       lookup-on-startup="false" expected-type="javax.jms.QueueConnectionFactory"/>
    </property>
    <property name="destinationResolver">
      <bean class="org.springframework.jms.support.destination.JndiDestinationResolver">
        <property name="jndiEnvironment" ref="env"/>
      </bean>
    </property>
  </bean>
  
  <util:properties id="env">
    <prop key="java.naming.factory.initial">com.ibm.websphere.naming.WsnInitialContextFactory</prop>
    <prop key="java.naming.provider.url">corbaloc:iiop:isis:2809</prop>
  </util:properties>
</beans>

The only requirements are:

  • The three JARs (the SIB thin client, the EJB thin client and the ORB) must be in the classpath.
  • A queue (jms/testQ) and a connection factory (jms/testQCF) must be configured in WebSphere. The provider endpoints must be set manually in the connection factory configuration. If you are using an existing connection factory, remember that specifying the provider endpoints is not required for applications running on WebSphere. Therefore it is possible (and even likely) that they are not set.
  • The provider URL must point to the BOOTSTRAP_ADDRESS of the application server. If the JNDI resources are configured on a WebSphere cluster, use a corbaloc URL with multiple IIOP endpoints.
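For the cluster case mentioned in the last requirement, a corbaloc URL lists the endpoints separated by commas, each with its own protocol prefix. The following sketch builds such a URL from a list of host:port pairs; the corbaloc_url helper and the node host names are made up for illustration.

```shell
#!/bin/sh
# corbaloc_url HOST:PORT...: build a corbaloc URL that lists every given
# bootstrap endpoint with the iiop protocol prefix.
corbaloc_url() {
  url=
  for endpoint in "$@"; do
    url="${url:+$url,}iiop:$endpoint"
  done
  printf 'corbaloc:%s\n' "$url"
}

corbaloc_url isis:2809                                       # single server, as in the configuration above
corbaloc_url node1.example.com:9810 node2.example.com:9810   # hypothetical cluster members
```

The resulting string is what would go into the java.naming.provider.url property.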

The challenge is now to make that configuration work on ServiceMix. We will make the following assumptions:

  • The ServiceMix version is 4.5.3.
  • We will use the libraries from WAS 8.5.5.0.
  • The SIB thin client has already been deployed on ServiceMix using the instructions in my earlier post.

The remaining task is then to deploy the EJB thin client and the ORB. The EJB thin client is actually already packaged as an OSGi bundle, while the ORB is packaged as a fragment that plugs into the EJB thin client. Therefore it should be enough to install these two artifacts into ServiceMix. However, it turns out that this is not as simple as one would expect.

Problem 1: Missing required bundle org.eclipse.osgi

The first problem that appears is that after installing the EJB thin client and the ORB, an attempt to start the EJB thin client bundle results in the following error:

org.osgi.framework.BundleException: Unresolved constraint in bundle com.ibm.ws.ejb.thinclient [182]: Unable to resolve 182.0: missing requirement [182.0] module; (bundle-symbolic-name=org.eclipse.osgi)

Inspection of the manifests of these two artifacts indeed shows that they have the following directive:

Require-Bundle: org.eclipse.osgi

Obviously, IBM packaged these artifacts for the Equinox platform (which is also used by WebSphere itself). Because ServiceMix runs on Apache Felix, the bundle org.eclipse.osgi doesn't exist. Since the EJB thin client bundle has an activator, it is likely that the purpose of this directive is simply to satisfy the dependency on the org.osgi.framework package.

One possible solution for this problem would be to modify the manifests and replace the Require-Bundle directive by an equivalent Import-Package directive. However, there is another solution that doesn't require modifying the IBM artifacts. The idea is to create a "compatibility" bundle with the following manifest (and without any other content):

Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: Equinox compatibility bundle
Bundle-SymbolicName: org.eclipse.osgi
Bundle-Version: 0.0.0
Import-Package: org.osgi.framework
Export-Package: org.osgi.framework

The Export-Package directive makes the org.osgi.framework package available to the EJB thin client bundle. Since the compatibility bundle also imports that package, it will effectively be wired to the bundle that actually contains these classes (which must exist in any OSGi runtime because the org.osgi.framework package is part of the core OSGi API).
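Since the compatibility bundle consists of nothing but a manifest, it can be built with a few shell commands. The sketch below writes the manifest shown above and, if the JDK's jar tool is on the PATH, packages it; the write_compat_manifest helper and the file name equinox-compat.jar are arbitrary choices for this example.

```shell
#!/bin/sh
# write_compat_manifest DIR: write the compatibility manifest to DIR/MANIFEST.MF.
write_compat_manifest() {
  mkdir -p "$1" &&
  cat > "$1/MANIFEST.MF" <<'EOF'
Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: Equinox compatibility bundle
Bundle-SymbolicName: org.eclipse.osgi
Bundle-Version: 0.0.0
Import-Package: org.osgi.framework
Export-Package: org.osgi.framework
EOF
}

workdir=$(mktemp -d)
write_compat_manifest "$workdir"

# Package a JAR that contains only the manifest (needs the JDK's jar tool).
if command -v jar >/dev/null 2>&1; then
  jar cfm "$workdir/equinox-compat.jar" "$workdir/MANIFEST.MF"
fi
```

The resulting JAR can then be installed from the ServiceMix console with osgi:install and a file: URL pointing to it.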

Problem 2: Constraint violation related to javax.transaction.xa

After installing the compatibility org.eclipse.osgi bundle, the EJB thin client bundle still fails to start. The error message is now:

org.osgi.framework.BundleException: Uses constraint violation. Unable to resolve module com.ibm.ws.ejb.thinclient [182.0] because it exports package 'javax.transaction.xa' and is also exposed to it from module org.apache.aries.transaction.manager [58.0] via the following dependency chain:

  com.ibm.ws.ejb.thinclient [182.0]
    import: (package=javax.jms)
     |
    export: package=javax.jms; uses:=javax.transaction.xa
  org.apache.geronimo.specs.geronimo-jms_1.1_spec [48.0]
    import: (package=javax.transaction.xa)
     |
    export: package=javax.transaction.xa
  org.apache.aries.transaction.manager [58.0]

Let's first decode what this actually means. The thin client bundle exports the javax.transaction.xa package, but it doesn't import it. That implies that it can't be wired to the javax.transaction.xa package exported by the Aries transaction manager bundle. At the same time the thin client imports the javax.jms package. The OSGi runtime chooses to wire that import to the Geronimo JMS API bundle. The javax.jms package contains classes that refer to classes in the javax.transaction.xa package as part of their public API (see e.g. XASession). That is expressed by the uses constraint (and a corresponding Import-Package directive) declared by the Geronimo bundle. However, the OSGi runtime cannot wire that import back to the thin client because this would cause a circular dependency; it has to wire it to the Aries bundle. That however would cause an issue because the thin client bundle now "sees" classes in the javax.transaction.xa package loaded from two different bundles (itself and the Aries bundle). Therefore the OSGi runtime refuses to resolve the thin client bundle.

That sounds like a tricky problem, but the solution is astonishingly simple: just remove the Geronimo JMS bundle!

osgi:uninstall geronimo-jms_1.1_spec

After that you should restart ServiceMix so that it can properly rewire all bundles.

To see why this works, let's first note that in ServiceMix, the javax.transaction.xa package is configured for boot delegation (see the org.osgi.framework.bootdelegation property in etc/custom.properties). That means that classes in that package will always be loaded from the boot class loader, i.e. from the JRE. That in turn means that the issue detected by the OSGi runtime will actually never occur: no matter how imports and exports for javax.transaction.xa are formally wired together, it's always the classes from the JRE that will be loaded anyway. The uses:=javax.transaction.xa declaration in the Geronimo bundle is therefore effectively irrelevant and could be ignored.
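To check which packages are actually configured for boot delegation in a given installation, one can read the property from etc/custom.properties. Its value is typically split over several lines with trailing backslashes, so a plain grep is not enough. The sketch below joins the continuation lines first; get_property is a hypothetical helper and only handles the simple key=value syntax.

```shell
#!/bin/sh
# get_property FILE KEY: print the value of KEY from a Java properties file,
# joining lines that end with a backslash (simple key=value syntax only).
get_property() {
  awk -v key="$2" '
    {
      sub(/^[ \t]+/, "")                           # drop leading whitespace
      if (sub(/\\$/, "")) { buf = buf $0; next }   # value continues on the next line
      buf = buf $0
      eq = index(buf, "=")
      if (eq > 0) {
        name = substr(buf, 1, eq - 1)
        value = substr(buf, eq + 1)
        sub(/[ \t]+$/, "", name)
        sub(/^[ \t]+/, "", value)
        if (name == key) print value
      }
      buf = ""
    }' "$1"
}

# When run from the ServiceMix installation directory:
if [ -f etc/custom.properties ]; then
  get_property etc/custom.properties org.osgi.framework.bootdelegation
fi
```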

Now recall that we made the assumption that the SIB thin client bundle is already installed. That bundle exports javax.jms as well, but since it also imports that package, this export will not be used as long as the Geronimo JMS bundle is installed. Let's have a closer look at the imports and exports of that bundle:

karaf@root> osgi:headers com.ibm.ws.sib.client.thin.jms

IBM SIB JMS Thin Client (181)
-----------------------------
Manifest-Version = 1.0
Specification-Title = sibc.client.thin.bundle
Eclipse-LazyStart = true
Specification-Version = 8.5.0
Specification-Vendor = IBM Corp.
Ant-Version = Apache Ant 1.8.2
Copyright = Licensed Materials - Property of IBM  5724-J08, 5724-I63, 5724-H88, 5724-H89, 5655-N02, 5733-W70  Copyright IBM Corp. 2007, 2009 All Rights Reserved.  US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Implementation-Version = WAS855.IM [gm1319.01]
Implementation-Vendor = IBM Corp.
Implementation-Title = sibc.client.thin.bundle
Created-By = pxi3260sr10-20111208_01 (SR10) (IBM Corporation)

Bundle-Vendor = IBM Corp.
Bundle-Localization = plugin
Bundle-RequiredExecutionEnvironment = J2SE-1.5
Bundle-Name = IBM SIB JMS Thin Client
Bundle-SymbolicName = com.ibm.ws.sib.client.thin.jms; singleton:=true
Bundle-Classpath = .
Bundle-Version = 8.5.0
Bundle-ManifestVersion = 2

Import-Package = 
 javax.jms,
 javax.resource,
 javax.resource.spi,
 javax.resource.spi.security,
 javax.management
Export-Package = 
 com.ibm.websphere.sib.api.jms,
 com.ibm.ws.sib.api.jmsra.impl,
 com.ibm.ws.sib.api.jms.impl,
 javax.resource,
 javax.resource.spi,
 javax.resource.spi.security,
 javax.management,
 javax.jms;version=1.1.0
Require-Bundle = 
 system.bundle

Interestingly, javax.transaction.xa isn't mentioned at all. Looking at the content of that bundle, one can also see that it doesn't contain that package either. This means that the SIB thin client was packaged with the assumption that javax.transaction.xa is configured for boot delegation (while the Geronimo JMS API bundle doesn't rely on that assumption). This is exactly what we need in our case. By removing the Geronimo bundle, we force the OSGi runtime to use the javax.jms package exported by the SIB thin client, and that solves the issue.

The EJB thin client indeed starts properly after doing that:

[ 181] [Active     ] [            ] [       ] [   80] IBM SIB JMS Thin Client (8.5.0)
[ 182] [Active     ] [            ] [       ] [   80] WebSphere EJB Thin Client Runtime (8.0.0)
                                       Fragments: 183
[ 183] [Resolved   ] [            ] [       ] [   80] WebSphere ORB Fragment (8.0.0)
                                       Hosts: 182
[ 185] [Active     ] [            ] [       ] [   80] Equinox compatibility bundle (0.0.0)

Problem 3: Inconsistent javax.resource.spi packages

We can now deploy the Spring configuration shown earlier. It deploys and starts successfully, but when trying to use it (by dropping a file into the test directory), an error occurs. The relevant part of the stack trace is as follows:

com.ibm.websphere.naming.CannotInstantiateObjectException: Exception occurred while the JNDI NamingManager was processing a javax.naming.Reference object. [Root exception is java.lang.NoClassDefFoundError: Ljavax/resource/spi/TransactionSupport$TransactionSupportLevel;]
  at com.ibm.ws.naming.util.Helpers.processSerializedObjectForLookupExt
  at com.ibm.ws.naming.util.Helpers.processSerializedObjectForLookup
  at com.ibm.ws.naming.jndicos.CNContextImpl.processBoundObjectForLookup
  at com.ibm.ws.naming.jndicos.CNContextImpl.processResolveResults
  at com.ibm.ws.naming.jndicos.CNContextImpl.doLookup
  at com.ibm.ws.naming.jndicos.CNContextImpl.doLookup
  at com.ibm.ws.naming.jndicos.CNContextImpl.lookupExt
  at com.ibm.ws.naming.jndicos.CNContextImpl.lookup
  at com.ibm.ws.naming.util.WsnInitCtx.lookup
  at com.ibm.ws.naming.util.WsnInitCtx.lookup
  at javax.naming.InitialContext.lookup
  at org.springframework.jndi.JndiTemplate$1.doInContext
  at org.springframework.jndi.JndiTemplate.execute
  at org.springframework.jndi.JndiTemplate.lookup
  at org.springframework.jndi.JndiTemplate.lookup
  at org.springframework.jndi.JndiLocatorSupport.lookup
  at org.springframework.jndi.JndiObjectLocator.lookup
  at org.springframework.jndi.JndiObjectTargetSource.getTarget
  ... 50 more
Caused by: java.lang.NoClassDefFoundError: Ljavax/resource/spi/TransactionSupport$TransactionSupportLevel;
  at java.lang.Class.getDeclaredFields0
  at java.lang.Class.privateGetDeclaredFields
  at java.lang.Class.getDeclaredField
  at java.io.ObjectStreamClass.getDeclaredSUID
  at java.io.ObjectStreamClass.access$700
  at java.io.ObjectStreamClass$2.run
  at java.security.AccessController.doPrivileged
  at java.io.ObjectStreamClass.<init>
  at java.io.ObjectStreamClass.lookup
  at java.io.ObjectStreamClass.initNonProxy
  at java.io.ObjectInputStream.readNonProxyDesc
  at java.io.ObjectInputStream.readClassDesc
  at java.io.ObjectInputStream.readOrdinaryObject
  at java.io.ObjectInputStream.readObject0
  at java.io.ObjectInputStream.readObject
  at com.ibm.ejs.j2c.ConnectionFactoryBuilderImpl.getObjectInstance
  at javax.naming.spi.NamingManager.getObjectInstance
  at com.ibm.ws.naming.util.Helpers.processSerializedObjectForLookupExt
  ... 67 more

The error occurs when Spring attempts to look up the JMS connection factory from JNDI. It is actually caused by an issue in the packaging of the IBM artifacts. The SIB and EJB thin clients both have javax.resource.spi in their Import-Package and Export-Package directives. Since that package is not exported by any other bundle deployed on ServiceMix, the OSGi runtime has two possibilities to resolve this situation: either it wires the javax.resource.spi import from the EJB thin client to the SIB thin client bundle or vice versa. The problem is that the javax.resource.spi package in the SIB thin client is incomplete: it contains fewer classes than the same package in the EJB thin client. If the OSGi runtime selects the package from the SIB thin client bundle, then this leads to the NoClassDefFoundError shown above.

One solution would be to change the order of installation of the two bundles in order to convince the OSGi runtime to select the javax.resource.spi exported by the EJB thin client. However, this would be a very fragile solution. A better solution is to add another bundle that exports the full javax.resource.spi package (without importing it). In that case, the OSGi runtime only has a single possibility to wire the imports/exports for that package, namely to use the version exported by the third bundle. Such a bundle actually exists in the WebSphere runtime and adding it to ServiceMix indeed solves the problem:

osgi:install file:///opt/IBM/WebSphere/AppServer/plugins/javax.j2ee.connector.jar

Problem 4: Class loading issues related to the IBM ORB

After installing that bundle, you should restart ServiceMix to allow it to rewire the bundles properly. The JNDI lookup of the connection factory now succeeds, but another failure occurs when Spring tries to create a connection:

java.lang.NoClassDefFoundError: com/ibm/CORBA/iiop/ORB
  at java.lang.Class.forName0
  at java.lang.Class.forName
  at com.ibm.ws.util.PlatformHelperFactory.getBackupHelper
  at com.ibm.ws.util.PlatformHelperFactory.getPlatformHelper
  at com.ibm.ws.sib.trm.client.TrmSICoreConnectionFactoryImpl.<clinit>
  at java.lang.Class.forName0
  at java.lang.Class.forName
  at com.ibm.ws.sib.trm.TrmSICoreConnectionFactory.<clinit>
  at com.ibm.wsspi.sib.core.selector.SICoreConnectionFactorySelector.getSICoreConnectionFactory
  at com.ibm.wsspi.sib.core.selector.SICoreConnectionFactorySelector.getSICoreConnectionFactory
  at com.ibm.ws.sib.api.jmsra.impl.JmsJcaConnectionFactoryImpl.createCoreConnection
  at com.ibm.ws.sib.api.jmsra.impl.JmsJcaConnectionFactoryImpl.createCoreConnection
  at com.ibm.ws.sib.api.jmsra.impl.JmsJcaConnectionFactoryImpl.createConnection
  at com.ibm.ws.sib.api.jms.impl.JmsManagedConnectionFactoryImpl.createConnection
  at com.ibm.ws.sib.api.jms.impl.JmsManagedConnectionFactoryImpl.createConnection
  at sun.reflect.NativeMethodAccessorImpl.invoke0
  at sun.reflect.NativeMethodAccessorImpl.invoke
  at sun.reflect.DelegatingMethodAccessorImpl.invoke
  at java.lang.reflect.Method.invoke
  at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection
  at org.springframework.aop.framework.JdkDynamicAopProxy.invoke
  at com.sun.proxy.$Proxy52.createConnection
  at org.springframework.jms.support.JmsAccessor.createConnection
  at org.springframework.jms.core.JmsTemplate.execute
  at org.apache.camel.component.jms.JmsConfiguration$CamelJmsTemplate.send
  at org.apache.camel.component.jms.JmsProducer.doSend
  at org.apache.camel.component.jms.JmsProducer.processInOnly
  at org.apache.camel.component.jms.JmsProducer.process
  ... 42 more
Caused by: java.lang.ClassNotFoundException: com.ibm.CORBA.iiop.ORB not found by com.ibm.ws.sib.client.thin.jms
  at org.apache.felix.framework.ModuleImpl.findClassOrResourceByDelegation
  at org.apache.felix.framework.ModuleImpl.access$400
  at org.apache.felix.framework.ModuleImpl$ModuleClassLoader.loadClass
  at java.lang.ClassLoader.loadClass
  ... 70 more

Interestingly, if one installs the ORB fragment before the EJB thin client, then the Camel route fails much earlier (during the creation of the InitialContext) with an error that is different, but related to the same com.ibm.CORBA.iiop.ORB class:

org.omg.CORBA.INITIALIZE: can't instantiate default ORB implementation com.ibm.CORBA.iiop.ORB  vmcid: 0x0  minor code: 0  completed: No
  at org.omg.CORBA.ORB.create_impl
  at org.omg.CORBA.ORB.init
  at com.ibm.ws.orb.GlobalORBFactory.init
  at com.ibm.ejs.oa.EJSORBImpl.initializeORB
  at com.ibm.ejs.oa.EJSClientORBImpl.<init>
  at com.ibm.ejs.oa.EJSClientORBImpl.<init>
  at com.ibm.ejs.oa.EJSORB.init
  ... 68 more
Caused by: java.lang.ClassCastException: com.ibm.CORBA.iiop.ORB cannot be cast to org.omg.CORBA.ORB
  at org.omg.CORBA.ORB.create_impl
  ... 74 more

It is not clear whether this is a packaging issue in the IBM artifacts or a bug in Karaf/Felix. A solution is to add the ORB to the endorsed libraries instead of installing it as an OSGi fragment. At first this might seem to be an ugly workaround, but it actually makes sense. In the IBM JRE, these classes are part of the runtime libraries. By adding them to the endorsed libraries one basically makes the Oracle JRE look a bit more like an IBM JRE.

Note that if we endorse the IBM ORB, then it is actually more natural to use the JARs shipped with the IBM JRE instead of the ORB bundle. These JARs can be found in the java/jre/lib directory in the WebSphere installation. We need the following JARs from that directory: ibmcfw.jar, ibmorb.jar and ibmorbapi.jar. After copying these files to the lib/endorsed directory in the ServiceMix installation, remove the ORB fragment:

osgi:uninstall com.ibm.ws.orb

To make the ORB classes visible to the EJB thin client it is necessary to add org.omg.* and com.ibm.* to the org.osgi.framework.bootdelegation property in etc/custom.properties. Note that this would also be necessary on an IBM JRE. It means that the EJB thin client assumes that boot delegation is enabled for these packages.

Problem 5: Missing classes from com.ibm.ws.bootstrap

After restarting ServiceMix, one now gets the following error:

java.lang.NoClassDefFoundError: Could not initialize class com.ibm.ws.sib.trm.client.CredentialType

If one looks at the first occurrence of the error, one can see that the failure to initialize the CredentialType class is caused by the following exception:

java.lang.ClassNotFoundException: com.ibm.ws.bootstrap.BootHandlerException not found by com.ibm.ws.sib.client.thin.jms

Inspection of the content of the JMS thin client bundle shows that it contains the com.ibm.ws.bootstrap package, but is missing the BootHandlerException class. That class is actually part of lib/bootstrap.jar in the WebSphere runtime. We could add that JAR to the endorsed libraries, but in contrast to the ORB classes, this is not a natural solution. It is actually enough to add it to the main class loader used to load Karaf. This can be done by copying the JAR to the lib directory in the ServiceMix installation. Note that the classes will be visible to the SIB thin client because we already added com.ibm.* to the boot delegation list before.

After adding bootstrap.jar and restarting ServiceMix, the sample route now executes successfully! Interestingly, bootstrap.jar is not required when executing the sample in a Java SE environment. This means that the issue occurs on a code path that is only executed in an OSGi environment.

Summary and conclusion

To summarize, the following steps are necessary to deploy the SIB and EJB thin clients as OSGi bundles in ServiceMix:

  1. Copy the following files from the WebSphere installation to the lib/endorsed directory:

    • java/jre/lib/ibmcfw.jar
    • java/jre/lib/ibmorb.jar
    • java/jre/lib/ibmorbapi.jar

    On an IBM JRE, this step would be skipped.

  2. Copy lib/bootstrap.jar from the WebSphere installation to the lib directory.

  3. Add org.omg.* and com.ibm.* to the org.osgi.framework.bootdelegation property in etc/custom.properties.

  4. Create and install the Equinox compatibility bundle as described above.

  5. Install the following bundles from the WebSphere runtime:

    • plugins/javax.j2ee.connector.jar
    • runtimes/com.ibm.ws.ejb.thinclient_8.5.0.jar
    • runtimes/com.ibm.ws.sib.client.thin.jms_8.5.0.jar

  6. Uninstall the geronimo-jms_1.1_spec bundle.
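Steps 1 and 2 are plain file copies and can be scripted. The following is a sketch under the assumption that both installations are reachable from the same file system; the stage_was_libs helper is hypothetical, and the ServiceMix path in the example invocation is a placeholder for your own setup.

```shell
#!/bin/sh
# stage_was_libs WAS_HOME SMX_HOME: copy the ORB JARs to lib/endorsed (step 1)
# and bootstrap.jar to lib (step 2). Both arguments are installation roots.
stage_was_libs() {
  was_home=$1
  smx_home=$2
  mkdir -p "$smx_home/lib/endorsed" || return 1
  for jar in ibmcfw.jar ibmorb.jar ibmorbapi.jar; do
    cp "$was_home/java/jre/lib/$jar" "$smx_home/lib/endorsed/" || return 1
  done
  cp "$was_home/lib/bootstrap.jar" "$smx_home/lib/" || return 1
}

# Example invocation (the second path is an assumption about the local setup):
#   stage_was_libs /opt/IBM/WebSphere/AppServer /opt/apache-servicemix-4.5.3
```

The remaining steps are carried out in etc/custom.properties and in the ServiceMix console, followed by a restart.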

We have also seen that the SIB and EJB thin client bundles have several packaging issues. In particular they appear to have been bundled under the assumption that a certain number of packages are configured for boot delegation. As already argued in relation to the dependency on the org.eclipse.osgi bundle, the reason is probably that they were created for Equinox as a target OSGi runtime. In fact, the assumptions made about boot delegation are compatible with the default configuration in Equinox. What is more interesting is the fact that these packages also include com.ibm.ws.bootstrap. That package is visible through boot delegation only in WebSphere, but the thin clients are obviously not supposed to be deployed as OSGi bundles in WebSphere...

It should also be noted that the solution was only tested with a very simple scenario. It is possible that in more complex scenarios, additional issues arise.

Finally, given how difficult it is to install the thin clients as OSGi artifacts, one may reasonably argue that it might actually be simpler to just repackage them...

Monday, November 4, 2013

How to divide a WebSphere topology into cells

A WebSphere cell is a logical grouping of nodes (each of which runs one or more application servers) that are centrally managed:

  • There is a single configuration repository for the entire cell. Each individual node receives a read-only copy of the part of the configuration relevant for that node.
  • There is a single administrative console for the entire cell. This console is hosted on the deployment manager and can be used to manage the configuration repository as well as the runtime state of all WebSphere instances in the cell.
  • The MBean servers in the cell are federated. By connecting to the deployment manager, one can interact with any MBean on any WebSphere instance in the cell.

One of the primary tasks when designing a WebSphere topology is to decide how WebSphere instances should be grouped into cells. There is no golden rule, and this generally requires a tradeoff between multiple considerations:

  1. Applications deployed on different clusters can easily communicate over JMS if the clusters are in the same cell. The reason is that SIBuses are cell scoped resources and that each WebSphere instance in a cell has information about the topology of the cell, so that it can easily locate the messaging engine to connect to. This means that making two applications in the same cell interact with each other over JMS only requires minimal configuration, even if they are deployed on different clusters. On the other hand, doing this for applications deployed in different cells requires more configuration because WebSphere instances in one cell are not aware of the messaging topology in the other cell.

  2. Setting up remote EJB calls over IIOP between applications deployed on different clusters is easier if the clusters are in the same cell: the applications don't need to make any particular provisions to support this, and no additional configuration is required on the server. In that case, making two applications interact over IIOP only requires using a special JNDI name (such as cell/clusters/cluster1/ejb/SomeEJB) that routes the requests to the right target cluster. On the other hand, doing this for applications deployed in different cells requires additional configuration:

    • A foreign cell binding needs to be created between the cells.
    • For cells where security is enabled, it is also required to establish trust between these cells, i.e. to exchange the SSL certificates and to synchronize the LTPA keys.
    • Routing and workload management for IIOP works better inside a cell (actually inside a core group, but there is generally a single core group for the entire cell), because the application server that hosts the calling application knows about the runtime state of the members of the target cluster. To get the same quality of service for IIOP calls between different cells it is necessary to set up core group bridges between the core groups in these cells, and the complexity of the bridge configuration is O(N²), where N is the number of cells involved.

  3. Applications are defined at cell scope and then mapped to target servers and clusters. This implies that application names must be unique in a cell and that it is not possible to deploy multiple versions of the same application under the same name. Deploying multiple versions of the same application therefore requires renaming that application (by changing the value of the display-name element in the application.xml descriptor). Note that this works well for J2EE applications, but not for SCA modules deployed on WebSphere Process Server or ESB. The reason is that during the deployment of an SCA module, WebSphere automatically creates SIBus resources with names that depend on the original application name. In this case, changing application.xml is not enough.

  4. A single Web server instance can be used as a reverse proxy for multiple clusters. However, WebSphere can only maintain the plug-in configuration automatically if the Web server and the clusters are all part of the same cell. Using a single Web server for multiple clusters in different cells is possible but additional procedures are required to maintain that configuration. This means that the larger the cells are, the more flexibility one has for the Web server configuration.

  5. One possible strategy to upgrade WebSphere environments to a new major version is to migrate the configuration as is using the tools (WASPreUpgrade and WASPostUpgrade) provided by IBM. The first step in this process is always to migrate the deployment manager profile. WebSphere supports mixed version cells (as long as the deployment manager has the highest version), so that the individual nodes can be migrated one by one later. Larger cells slightly reduce the amount of work required during an upgrade (because there are fewer deployment managers), but at the price of increased risk: if something unexpected happens during the migration of the deployment manager, the impact will be larger and more difficult to manage.

  6. Some configurations are done at the cell level. This includes e.g. the security configuration (although that configuration can be overridden at the server level). Having larger cells reduces the amount of work required to apply and maintain these configurations.

  7. There are good reasons to use separate cells for products that augment WebSphere Application Server (such as WebSphere Process Server), although technically it is possible to mix different products in the same cell:

    • The current release of such a product is not necessarily based on the latest WebSphere Application Server release. Since the deployment manager must be upgraded first, this may block the upgrade to a newer WebSphere Application Server release.
    • Typically, upgrades of products such as WPS are considerably more complex than WAS upgrades. If both products are mixed in a single cell, then this may slow down the adoption of new WAS versions.
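
As a minimal sketch of the renaming mentioned in point 3, the following application.xml descriptor gives a second version of an application its own display name, so that it can be installed next to the first one in the same cell. The application name, the WAR file name and the context root are hypothetical, and a Java EE 5 descriptor is assumed:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<application xmlns="http://java.sun.com/xml/ns/javaee"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                                 http://java.sun.com/xml/ns/javaee/application_5.xsd"
             version="5">
  <!-- Hypothetical name: the version suffix keeps the name unique in the cell -->
  <display-name>OrderApp-v2</display-name>
  <module>
    <web>
      <!-- Hypothetical module; a distinct context root avoids a clash with v1 -->
      <web-uri>orderapp.war</web-uri>
      <context-root>/orderapp-v2</context-root>
    </web>
  </module>
</application>
```

As noted above, this is only sufficient for plain J2EE applications; for SCA modules the generated SIBus resource names would still collide.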

Some of these arguments are in favor of larger cells, while others are in favor of smaller cells. There is no single argument that can be used to determine the cell topology and one always has to make a tradeoff between multiple considerations. There are however two rules that should always apply:

  • A cell should never span multiple environments (development, acceptance, production, etc.).
  • There is a document from IBM titled Best Practices for Large WebSphere Application Server Topologies that indicates that a (single cell) topology is considered large if it contains dozens of nodes with hundreds of application servers. Most organizations are far from these numbers, so in practice one can usually consider that there is no upper limit on the number of application servers in a cell.

Sunday, November 3, 2013

Integrating ServiceMix with WebSphere's SIBus

This article describes how to integrate Apache ServiceMix with WebSphere's SIBus. More precisely we will explore how to deploy a Camel route that sends messages to a SIBus destination in WebSphere. We assume that connection factories and queue objects are created using the API described in the Programming to use JMS and messaging directly page in the WebSphere infocenter instead of looking them up using JNDI. This makes the configuration considerably simpler because there is no need to create JNDI objects in the WebSphere configuration.

In this scenario, it's enough to install the SIB thin client and we don't need the EJB thin client and IBM ORB (as would be the case in a scenario that uses JNDI lookups). The SIB thin client can be found in the runtimes directory of the WebSphere installation. It is actually packaged as an OSGi bundle that can be deployed out of the box in ServiceMix. This has been successfully tested with the client from WAS 7.0.0.25 and 8.5.5.0. Note that earlier 8.5 versions seem to have some issues because they actually require the EJB thin client and IBM ORB.

To deploy the SIB thin client, simply use the following command in the ServiceMix console (adapt the path and version as necessary):

osgi:install -s file:///opt/IBM/WebSphere/AppServer/runtimes/com.ibm.ws.sib.client.thin.jms_8.5.0.jar

The thin client should then appear in the list of deployed bundles as follows (use the osgi:list command to display that list):

[ 182] [Active     ] [            ] [       ] [   80] IBM SIB JMS Thin Client (8.5.0)

We can now create and deploy a Camel route. We will do that using a Spring context file. As mentioned earlier, the necessary connection factory and queue objects will be created using the com.ibm.websphere.sib.api.jms.JmsFactoryFactory API. Since Spring supports creating beans using static and instance factory methods (including factory methods with parameters) this can be done entirely in the Spring configuration without writing any code.

The following sample configuration sets up a Camel route that reads files from a directory and sends them to a SIBus destination:

<beans xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
    http://www.springframework.org/schema/beans
    http://www.springframework.org/schema/beans/spring-beans.xsd
    http://camel.apache.org/schema/spring
    http://camel.apache.org/schema/spring/camel-spring.xsd">

  <camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
      <from uri="file://test"/>
      <to uri="sib:queue:testQ"/>
    </route>
  </camelContext>
  
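  <!-- Obtains the JmsFactoryFactory instance through its static getInstance() method -->
  <bean id="jmsFactoryFactory" class="com.ibm.websphere.sib.api.jms.JmsFactoryFactory"
        factory-method="getInstance"/>
  
  <!-- Calls createQueue("queue://test") on the factory; the bean id is the name used in the route URI -->
  <bean id="testQ"
        factory-bean="jmsFactoryFactory" factory-method="createQueue">
    <constructor-arg>
      <value>queue://test</value>
    </constructor-arg>
  </bean>
  
  <!-- Calls createConnectionFactory() on the factory, then sets the bus name and bootstrap endpoint on the result -->
  <bean id="testCF" factory-bean="jmsFactoryFactory" factory-method="createConnectionFactory">
    <property name="busName" value="test"/>
    <property name="providerEndpoints" value="isis:7276:BootstrapBasicMessaging"/>
    <property name="targetTransportChain" value="InboundBasicMessaging"/>
  </bean>
  
  <!-- Camel JMS component behind the "sib:" URI scheme; destination names are resolved as Spring bean ids -->
  <bean id="sib" class="org.apache.camel.component.jms.JmsComponent">
    <property name="connectionFactory" ref="testCF"/>
    <property name="destinationResolver">
      <bean class="org.springframework.jms.support.destination.BeanFactoryDestinationResolver"/>
    </property>
  </bean>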
</beans>

To run this sample, change the queue name, bus name and the provider endpoint as required by your environment. Then copy the Spring context to the deploy directory in your ServiceMix installation. This should create a test directory where you can put the files to be sent to the SIBus destination.
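
The same bean definitions should also work in the opposite direction, although that was not part of the test described here. As a sketch under that assumption, replacing the camelContext element in the sample with the following one would consume messages from the SIBus destination and write each of them to a file. The received directory name is hypothetical; the jmsFactoryFactory, testQ, testCF and sib beans stay as they are:

```xml
<camelContext xmlns="http://camel.apache.org/schema/spring">
  <route>
    <!-- "testQ" is resolved to the Spring bean of that name by the destination resolver -->
    <from uri="sib:queue:testQ"/>
    <!-- Hypothetical target directory, relative to the ServiceMix working directory -->
    <to uri="file://received"/>
  </route>
</camelContext>
```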