Mule & Ehcache

This is about issues I ran into when I wanted to use ehcache in a mule application, but not within the flow (i.e., not to cache the results of a flow component) and not as part of a spring bean. I had a somewhat complex parser, and wanted to avoid re-parsing strings by caching the results of previously parsed ones.

Sounds easy enough… I built an ehcache.xml and put it in my src/main/app directory. I created a singleton for my service with the tried & true GOF Singleton pattern. My constructor looked like this (keep in mind this is Groovy):

    private Parser() {
        CacheManager cacheMgr = CacheManager.newInstance()
                
        //Initialize a cache if it does not already exist
        if (cacheMgr.getCache("myCache") == null) {
            cacheMgr.addCache("myCache")
        }
                
        cache = cacheMgr.getCache("myCache")
    }
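
For reference, the ehcache.xml itself was nothing exotic. Something along these lines would do (the cache settings here are illustrative, not our actual values):

    <ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:noNamespaceSchemaLocation="ehcache.xsd">
        <!-- addCache("myCache") creates the cache from these default settings -->
        <defaultCache maxElementsInMemory="1000"
                      eternal="false"
                      timeToIdleSeconds="300"
                      timeToLiveSeconds="600"/>
    </ehcache>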

And I used the cache in this simple way:

    public String[] parse( String name ) {
        String[] namePair
        
        Element el = cache.get(name)
        
        if( !el ) {
            namePair = doMyComplexParsing( name )
            cache.put(new Element( name, namePair) )    
        } else {
            namePair = el.objectValue as String[]
        }
        
        return namePair
    }
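
Callers simply go through the singleton accessor; with a hand-rolled GOF singleton that's the usual static getInstance() (the accessor name is assumed here):

    String[] namePair = Parser.getInstance().parse( 'Some Complicated Name' )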

In my unit tests, this all worked perfectly. Unfortunately, when I deployed my mule app, I received a message that the ehcache.xml file could not be found, so ehcache-failsafe.xml was loaded & used instead. Not good. I double-checked the deployed app; indeed, the ehcache.xml file was right where I expected it: at the root of my zip, and in the classes directory. It certainly should have been found.

What I suspect is happening is that ehcache isn't searching the classpath using the application's classloader, but the mule server's instead. Hence, the ehcache.xml isn't where it is expected to be, relative to the mule server.

Fortunately, we hit upon a workaround. By obtaining the path to the loaded singleton, we can construct the CacheManager specifying this path as the one to use when loading ehcache.xml. Once we did that … all was good. Here’s what the resulting singleton constructor looked like:

    private Parser() {
        // Locate this class on disk, then strip off the package path to get the
        // application's classes directory, which is where ehcache.xml lives
        String directPath = this.class.getClassLoader().getResource("com/path/to/service/Parser.class").getPath()
        directPath = directPath.substring(0, directPath.indexOf( "/com/path/to/service/Parser.class" ) )

        // Point the CacheManager directly at our application's ehcache.xml
        CacheManager cacheMgr = CacheManager.newInstance( directPath + "/ehcache.xml" )

        //Initialize a cache if it does not already exist
        if (cacheMgr.getCache("myCache") == null) {
            cacheMgr.addCache("myCache")
        }

        cache = cacheMgr.getCache("myCache")
    }

Mule, maven, and provided-scope dependencies

We have a few mule projects which have started using the Microsoft jdbc 4.0 driver (namely, sqljdbc_auth.dll and the sqljdbc4-4.0.jar). One thing we ran into when we attempted to install multiple mule apps that used the driver into the same mule app server is that we'd get an error message that the dll was already loaded … and of course, at that point, an exception would be thrown and the app was essentially, uselessly, dead.

The solution is actually pretty straight-forward. You first want to specify the dependency on the driver as being a “provided” scope dependency within maven. Something like this works just fine:

<dependency>
  <groupId>com.microsoft.sqlserver</groupId>
  <artifactId>sqljdbc4</artifactId>
  <version>4.0</version>
  <scope>provided</scope>
</dependency>

This way, when you build the deployment zip file for your app, the sqljdbc4-4.0.jar is not included.

Next, make sure your mule app server has the sqljdbc_auth.dll in the $MULE_HOME/lib/native directory, and the sqljdbc4-4.0.jar in the $MULE_HOME/lib/user directory.

Finally, modify the mule wrapper.conf file to include the setting of the java library path so that provided dlls are picked up when the mule app container is started. We used

wrapper.java.library.path.3=C:\esb\mule\mymuleserver\lib\native

(where, of course, you replace C:\esb\mule\mymuleserver with the path to your own server, $MULE_HOME).

Now, make sure no other app in that particular mule app server bundles the provided dependency (in our case, the Microsoft jdbc driver). If one does … you'll have to modify that project's dependency as described above.

If you’re clean … restart the mule server. The sqljdbc_auth.dll will now be loaded in the context of the mule server and not individual apps, and you won’t have collisions with multiple apps trying to load the dll.

This is applicable, of course, to any dependency on a jar and/or dll which may cause collisions when loaded/used by more than one app in a single mule container. The names may change, but the story is the same …


Mule and getting the generated id of a newly inserted row …

There's not a whole lot of documentation (that I found very clear, at least), so I thought I'd share this little Mule tidbit. In our flow, we end up inserting a single row into two SQL Server tables; the row that is inserted into the second table is linked to the first via a foreign key relationship to the primary key (auto-generated by the db) of the row we added to the first table. The question becomes … how do I get the id of the first row after I've inserted it?

The documentation for this is here, though it’s not really all that clear how to use it. What we’ve done is basically this:

First, do a <db:insert … > of the data that is going into the first table:

<db:insert config-ref="Generic_Database_Configuration" autoGeneratedKeys="true" autoGeneratedKeysColumnNames="MY_PRIMARY_KEY">
  <db:parameterized-query>
    <![CDATA[
      INSERT INTO my.DATA_TABLE1 ( FIELD1, FIELD2 )
      VALUES( #[payload['value1']], #[payload['value2']] )
    ]]>
  </db:parameterized-query>
</db:insert>

The key is that when this runs, the autogenerated primary key, which is stored in the column "MY_PRIMARY_KEY", will be returned from the insert. The question now is … how do you get that key? This is the part that we had to dig for. What we found out is that if you're using h2, like we are for testing purposes, the return parameter is named "SCOPE_IDENTITY()". If you are using MS SQL Server, then it's returned as "GENERATED_KEYS". So we created a variable this way:

<set-variable variableName="primaryId" value="#[groovy: message.payload[0]['GENERATED_KEYS'] ?: message.payload[0]['SCOPE_IDENTITY()']]"/>

This will work for either h2 or the version of MS SQL Server that we are using. You can then easily use this variable in the next insert, something like this:

<db:insert config-ref="Generic_Database_Configuration">
  <db:parameterized-query>
    <![CDATA[
      INSERT INTO my.DATA_TABLE2( FK_FIELD, FIELD3 )
      VALUES( #[primaryId], #[payload['value3']] )
    ]]>
  </db:parameterized-query>
</db:insert>

Note that we don't use the autoGeneratedKeys and autoGeneratedKeysColumnNames attributes here, because we aren't interested in the id of the row we insert this time.

And there you have it — insert a row, get the id, and insert a linked row.


Splunk Dashboards and linking panels

Ah, this is an easy one, once you know how. Working with Splunk dashboards, we have a case where we have a couple of panels stacked on the left-hand side of the screen, and on the right-hand side, we have other panels that are dependent upon data on the left. Once you click to drill down onto a data item, the right-hand panel should update to reflect the selection on the left.

The initial approach was just to manually edit the xml for the page and add a drilldown section like this:

<drilldown>
  <link>...</link>
</drilldown>

The problem with doing this is that the entire dashboard will refresh (and you have to be careful to include all parameters for all panels on the page if you want just the one update to occur).

Instead, if you read carefully near the end of the documentation here, you can just do something like this:

<drilldown>
  <set token="<field name>">$fieldvalue$</set>
</drilldown>

And voilà! Now clicking to drill down doesn't refresh the entire page via a link; instead, the dependent panel picks up the token referenced in its search, and only that panel is refreshed.
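
For completeness, the dependent panel just references that token in its own search. A sketch (the panel, token, and field names here are made up, and the exact search element syntax varies by Splunk version):

<panel>
  <title>Details for $selected_host$</title>
  <table>
    <search>
      <query>index=main host=$selected_host$ | stats count by source</query>
    </search>
  </table>
</panel>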


Loading data into h2 via a csv file

On the project I'm working on, we use h2 in our development environments, along with liquibase to manage the database configuration in all the environments. This way, we get a nice clean database during development, and while running all the various unit tests that we write. Our spring configuration includes a profile that we activate in our development environment, and that profile kicks off a liquibase configuration to load the database with test data via csv files.

What we noticed recently was that after adding a moderately significant amount of data to support a new feature we were working on, our build times increased dramatically. The loading of the data takes a significant amount of time, and every integration-level test loads this data when it spins up. Clearly not every integration test requires the same data, so what we needed was a way to load data via csv files on demand, and skip loading the data via liquibase at the startup of each test.

After a fair amount of research, combing through h2 documentation and quite a number of forum posts, what I came up with is:

void loadDataFromCSV( String table, String csvResourceName ) {
  // Pick off the header row of the csv, which holds the column names
  def stream = Thread.currentThread().getContextClassLoader().getResourceAsStream( csvResourceName )
  def line = stream.readLines()[0]
  def path = "classpath:${csvResourceName}"

  // Let h2's CSVREAD pull the csv straight off the classpath, treating the text NULL as a null value
  Query query = entityManager.createNativeQuery( "insert into ${table}(${line}) (select * from CSVREAD( '${path}', null, 'null=NULL') )" )
  query.executeUpdate()
}

Nice and compact, this takes the name of the table for which we want data inserted, and the name of a csv file located in a classpath-accessible location (for example, within our src/test/resources directory). The method then gets an InputStream to the file and picks off the first line (yes, inefficient for a large file, as readLines() reads the entire file), which contains the column names … and passes everything to an h2 query using CSVREAD to draw in the classpath-accessible csv file.
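
For illustration, the csv files are the typical header-plus-rows layout (the data here is made up):

FIELD1,FIELD2
first value,42
second value,NULL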

The last tidbit in the CSVREAD is to set the interpretation of null values to be the text NULL. Liquibase, it seems, uses null or NULL to represent a null value, but h2 defaults to an empty string to represent null. We wanted consistency between the load of data via liquibase (we still use that when launching the application locally) and the load of data via h2. The one concession is that the all-uppercase NULL is the only representation of a null value we’ll use.
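
A test then just loads what it needs up front, something like this (a JUnit-style setup; the table and resource names are made up):

@Before
void setUp() {
  loadDataFromCSV( 'my.DATA_TABLE1', 'testdata/data_table1.csv' )
  loadDataFromCSV( 'my.DATA_TABLE2', 'testdata/data_table2.csv' )
}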

So far, this is working out quite well. By replacing the generic dependency on a preloaded database with this small method and loading data as required in each test, we’ve dropped our build times substantially (going from 22 minutes back down to a more normal 8 minutes).


TeamCity and a failure during patch generation

Okay … made some changes to a project, leading up to deployment, when suddenly my CI build starts to fail. The error is “Failed to start build” with additional log information like “Failed to build patch for build #1655 (build id=15011), VCS root …. blah blah” and ending with “Cannot build patch: java.io.IOException: Low-level patch stream is closed.”

The logs are not very helpful, either. We’re using git, and TeamCity 8.0.3 (yeah, old, but the wheels turn slowly here). Some digging suggested that yeah, there’s an issue in 8.0.3, and it’s fixed in 8.0.6 … but in the meantime, pull the sources via the build server, not the TeamCity server. You can do that by editing your build configuration, going to the version control settings, and changing the VCS checkout mode to “Automatically on agent (if supported by VCS roots)”.

So, we did that only to find that git isn't installed on the build box. No problem … grab msysgit and install and … get an error in the build that the checkout failed. Not much info available. Try to run git by hand and see that it's picking up libcurl.dll from some location other than the git install. Fortunately, that error turned up more information, and the most useful thing I found is here. In the end, just copy C:\Program Files\Git\bin\libcurl.dll into C:\Program Files\Git\libexec\git-core and you'll have no impact on any software … git will just look locally for that dll before checking for system-level installed versions.

Of course, why would some other software decide that the system directory is a good place to drop a dll, where other apps will pick it up? Grrrr…


MS Sql Server/Spring/Hibernate and deadlocks

Okay, been battling this one over the last week. We have the ability for users to submit report requests to our web server. As some reports take a while, what we do is queue the report request, and control how many simultaneous requests we process. Easy enough … we have a request table, plus a request parameters table, linked through a foreign key.

Unfortunately, under load, when a new report request is created, we end up getting occasional deadlocks. I've pored through the code again and again, looking for what might cause the deadlock, and came across this. Apparently this can happen (from what I can tell) without a really clear cause. With MS SQL Server locking at a row level, and potentially upgrading the lock to a page or table lock, why would inserting multiple requests result in a deadlock? A block, and a pause while one request is inserted due to a lock, I can understand. But a deadlock?

My first thought was to not rely on Hibernate writing the data quite as much, at least not in terms of having the entire ReportRequest written. With SQL logging turned on, I noticed that the report parameters were being written out as well as the report results, even though they weren't dirty. So I whipped up some HQL to update the parameters just once, with the initial request, so that only the report request table would be updated as the report status (queued, running, finished) or the report data itself changed.

No dice. Still ran into deadlocks. I figured then, if I can't prevent the deadlock from happening, the next best thing would be to handle the outcome of the deadlock and perform a retry of the transaction. I grabbed some common retry logic that we had crafted a while back and had some measure of success. Deadlocks would happen, and the retry logic worked perfectly … reports were being completed successfully.
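
Our retry logic isn't anything fancy; a minimal sketch of its shape (not our actual code, and detecting the deadlock via SQL Server's error code 1205 is an assumption about how your stack surfaces the error):

def retryOnDeadlock( int maxAttempts = 3, Closure work ) {
    int attempt = 0
    while( true ) {
        try {
            return work()
        } catch( Exception e ) {
            // Walk the cause chain looking for the underlying java.sql.SQLException
            Throwable t = e
            while( t != null && !(t instanceof java.sql.SQLException) ) {
                t = t.cause
            }
            // 1205 is SQL Server's "chosen as deadlock victim" error code (assumption)
            boolean deadlock = (t instanceof java.sql.SQLException) && t.errorCode == 1205
            if( !deadlock || ++attempt >= maxAttempts ) {
                throw e
            }
        }
    }
}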

Hmmmm… let's take a look at what my database connections look like. Generate 15 reports and I see, for a while, just 15 connections. Deadlocks happen … now it's 17 … 25 … 42 … 55 … 65 … 75. Hit the max number of connections I've configured. Back to reviewing the code, walking through with a debugger, etc., etc. Did I close the transaction? Did I close the session? Yes to all those questions. The session was definitely being disconnected and closed (it's worth noting here that the session was being created via Spring and an explicit call, because the report generation occurs in a background thread, outside of any other session/transaction boundaries).

I ended up finding several things that helped, namely this, this, and … this. The short of it is that Spring, in their infinite wisdom, changed the default of the hibernate property hibernate.connection.release_mode from "auto" to "on_close". The hibernate folks don't recommend using "on_close", which closes & releases a connection only when a session is closed. What "auto" is supposed to do is pick either "after_statement" or "after_transaction" (something like that, depending upon the type of database you have, distributed or not). They don't have "auto" pick "on_close" … but the Spring folks apparently like "on_close". I added the setting to my configuration file, and with either "after_statement" or "after_transaction", my connections were no longer being left in an open & active state. Hurrah.
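
The fix itself is a one-line hibernate property, wherever your hibernate properties are configured:

hibernate.connection.release_mode=after_transaction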

Of course, I still don't understand why even on_close wasn't working, because I really am explicitly closing the session. In any case, I'm now set to use "after_transaction" (I figured after_statement was a bit too aggressive for this change) and all is good with my world, at least. Connections open, do their work, and even in the case of a deadlock in the database, the connections properly close and are returned to the pool. No more exhausted connection pools.
