30 March 2009

Apache Camel: open source integration framework

I'm currently working on a project where we are looking at creating an integration layer for external applications to connect to our back-end applications. In our case, one of the back-end applications is Hippo CMS 7's repository.

I've been reading up on ESB's like Apache ServiceMix and Synapse, but even though both projects look very interesting, they actually are a bit too much for what I want to do. There was one project though that seems to be exactly what I want: Apache Camel.

About Apache Camel

Apache Camel is an open source Java framework that focuses on making integration easier. One of the great things is that Camel comes with a lot of default components and connectors.
Even though I was quite new to the integration concept, I was able to get my first Camel project up and running within 30 minutes or so, which I think is quite fast. All you need is a bit of Java/Spring knowledge to get going.

The basic concepts

While using an integration framework like Camel, you will have to keep four key terms in mind:

  • Endpoint: where the message comes in or leaves the integration layer
  • Route: how a message goes from endpoint A to endpoint B
  • Filter: the chained components that handle a message on its way from endpoint A to endpoint B. For instance, the content of the message might need to be transformed from SOAP to Atom.
  • Pipe: the way the message travels from endpoint A through filters to endpoint B
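
To make the four terms a bit more concrete, here is a minimal sketch in plain Java. It does not use any Camel API: the class name PipelineSketch, the nested Route class and its filter and send methods are all made up for this illustration. It only shows the idea of a message travelling from one endpoint through a chain of filters to another endpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/**
 * Illustrative only: a tiny pipes-and-filters model in plain Java.
 * None of these names are Camel API; they are hypothetical.
 */
public class PipelineSketch {

    /** A route: an ordered chain of filters between two endpoints. */
    public static class Route {
        private final List<Function<String, String>> filters = new ArrayList<>();

        /** Append a filter; messages pass through filters in the order added. */
        public Route filter(Function<String, String> step) {
            filters.add(step);
            return this;
        }

        /** The "pipe": hand the message from one filter to the next. */
        public String send(String message) {
            String current = message;
            for (Function<String, String> step : filters) {
                current = step.apply(current);
            }
            return current;
        }
    }

    public static void main(String[] args) {
        Route route = new Route()
                .filter(String::trim)                           // clean up the incoming message
                .filter(body -> "<entry>" + body + "</entry>"); // transform it for the target endpoint

        // Endpoint A delivers a raw title; endpoint B receives the transformed message.
        System.out.println(route.send("  Apache Camel  ")); // prints <entry>Apache Camel</entry>
    }
}
```

In Camel itself you would not write this plumbing; you only declare the endpoints and the steps in between, and the framework moves the message along.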

One of the things I want to use Camel for is converting RSS feed entries into JCR nodes. If I were to draw an endpoint diagram describing my route, it would look something like the image below.


With Camel, the endpoints and routes can be configured in a few lines of Java code or with Spring XML configuration. I started out with the Spring XML configuration and it was actually quite easy to get going. Here is an example where I poll my own RSS feed and store the items into a mock 'feeds' object.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-2.5.xsd
http://camel.apache.org/schema/spring
http://camel.apache.org/schema/spring/camel-spring.xsd">

  <camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
      <from uri="rss://http://blog.jeroenreijn.com/feeds/posts/default?alt=rss" />
      <to uri="mock:feeds"/>
    </route>
  </camelContext>

</beans>

As you can see, that's just a couple of lines of configuration. It's really that simple to do things in Camel. Of course this configuration does not end up in a JCR repository, but as an example I think it's quite easy to grasp. For those of you who want to play around with Camel as well, I'll try to explain all the steps I took to get a working web application example from here on. As I'm using Maven 2 for building my projects, you should be able to reproduce my setup quite easily.

Setting up your maven project

First off, we'll start by adding the Camel dependencies to our Maven project descriptor (pom.xml).
<dependencies>
  <dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-core</artifactId>
    <version>${camel-version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-spring</artifactId>
    <version>${camel-version}</version>
  </dependency>
  <dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-core</artifactId>
    <version>${spring-version}</version>
  </dependency>
  <dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-web</artifactId>
    <version>${spring-version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-rss</artifactId>
    <version>${camel-version}</version>
  </dependency>
</dependencies>
As you can see, I explicitly added the camel-rss component, so that my Camel application knows how to handle RSS feeds. Camel does not have its own RSS parser, but uses Rome in the background for handling the RSS feeds. The Camel project is set up in such a way that you can include any component you want by adding the needed component dependency to your pom.xml. If you're thinking about using Camel, make sure you check out the components page, which shows you all of the currently available components.

Camel uses Spring, so we need to add the Spring ContextLoaderListener to the local web.xml in src/main/webapp/WEB-INF/.
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/j2ee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee
http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"
version="2.4">

  <listener>
    <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
  </listener>
</web-app>
The last step in our process is defining our endpoints. In my case I chose to use the Spring XML configuration for defining my endpoints.

Add a file called applicationContext.xml to your src/main/webapp/WEB-INF/ folder.
Once the file is created you should be able to define your routes like this:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-2.5.xsd
http://camel.apache.org/schema/spring
http://camel.apache.org/schema/spring/camel-spring.xsd">

  <camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
      <from uri="rss://http://blog.jeroenreijn.com/feeds/posts/default?alt=rss" />
      <to uri="mock:feeds"/>
    </route>
  </camelContext>

</beans>
In this example I'm using my own RSS feed, but you can of course use any feed url you like.
For testing purposes you can add a log4j.properties file in src/main/resources/, so you can see the output of the Camel RSS component in your console. Here is the configuration I used while writing this blog post.


# Logging properties used for testing from Eclipse; we want to see debug output on the console.
log4j.rootLogger=INFO, out

log4j.logger.org.apache.camel=DEBUG

# Uncomment the following line to set the Spring log level explicitly
# log4j.logger.org.springframework=INFO

# Console appender used by the root logger
log4j.appender.out=org.apache.log4j.ConsoleAppender
log4j.appender.out.layout=org.apache.log4j.PatternLayout
log4j.appender.out.layout.ConversionPattern=[%30.30t] %-30.30c{1} %-5p %m%n




Well, that's it. Now the only thing you need to do is fire up a servlet container such as Jetty and see what's going on in the console.

$ mvn jetty:run

If Jetty is running and everything is set up correctly, you should see some debug information scroll by that looks like this:


SyndFeedImpl.author=noreply@blogger.com (Jeroen Reijn)
SyndFeedImpl.authors=[]
SyndFeedImpl.title=Jeroen Reijn
SyndFeedImpl.description=
SyndFeedImpl.feedType=rss_2.0
SyndFeedImpl.encoding=null
SyndFeedImpl.entries[0].contributors=[]


As you can see, the RSS feed is parsed and converted into a SyndFeed object.
From there on you can make use of this object and perform any operation on it.

I must admit that while playing around with Camel and RSS feeds, I noticed that the RSS (and Atom) component did not handle extra request parameters correctly, so I added a patch in the Camel JIRA, hoping it will be included in the next release of Camel.
If you have issues with the RSS component and request parameters, you might want to try to build the Camel SVN trunk and apply my patch (CAMEL-1496).
This is only necessary if you want to parse a feed that has, for instance, a unique id added to the feed URL as a request parameter.

Well, that's it! This post will get a follow-up, where I will show you how to use Camel to actually store the RSS feed entries into a JCR repository.

Here are a couple of good articles to read before starting with Camel:

This blog post was inspired by an article over at Gridshore, where Jettro wrote a post on using Spring Integration as an integration framework. Since I'm pretty much Apache-minded, I have been looking around for other open source integration frameworks within the ASF, which brought me to Apache Camel.

24 March 2009

Attended the Maven and Lucene Meetup

Since I'm not able to attend the actual ApacheCon conference days, I decided to join some of the pre-conference meetups.

On Monday evening I joined Jasha and some other fellow Hippos at the Maven meetup, where I was hoping to hear some new and interesting stuff. Maven has been one of my default development tools for more than 3 years. This was the first time I actually met other users and developers from the Maven community. I think there were around 40 people at the meetup and some interesting topics came along.

Carlos Sanchez talked about the effort they're putting into the "Eclipse IAM" project over at the Eclipse Foundation. I just recently started using the m2eclipse plugin in my Eclipse IDE, which seems to suit my needs, but the Eclipse IAM plugin looked really interesting. Towards the end of the Maven meetup Jasha and I left for a chat with Grzegorz Kossakowski. It was good seeing you again Grzegorz!

Tuesday evening was packed with interesting meetups, but unfortunately they were all at the same time. In the end I went to the Lucene Meetup, where I stayed until 21.30. It was good to see so many people (60+) at the meetup, and after an introduction round it was nice to find out that other open source content management vendors were also present.

I was especially interested in the talks about Solr. Most of the attendees were Solr users as far as I could see. After some presentations the attendees started asking questions and talking about the number of documents in their Lucene/Solr indexes, and I was a bit blown away by the number of documents some of them had in their Lucene index. It ranged from 300.000 to tens of millions, which is really a lot. From a content management perspective, I really wondered what kind of content it was, and I would love to see that amount of content in our CMS.

Well, that was my ApacheCon for this year. I'm curious to see where ApacheCon Europe will be held next year. Hope to see you all next year...

22 March 2009

This week: ApacheCon Europe 2009

This week is ApacheCon Europe week. I'm afraid I won't be able to attend any sessions during the actual conference, but I am going to attend some of the meetup sessions on Monday and Tuesday evening.
For those of you that are attending the conference, I can say that the program looks really promising and has a lot to offer.

If you're attending the ApacheCon on Thursday, be sure to check out Niels's talk on "Documentation: get it right!"

Using Daemon modules with Hippo CMS 7

Recently I was working on a new Hippo CMS 7 based project, where I was in need of a repository component that could run in the background and perform some scheduled tasks.

While talking to some colleagues about what I had to do, they pointed me to a built-in solution for adding repository components, which are initialized at startup.

It was actually very simple to implement this feature, so I'll try to describe how you can achieve the same solution in some very small steps.

The first thing you will need to do is create a Java class that implements the DaemonModule interface. As an example I've created the BackgroundModule as shown below.

package com.example.repository;

import javax.jcr.RepositoryException;
import javax.jcr.Session;

import org.hippoecm.repository.ext.DaemonModule;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

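public class BackgroundModule implements DaemonModule {

  private static final Logger log = LoggerFactory.getLogger(BackgroundModule.class);

  // Session handed over by the repository; kept per instance rather than static
  private Session session;

  // Called by the repository at startup with an authorized JCR session
  public void initialize(Session session) throws RepositoryException {
    this.session = session;
    log.info("BackgroundModule started");
  }

  // Called when the repository shuts down; release the session if we got one
  public void shutdown() {
    if (session != null) {
      session.logout();
    }
  }

}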



You might wonder how the repository knows about these daemon modules. The trick is that the repository goes through all 'MANIFEST.MF' files it can find on the classpath. If a MANIFEST.MF file contains an entry for the property 'Hippo-Modules', the classes listed there are added to the list of available modules. Once it has found all modules, the repository initializes each of them and passes in an authorized JCR session, so you will be able to work with all information inside the repository.
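
To give an idea of what such a manifest lookup could look like, here is a rough sketch using only the JDK. This is not the actual Hippo repository code: the class ManifestModuleScan and its methods are hypothetical, and com.example.OtherModule is a made-up second module. The sketch assumes the 'Hippo-Modules' value is a list of class names separated by whitespace.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.Manifest;

/** Hypothetical sketch of a manifest scan; not the Hippo implementation. */
public class ManifestModuleScan {

    /** Read the whitespace-separated class names from the Hippo-Modules entry. */
    public static List<String> moduleNames(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue("Hippo-Modules");
        if (value == null || value.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(value.trim().split("\\s+"));
    }

    /** Parse manifest text directly; handy for trying the lookup without a jar. */
    public static List<String> parse(String manifestText) {
        byte[] bytes = manifestText.getBytes(StandardCharsets.UTF_8);
        try {
            return moduleNames(new Manifest(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Collect module names from every MANIFEST.MF visible to the class loader. */
    public static List<String> scan(ClassLoader loader) throws IOException {
        List<String> modules = new ArrayList<>();
        Enumeration<URL> manifests = loader.getResources("META-INF/MANIFEST.MF");
        while (manifests.hasMoreElements()) {
            try (InputStream in = manifests.nextElement().openStream()) {
                modules.addAll(moduleNames(new Manifest(in)));
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        String text = "Manifest-Version: 1.0\r\n"
                + "Hippo-Modules: com.example.repository.BackgroundModule com.example.OtherModule\r\n"
                + "\r\n";
        // prints [com.example.repository.BackgroundModule, com.example.OtherModule]
        System.out.println(parse(text));
    }
}
```

A real implementation would then load each class by name and call initialize on it; that part is left out here.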

I always use Maven 2 while working with CMS 7. Maven 2 has some useful utilities and it can help you out with adding the correct manifest entry. In my pom.xml I added some configuration for the maven-jar-plugin that adds my module to the manifest.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifest>
        <addDefaultImplementationEntries>true</addDefaultImplementationEntries>
      </manifest>
      <manifestEntries>
        <Hippo-Modules>com.example.repository.BackgroundModule</Hippo-Modules>
      </manifestEntries>
    </archive>
  </configuration>
</plugin>


If you need to add more than one module, you can do so by separating the module class names with a space.

For the project I was doing, I also made use of Quartz triggers, so my module would execute once in a while instead of just after initialization of the repository.
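
The Quartz setup itself is not shown here. As a stand-in, the sketch below uses the JDK's ScheduledExecutorService instead of Quartz to show the general shape of a task that keeps running periodically after start-up. The class PeriodicTaskSketch and its methods are hypothetical; in a daemon module, start would be called from initialize() and stop from shutdown().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: periodic execution with the JDK scheduler, not Quartz. */
public class PeriodicTaskSketch {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Start running the task repeatedly at a fixed rate. */
    public void start(Runnable task, long periodMillis) {
        scheduler.scheduleAtFixedRate(task, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stop the scheduler and cancel any pending runs. */
    public void stop() {
        scheduler.shutdownNow();
    }

    /**
     * Run a counting task until it has fired the given number of times,
     * then stop. Returns true if that happened before the timeout.
     */
    public static boolean runTimes(int runs, long periodMillis, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(runs);
        PeriodicTaskSketch sketch = new PeriodicTaskSketch();
        sketch.start(latch::countDown, periodMillis);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            sketch.stop();
        }
    }

    public static void main(String[] args) {
        // Fire the task three times, 50 ms apart, waiting at most 5 seconds.
        System.out.println(runTimes(3, 50, 5000));
    }
}
```

Quartz adds things this sketch lacks, such as cron-style triggers, so treat it only as an outline of where the scheduling hooks into the module lifecycle.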

The concept of these modules is quite powerful, so I hope this can help you to get started with writing your own Daemon modules.

Update

The above article describes the situation back in 2009. With the recent release of Hippo CMS 7.8 there is a slightly different way of creating these modules. For more information see the repository managed components page in the hippo documentation.

21 March 2009

Welcome to my new Blog

Well all my content has been migrated to Blogger and the default skin is gone. Goal for tomorrow is writing a new blog post on Daemon modules for Hippo Repository.