04 February 2010

JBoss ModeShape: A federating JCR repository

Some interesting things are happening in the JCR community. Apache Jackrabbit 2.0.0 is out (with JCR 2.0 support), and an interesting project called JBoss ModeShape is close to its final 1.0 release. ModeShape recently came to my attention and it looks promising. In this post I will give a short introduction to ModeShape and its features.

What's ModeShape?

ModeShape is a Java Content Repository (JCR) implementation which will support both JSR-170 (JCR 1.0) and JSR-283 (JCR 2.0). It's not trying to be just another isolated content repository, but a repository with a strong focus on content federation. In other words: ModeShape's main goal is to provide a single JCR interface for accessing and searching content coming from different back-end systems. These systems can even be of different kinds. Think of a ModeShape repository containing information from a relational database, a file system and perhaps even another Java content repository, such as Hippo CMS 7's content repository. You configure these sources of information with the help of ModeShape's connector framework.
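To make the idea of federation a bit more concrete, here is a minimal sketch in plain Java. It is not ModeShape code and does not use the JCR API: the FederationSketch class, the ContentSource interface and the mount points /files and /db are all made up for illustration. It only shows the general principle of one tree whose branches are served by different back ends.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only: none of these types belong to ModeShape or javax.jcr.
class FederationSketch {

    /** Stand-in for a back-end system that can look up content by path. */
    interface ContentSource {
        Optional<String> read(String path);
    }

    /** Maps a mount point in the unified tree to the source that serves it. */
    private final Map<String, ContentSource> mounts = new LinkedHashMap<>();

    void mount(String mountPoint, ContentSource source) {
        mounts.put(mountPoint, source);
    }

    /** Finds the source mounted at a prefix of the path and delegates the rest of the path to it. */
    Optional<String> read(String path) {
        for (Map.Entry<String, ContentSource> entry : mounts.entrySet()) {
            String prefix = entry.getKey();
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                String relative = path.substring(prefix.length());
                return entry.getValue().read(relative.isEmpty() ? "/" : relative);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Two fake back ends: one playing a file system, one playing a database.
        Map<String, String> files = Map.of("/docs/readme.txt", "hello from disk");
        Map<String, String> rows = Map.of("/customers/42", "hello from the database");

        FederationSketch repository = new FederationSketch();
        repository.mount("/files", path -> Optional.ofNullable(files.get(path)));
        repository.mount("/db", path -> Optional.ofNullable(rows.get(path)));

        // The caller uses one interface and never sees which back end answered.
        System.out.println(repository.read("/files/docs/readme.txt").orElse("not found"));
        System.out.println(repository.read("/db/customers/42").orElse("not found"));
    }
}
```

A real federating repository also has to deal with searching, writes, caching and conflicting content, none of which this sketch attempts.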

Connectors

One of ModeShape's key concepts is the connector. A connector allows you to connect to a certain type of back-end system and transparently expose its information inside the ModeShape repository. In the current 1.0.0 beta release there are already a number of out-of-the-box connectors available:


  • In-Memory Connector
  • File System Connector
  • JPA Connector
  • Federation Connector
  • Subversion Connector
  • JBoss Cache Connector
  • Infinispan Connector
  • JDBC Metadata Connector 

That's already quite a few, but for the upcoming release there are also plans to expand the set, for instance with a JCR connector. I find that one particularly interesting, because it would allow you to expose other JCR implementations, like Hippo CMS 7 (based on Apache Jackrabbit), in combination with other systems through one JCR interface.

There are many other content solutions out there, so if you can't find a connector that suits your needs, you can of course write one yourself and perhaps donate it to the ModeShape project.

Sequencers

Another interesting ModeShape feature is the concept of sequencers. A sequencer extracts additional information from an item inside the repository and stores that extracted information back in the repository. ModeShape has quite a few sequencers out of the box:


  • Compact Node Type (CND) Sequencer
  • XML Document Sequencer
  • ZIP File Sequencer
  • Microsoft Office Document Sequencer
  • Java Source File Sequencer
  • Java Class File Sequencer
  • Image Sequencer
  • MP3 Sequencer
  • DDL File Sequencer
  • Text Sequencers

The example below configures the image sequencer, which gathers information from certain types of images stored inside the repository. The ImageMetadataSequencer class is used here to extract metadata such as size and dimensions from files that have one of the specified extensions. The extracted information is stored elsewhere in the repository, under /images.

JcrConfiguration config = ...
config.sequencer("Image Sequencer")
      .usingClass("org.modeshape.sequencer.image.ImageMetadataSequencer")
      .loadedFromClasspath()
      .setDescription("Sequences image files to extract the characteristics of the image")
      // Input: the jcr:data property of jcr:content below any node named like an image file
      .sequencingFrom("//(*.(jpg|jpeg|gif|bmp|psd)[*])/jcr:content[@jcr:data]")
      // Output: $1 refers to the first parenthesized group of the input expression
      .andOutputtingTo("/images/$1");
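
To get a feel for how the input expression and the $1 in the output path relate, the sketch below approximates the mapping with a plain Java regular expression. This is my own rough translation and not how ModeShape evaluates path expressions: the SequencerPathSketch class and its outputPathFor method are made up, the regex only handles node paths (the [@jcr:data] property check is left out), and extensions are matched case-sensitively.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: ModeShape has its own path expression parser.
class SequencerPathSketch {

    // Rough regex stand-in for: //(*.(jpg|jpeg|gif|bmp|psd)[*])/jcr:content
    // Group 1 captures the image node name, including an optional [n] index.
    private static final Pattern INPUT = Pattern.compile(
            "^(?:/[^/]+)*/([^/]+\\.(?:jpg|jpeg|gif|bmp|psd)(?:\\[\\d+\\])?)/jcr:content$");

    /** Returns the output path for a matching node path, or empty if the path would not be sequenced. */
    static Optional<String> outputPathFor(String nodePath) {
        Matcher matcher = INPUT.matcher(nodePath);
        return matcher.matches()
                ? Optional.of("/images/" + matcher.group(1)) // "/images/$1"
                : Optional.empty();
    }

    public static void main(String[] args) {
        // prints "/images/beach.jpg"
        System.out.println(outputPathFor("/photos/2010/beach.jpg/jcr:content").orElse("not sequenced"));
        // prints "not sequenced"
        System.out.println(outputPathFor("/notes/readme.txt/jcr:content").orElse("not sequenced"));
    }
}
```

So under these assumptions, an image uploaded anywhere in the repository ends up with its metadata under /images, in a node named after the file.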

Conclusion

With other mature JCR implementations out there, I think ModeShape's strongest point is its focus on content federation. Providing a single JCR interface for content stored in different systems is a great initiative, because the JCR API is quite easy to learn and to use. I see a bright future for ModeShape, since companies are sharing more and more in-house information on the web these days. I will try to keep a close eye on ModeShape and see how it evolves.