Tag Archives: HBase

Value Generator Plugin for Datanucleus HBase will become part of Datanucleus 3

As promised in my last blog post (I hope you liked it, by the way), I’ve released an enhanced version to GitHub so every one of you can download the source code and play around with it. I also included a Maven build file, so it should not be too difficult to build the version locally and resolve all the dependencies.

The GitHub version contains the following enhancements:

  • better table name
  • use of fully qualified field name instead of name argument
  • enhanced logging
  • maven build file

The second piece of news I wanted to share is that Andy Jefferson from Datanucleus suggested making this plugin code part of Datanucleus version 3. I like this idea very much, so I gave him permission to do so and created a JIRA issue (NUCHBASE-26). From Datanucleus version 3 on, everyone will be able to use the increment value generator strategy natively, without having to use a plugin. Check out the Datanucleus Blog to see what other features they are working on for version 3.
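For readers who have not seen an increment-style value generator before, here is a minimal sketch of the idea in plain Java. It is not the plugin's code and it does not touch HBase or Datanucleus at all: the class name IncrementIdSketch and the methods keyFor and next are made up for illustration, and an in-memory map stands in for whatever storage the real plugin uses. It only shows the two ideas mentioned above: one counter per fully qualified field name, and an atomic "add one and return" step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical, in-memory illustration of an increment-style ID generator.
 * Assumption: the real plugin keeps its counters in the datastore; this sketch
 * keeps them in a map so it can run without any external dependency.
 */
final class IncrementIdSketch {

    // One counter per key, created lazily on first use.
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Builds the counter key from the fully qualified class name plus the field name. */
    static String keyFor(Class<?> entity, String fieldName) {
        return entity.getName() + "." + fieldName;
    }

    /** Atomically increments the counter for the key and returns the new value (first call returns 1). */
    long next(String key) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    public static void main(String[] args) {
        IncrementIdSketch generator = new IncrementIdSketch();
        // String.class is used only as a convenient stand-in for an entity class.
        String key = keyFor(String.class, "id");
        System.out.println(key + " -> " + generator.next(key));
        System.out.println(key + " -> " + generator.next(key));
    }
}
```

Keying the counter by the fully qualified field name means two entities that both have a field called id do not share a sequence, which is presumably why the GitHub version moved away from a plain name argument.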

HBase with JPA and Spring Roo

Inspired by Matthias Wessendorf’s blog entry “Apache Hadoop HBase plays nice with JPA” I started playing around with integrating HBase into Spring Roo.

Spring Roo is a lightweight Java development tool which uses convention-over-configuration principles to provide rapid application development of Java-based enterprise software. It provides all the nice things like auto-generated getters, setters, unit tests and persistence methods, scaffolding and so on. To do this, it makes heavy use of AspectJ and puts all the auto-generated code into separate files (with the extension .aj), so you can safely change all Java files without interfering with Roo’s code generation engine. One of the best things in Roo (compared to Grails) is that everything generated is plain Java, so if at some point you don’t want to use Roo anymore, you can easily merge all the aspects into your Java files and continue without Roo (even though I wouldn’t recommend that). If you have not worked with Roo before, check out the quick tutorial in the Roo Reference Documentation.

HBase is an open source, non-relational, distributed database modeled after Google’s BigTable and written in Java. It is developed as part of the Apache Software Foundation’s Hadoop project, providing BigTable-like capabilities for Hadoop. That makes it a good alternative for all folks who do not want to host their application on the Google App Engine.

The installation of Hadoop and HBase is relatively straightforward and well documented in the HBase wiki. According to the documentation there is also a way to set it up using Windows and Cygwin, but I haven’t tried it. I did my test installation in a VMware virtual machine running Ubuntu Server 10.04 LTS.

The Datanucleus guys are offering a JPA and JDO integration for HBase and many other databases under the Apache 2 open source license. Even though the HBase plugin from Datanucleus still has some limitations (like no auto-generated IDs), you can work around those restrictions either by using JPA in a slightly different way or by writing a simple plugin for one of the dozens of plugin points offered by Datanucleus. In my next blog post I’ll show you how to auto-generate IDs using the JPA annotation @GeneratedValue by writing your own simple Datanucleus plugin.
