Monday, 31 January 2011

Storing Java objects with db4o

I was recently looking for a database that could hold native Java objects in a very easy-to-use way. While reading up on NoSQL databases I had come across a few mentions of the Versant db4o database, so I thought this was a good time to learn more about it.

The idea behind this kind of database is to store objects exactly as they exist in your application. There is no need for an additional relational layer, which often maps the fields of a single Java object to many columns in many tables of a relational database. This approach has some immediate advantages:

- no need to install (or use) an RDBMS
- faster access to the database, because the relational layer is bypassed

Speed is also a strong point because, under the hood, the objects are stored as graphs, which allow very efficient algorithms for reading, writing and searching.

The database is available for two platforms: Java and .NET. I started by downloading the Java API and the binaries of the database, version 7.12. The archive is around 48MB and, after extracting it, we get directories containing the sources, the Object Manager Enterprise (a kind of administrative tool), and documentation and tutorials on how to install and use the database.

Installation is very easy: all you have to do is add the library built for your installed Java version to the classpath of your project (I used Eclipse to create a new project). I use Java 6, so I chose the db4o-7.12.156.14667-all-java5.jar file.

Storing objects is simple and intuitive. First we create an ObjectContainer, then we use its methods, which resemble those used with relational databases (I assume many of us are familiar with the JDBC methods). For example, to store an Athlete object in the database we write (in an over-simplified way):

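import com.db4o.Db4oEmbedded;
import com.db4o.ObjectContainer;

// Open (or create) the database file.
ObjectContainer db = Db4oEmbedded.openFile("db.dbo");
try {
    Athlete athlete = new Athlete("Haile Gebrselassie", "Berlin Marathon", "2:03:59");
    db.store(athlete);   // persist the object as it is
    db.commit();
} catch (Exception e) {
    e.printStackTrace();
    db.rollback();       // undo the uncommitted changes
} finally {
    db.close();
}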
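
The Athlete class itself is not shown in the snippet above. As a minimal sketch, assuming it is a plain Java class with three String fields (the field names are my assumption, inferred from the constructor call and the getters used later in this post), it could look like this:

```java
// Hypothetical Athlete class: a plain Java object with no db4o-specific code.
// The field names are assumptions; only the constructor shape and the getter
// names are taken from the snippets in this post.
public class Athlete {
    private String name;
    private String race;
    private String time;

    public Athlete(String name, String race, String time) {
        this.name = name;
        this.race = race;
        this.time = time;
    }

    public String getName() { return name; }
    public String getRace() { return race; }
    public String getTime() { return time; }

    @Override
    public String toString() {
        return name + " / " + race + " / " + time;
    }

    // Small demo so the class can be run on its own.
    public static void main(String[] args) {
        Athlete athlete = new Athlete("Haile Gebrselassie", "Berlin Marathon", "2:03:59");
        System.out.println(athlete);
    }
}
```

Nothing in the class refers to the database, which is the point of storing objects as they are.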


To search for information in the database, we need queries. Here, a query works over the instances of one kind of object. Query By Example takes an input template and returns all the objects that match every non-default field of the template. Native Queries are a more general type of query and are in fact the recommended way to search. The lowest-level type of query is the so-called SODA query (Simple Object Data Access), which works directly with the nodes of the database graph.

For example, to retrieve all Athlete objects from the database using Query By Example, we first define a prototype, then pass it to the query and read the result:
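
To give a feeling for what a native query is, here is a rough sketch of the idea in plain Java: the search condition is ordinary code that is run against each candidate object. This is not db4o code; the `select` helper, the class name and the sample data are made up for the illustration, and `java.util.function.Predicate` only stands in for the role that a query object plays in the database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class NativeQueryIdea {
    // Minimal stand-in for the Athlete class (fields are assumptions).
    static class Athlete {
        final String name;
        final String race;

        Athlete(String name, String race) {
            this.name = name;
            this.race = race;
        }
    }

    // Hypothetical helper: returns every candidate accepted by the condition.
    static List<Athlete> select(List<Athlete> candidates, Predicate<Athlete> condition) {
        List<Athlete> result = new ArrayList<>();
        for (Athlete candidate : candidates) {
            if (condition.test(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Athlete> all = Arrays.asList(
                new Athlete("Haile Gebrselassie", "Berlin Marathon"),
                new Athlete("Sample Runner", "Sample City Marathon")); // made-up record

        // The condition is plain Java, so a typo in a field name is a compile error.
        List<Athlete> berlin = select(all, a -> a.race.equals("Berlin Marathon"));
        for (Athlete a : berlin) {
            System.out.println(a.name);
        }
    }
}
```

Because the condition is written in the host language, the compiler and the IDE can check and refactor it like any other code.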

// A template with only default (null) fields matches every stored Athlete.
Athlete proto = new Athlete(null, null, null);
List<Athlete> res = db.queryByExample(proto);

System.out.println("Name \t\t Race \t\t Time");
for (Athlete crtAthlete : res) {
    System.out.println(crtAthlete.getName() + "\t\t" + crtAthlete.getRace() + "\t\t" + crtAthlete.getTime());
}
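
The matching rule of Query By Example ("match all non-default fields of the template") can be illustrated with a few lines of reflection. This is only my sketch of the rule, not how db4o implements it: the `matches` and `isDefault` helpers are hypothetical, and I treat null, zero and false as the default values.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class QueryByExampleIdea {
    // Minimal stand-in for the Athlete class (fields are assumptions).
    static class Athlete {
        String name;
        String race;
        String time;

        Athlete(String name, String race, String time) {
            this.name = name;
            this.race = race;
            this.time = time;
        }
    }

    // A field value counts as "default" if it is null, a numeric zero or false.
    static boolean isDefault(Object value) {
        if (value == null) return true;
        if (value instanceof Number) return ((Number) value).doubleValue() == 0.0;
        if (value instanceof Boolean) return !((Boolean) value);
        return false;
    }

    // True if the candidate equals the template on every non-default template field.
    static boolean matches(Object template, Object candidate) {
        if (!template.getClass().isInstance(candidate)) return false;
        try {
            for (Field field : template.getClass().getDeclaredFields()) {
                if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) continue;
                field.setAccessible(true);
                Object wanted = field.get(template);
                if (isDefault(wanted)) continue;          // field not constrained
                if (!wanted.equals(field.get(candidate))) return false;
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Athlete stored = new Athlete("Haile Gebrselassie", "Berlin Marathon", "2:03:59");
        System.out.println(matches(new Athlete(null, null, null), stored));
        System.out.println(matches(new Athlete(null, "Berlin Marathon", null), stored));
        System.out.println(matches(new Athlete(null, "Other Race", null), stored));
    }
}
```

This also shows the known limitation of the approach: a template cannot ask for a field that is actually null or zero, because such a value means "do not care".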


I wanted to compare the speed of the db4o database with that of a true RDBMS (I have MySQL 5 installed on my machine). For this, I defined a table called athlets in MySQL, with the following schema:

create table athlets(
id int unsigned not null auto_increment primary key,
name varchar(50),
race varchar(50),
besttime varchar(50)
) engine=InnoDB;


Then, for 1000 records with the name, race and besttime columns filled, the insertion time was 2109 milliseconds.

For 1000 objects of type Athlete, with the same fields filled, the total insertion time was 106 milliseconds, so in this test db4o was around 20 times faster than MySQL.
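
For reference, this is roughly how such a measurement and the resulting ratio can be computed. It is a sketch, not the exact test code I ran: the `elapsedMillis` and `speedup` helpers are hypothetical names, and the workload in `main` is only a placeholder for the 1000 inserts.

```java
public class InsertTiming {
    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // How many times longer the slow run took compared to the fast one.
    static double speedup(long slowMillis, long fastMillis) {
        return (double) slowMillis / fastMillis;
    }

    public static void main(String[] args) {
        long elapsed = elapsedMillis(() -> {
            // Placeholder workload; the real test would store 1000 records here.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append(i);
            }
        });
        System.out.println("Placeholder workload took " + elapsed + " ms");

        // The two timings reported in this post: 2109 ms (MySQL) and 106 ms (db4o).
        System.out.printf("MySQL / db4o insert time ratio: %.1f%n", speedup(2109, 106));
    }
}
```

A single run like this is only a rough indication. The result depends on things such as JVM warm-up and on whether each insert is committed separately, so the ratio should not be read as a general benchmark.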

To manage the records in the database, Versant provides an Eclipse plugin called Object Manager Enterprise (OME), which is very easy to install and lets you browse and query the records in the database.

I found this database very interesting and useful when you need to store native Java objects. It is remarkably fast and also very easy to use. The queries are checked at compile time, including the types of their parameters. The database is essentially schemaless: objects can change their structure and are still persisted as they are, with no changes needed in the database layer (unlike an RDBMS, where a change in the model implies a change in the database structure and, on top of that, a change in the SQL queries). It is embedded directly in the application and loaded in the same process, so there is no need for an extra RDBMS installed locally or somewhere on the network. On top of this, Versant db4o supports ACID transactions.

I also found some cons. One of them, from my point of view, is that for commercial applications, or if you want support from the database provider, you have to pay for it. The licensing scheme is based on the number of processor cores, so for a powerful server the price can escalate quickly.
