Understanding MongoDB Querying in Grails and GORM

May 28, 2013

In this article, I will talk about how Grails, GORM in particular, maps various types of queries into MongoDB queries. Understanding the query mapping is extremely important in order to help understand and identify how performance bottlenecks might incur. This is especially true in the non-relational aspect of MongoDB where it is entirely possible that Grails could end up making several roundtrips to MongoDB to fill out the entire object model. If you are unfamiliar with MongoDB within Grails and GORM, read my previous post.

One of the nice features of Hibernate is that it can log all SQL generated queries to the log or console. This is helpful in analyzing how queries are generated, joins are mapped, etc. However, this support does not exist yet in the MongoDB plugin. As a result, there is no easy way to identify those queries. Instead, we can use external tools in MongoDB, such as the helpful mongosniff tool. The tool should reside in the same bin directory as the mongodb server application.

There are a few ways to actually run the tool. The first way is by sniffing the actual network for packets. This works great in Unix-based environments (Linux, Mac OS, etc). This example may not work well on Windows (see the next example below instead)

Start the MongoDB server in Terminal 1 (set the proper database path as necessary)

#> mongodb --dbpath /data/db

At the end of the output that gets generated will be the associated port, such as the default 27017:


[initandlisten] waiting for connections on port 27017

Now that the MongoDB server is running, startup the sniff tool in a second terminal.

#> mongosniff --source NET lo0

In this example, I am sniffing network traffic on the loopback address as I am running the server and Grails on the same development machine. If you are running the server on its own server, make sure to specify the proper network adapter.

The second way to run MongoDB is by writing queries and commands to a diagnostic log. The log can contain read-only queries, read and write commands, or none (which is the default). The sniff tool can then read that diagnostic log. In this example, I am only interested in the read queries (hence option 2) (see the docs on the diaglog option for more information)

Start the server in Terminal 1 (make sure the dbpath is set to the proper location)

#> mongodb --dbpath /data/db --diaglog 2

This will generate a file named diaglog.XXXXX in the database directory (i.e.: /data/db in the example above). Take note of that file in order to pass to the sniff tool. You may have to do a ls or dir on the database directory to find the actual filename.

Start the sniff tool in a second terminal (change the filename accordingly)

#> mongosniff --source DIAGLOG /data/db/diaglog.XXXXX

This will now dump the contents of the file including all the generated queries. Note that you may need to run both commands as administrator (i.e.: sudo).

To test either of these scenarios to validate they work, without using Grails, use the mongo command line application to query the database (see the Mongo documentation for examples).

Now that we know how to sniff the queries, let’s see how Grails interacts with MongoDB. In the examples I will demonstrating, I am using a data model similar to the following (note that I do not suggest modeling this use case in this exact scenario; I am purely using it to exercise various queries):

class Address {
    String street
    String city
}
 
class Role {
    boolean admin
    String type
    String description
}
 
class User {
    static embedded = [ 'address' ]
 
    String first
    String last
    Role role
    Address address
}
 
class Face {
    User user
    Integer x
    Integer y
    Integer w
    Integer h
    String description
}
 
class Photo {
    static embedded = [ 'faces' ]
 
    User user
    String description
    List<String> tags
    List<Face> faces
}
 
class Gallery {
    static embedded = [ 'photos' ]
 
    User user
    String name
    String description
    List<Photo> photos
}

Basically, a User has an external reference to a particular role and also an embedded address. A gallery contains an embedded set of photos and a photo contains an embedded set of tags and faces. The face also has an external reference to a User.

To showcase the different examples, I’m going to use the Grails console to interact and directly query the data from Groovy.

#> grails console

This command will popup the Groovy Console, but the console will be loaded with the Grails application environment. This allows direct access to GORM such as:

import com.test.*
User.list()

This should return an empty list as no data has been inserted yet. If we go to the sniff tool, we should see a line containing query such as:


127.0.0.1:52343 -->> 127.0.0.1:27017 test.user 43 bytes id:2 2
query: {} ntoreturn: 0 ntoskip: 0
127.0.0.1:27017 <<-- 127.0.0.1:52343 36 bytes id:1 1 - 2 reply n:0 cursorId: 0

The portions to note is the first line contains test.user which represents the name of the database (Grails application name typically) and the name of the table (lowercase name of the domain object class). The query: {} represents the type of query that was done. In this example, we are querying everything via the GORM list method. As such, no query was needed. The last line represents the number of returned results reply n:0. In this case, no data was returned.

Let's create some test data now so we can query real data.

import com.test.*
 
def adminRole = new Role(type:'Admin', admin:true, description:'Admin Users').save(flush:true)
def userRole = new Role(type:'User', admin:false, description:'Users').save(flush:true)
def guestRole = new Role(type:'Guest', admin:false, description:'Guest Users').save(flush:true)
 
def user = new User(first:'John', last:'Doe', role:userRole, address:new Address(street:'1234 Main St', city:'Somewhere')).save(flush:true)
def guest = new User(first:'Guest', last:'User', role:guestRole, address:new Address(street:'N/A', city:'N/A')).save(flush:true)
 
1..5.each { idx ->
    def tags = []
    for (def i = 0; i < idx; i++) {
        tags << "tags${i}"
    }
 
    new Photo(user:user, description:"Photo ${idx}", tags:tags, faces:[ new Face(user:guest, x:0, y:0, w:50, h:50, description:'none') ]).save(flush:true)
}
 
new Gallery(user:user, name:'My Photos', description:'photos', photos:[
    new Photo(user:user, description:"Gallery Photo 1", tags:[ 'nature', 'people' ], faces:[]),
    new Photo(user:user, description:"Gallery Photo 2", tags:[ 'family', 'people' ], faces:[ new Face(user:guest, x:0, y:0, w:50, h:50, description: 'none') ])
]).save(flush:true)

Running that script should insert a bunch of data to the Mongo database. This will now allow us to query the data. For example, let's start by querying a simple role by its type with the dynamic finders in GORM:

import com.test.*
Role.findByType('User')

Invoking this query and then looking at our sniff tool will show the following query:


127.0.0.1:52425 -->> 127.0.0.1:27017 test.role 58 bytes id:3d 61
query: { type: "User" } ntoreturn: -1 ntoskip: 0
127.0.0.1:27017 <<-- 127.0.0.1:52425 113 bytes id:2d 45 - 61 reply n:1 cursorId: 0 { _id: 2, admin: false, description: "Users", type: "User", version: 0 }

The MongoDB query that Grails created was { type: "User" } which correlates directly to our dynamic finder. From the result set, you will notice the _id: 2 which is the MongoDB id. Every domain model in Grails has an implicit id property that maps to the _id field in MongoDB. By default, the value is automatically assigned as a Long. Now that we know the id, we can directly fetch it:

import com.test.*
Role.get(2)

In this example, we end up getting a query of { _id: 2 }. The Role object contains no other associations. What happens if we query a specific User that has an association to Role?

import com.test.*
User.get(1)

This produces the single query { _id: 1 }. But what about the role association? Grails, by default, lazily fetches the associations when requested. This is a big benefit in MongoDB where the relations are done by Grails on the server requiring multiple invocations to MonogDB. If we change our script to User.get(1).getRole(), we now end up with two queries: one for the user and a second to the specific role. This is important to remember as the more associations you have, the more roundtrip latency you will incur. It is best to limit associations to only where they are essential.

So far, so good. Let's try some more complicated queries.

Executing the call User.findAllByFirstInList([ 'John', 'Guest' ]) to find all users named either John or Guest results in the query { first: { $in: [ "John", "Guest" ] } }. You will notice that Grails directly uses the MongoDB $in operator to map the queries. In fact, Grails does its best to natively use the built-in MongoDB mechanisms to satisfy its invocations.

This is even true with more complex operations such as User.findAllByFirstLike('J%'). The like keyword is directly part of SQL dialects, but not in MongoDB. However, MongoDB does support regular expressions which Grails maps into: { first: /^\QJ\E.*\Q\E$/ }.

Grails also maps conjunctions such User.findAllByFirstAndLastNotEqual('John', 'User') into queries { first: "John", last: { $ne: "User" } }. The or conjunction also works via User.findAllByFirstOrLast('John', 'Doe') producing { $or: [ { first: "John" }, { last: "Doe" } ] }.

Relational associations are even directly supported. The simple query: User.findAllByRole(Role.get(2)) produces the following query: { role: 2 }. You will notice that Grails automatically gets the id of the given role and passes it as the numerical reference to MongoDB.

Embedded objects may directly query properties through criteria queries.

def criteria = User.createCriteria()
criteria.list {
    address {
        eq('city', 'Somewhere')
    }
}

This results in the following native query: { address.city: "Somewhere" }. You will notice Grails uses dot-notation to directly map the query. However, this only works on embedded data. If you attempt to use a query against a relational association, you will get an UnsupportedOperationException.

// this fails
def criteria = User.createCriteria()
criteria.list {
    role {
        eq('admin', false)
    }
}

This is where it is important to properly model your data and reduce the amount of relational associations. MongoDB works best with data that has a single root element where most of the data may be embedded in the element (within reason, of course).

Using embedded data, criteria queries can even search within collections (note that there currently appears to be an issue with using findAll { … } or where with this mode, so use criteria queries directly…see https://gist.github.com/nicholashagen/5664999):

def criteria = Gallery.createCriteria()
criteria.list {
   photos {
       faces {
           between('w', 25, 75)
       }
   }
}

This produces the query { photos: { $elemMatch: { faces: { $elemMatch: { w: { $gte: 25, $lte: 75 } } } } } }. Note how Grails maps to the $elemMatch operator in MongoDB to match the sub-object collections. Also note how Grails properly maps the between operator to $gte and $lte operators.

For collections of non-objects such as List, the inList function in Grails can check for the specific existence of an item.

Photo.findAll {
   inList('tags', 'tags3')
}

This produces the MongoDB query { tags: { $in: [ "tags3" ] } } correlating to the $in operator.

The next set of examples is sorting and limiting data. Grails supports both concepts in both dynamic finders and criteria queries.

// dynamic finders
Photo.findAllByUser(User.get(2), [ sort: 'description', order: 'asc', offset: 3, max: 2 ])
 
// criteria queries
Photo.findAll {
    eq('user', User.get(2))
    order 'description', 'asc'
    firstResult(3)
    maxResults(2)
}

Both of these examples produce the MongoDB query: query: { query: { user: 2 }, orderby: { description: 1 } } ntoreturn: 2 ntoskip: 3. Notice the orderby clause and the ntoreturn and ntoskip clauses.

Finally, let's talk data aggregation. To count the number of photos for a given user, the following dynamic finder may be used: Photo.countByUser(User.get(2)). This results in the query: { count: "photo", query: { user: 2 } }. Note that it uses a count node along with a query node. The query limits the results while the count provides an aggregate count. The actual MongoDB response is { n: 5.0, ok: 1.0 } which Grails maps to 7.

In essence, Grails does all the dirty work of setting up the optimized MongoDB collections allowing you, as a developer, to work directly on the data model. Seeing the queries in MongoDB and how and when Grails performs relational association mappings is important to understanding bottlenecks later.

2 Responses to “Understanding MongoDB Querying in Grails and GORM”

  1. Great article. Much needed docs to make up for the poorly documented MongoDB plugin. I have a question about extending your example above. For example say I have this query:

    Photo.findAll {
    inList( ‘tags’, [ ‘tag1’, ‘tag2’] )
    }

    In the query above inList() operator returns all Photos that have either tag1 or tag2. However, I want a query that returns all Photos that contain both tag1 AND tag2. Fortunately there is an operator in MongoDB that does that: $all. However, I don’t know how to use it from GORM’s criteria query. Any ideas?

  2. Looking quickly at the main source code (https://github.com/SpringSource/grails-data-mapping/blob/3affa752bc9e6a26b3af5e61eef9a365ee71ac04/grails-datastore-mongo/src/main/groovy/org/grails/datastore/mapping/mongo/query/MongoQuery.java), I do not believe $all is supported at this point, but I could be wrong. I would ask on the Grails User Mailing List and then file a bug in the Grails tracking system to have that feature added.