Aggregations in MongoDB with Spring Data

Aggregations in MongoDB

MongoDB aggregation operations process data records and return computed results. They group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result. Spring Data MongoDB makes this feature easy to use from a Java application.

Given the (very uncommon) collection "factories" with the following documents:

{ "_id" : 1, "name" : "bicycle_parts", "produces" : ["wheels", "spokes"], "location" : [ 5.1045178, 51.9850405 ], country: "NL"}
{ "_id" : 2, "name" : "car_parts", "produces" : ["wheels", "engines"], "location" : [ 6.6113998, 53.2228623 ], country: "NL" }

Now we want to count, for each produced part, how many factories in the Netherlands produce it. We will use an aggregation!
We have to:
- Match all documents with {country:"NL"}.
- Unwind the "produces" array to be able to group by the different production parts.
- Group by the unwound "produces" array elements and count the documents in each group.

{ $match: { "country":"NL" } },
{ $unwind: "$produces" },
{ $group: { _id: "$produces", count: { $sum: 1 } } }

The aggregation result will look like:

{ "_id":"wheels", "count":2 }
{ "_id":"spokes", "count":1 }
{ "_id":"engines", "count":1 }
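
To make the three stages concrete, here is a minimal plain-Java sketch that mimics them in memory on the two sample documents. It does not talk to MongoDB at all; the Factory class and the countByPart method are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PipelineSketch {

    // Hypothetical in-memory stand-in for a document in the "factories" collection.
    static final class Factory {
        final String name;
        final String country;
        final List<String> produces;

        Factory(String name, String country, List<String> produces) {
            this.name = name;
            this.country = country;
            this.produces = produces;
        }
    }

    // The two sample documents from the text (location omitted, it is not used here).
    static List<Factory> sampleFactories() {
        return Arrays.asList(
                new Factory("bicycle_parts", "NL", Arrays.asList("wheels", "spokes")),
                new Factory("car_parts", "NL", Arrays.asList("wheels", "engines")));
    }

    // Mimics $match -> $unwind -> $group with a $sum of 1 per unwound element.
    static Map<String, Integer> countByPart(List<Factory> factories, String country) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Factory factory : factories) {
            if (!country.equals(factory.country)) {
                continue; // $match: skip documents from other countries
            }
            for (String part : factory.produces) { // $unwind: one entry per array element
                counts.merge(part, 1, Integer::sum); // $group: key = part, count += 1
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {wheels=2, spokes=1, engines=1}
        System.out.println(countByPart(sampleFactories(), "NL"));
    }
}
```

The sketch keeps insertion order only to make the printed output predictable; MongoDB itself does not promise any particular order for $group results.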

Aggregations With the Mongo Java Driver

We want to call this from our Java application, so let's use the Mongo Java driver API:

DBCollection factories ...
// create our pipeline operations, first with the $match
DBObject match = new BasicDBObject("$match", new BasicDBObject("country", "NL"));
// The $unwind
DBObject unwind = new BasicDBObject("$unwind", "$produces");
// Now the $group operation
DBObject groupFields = new BasicDBObject( "_id", "$produces");
groupFields.put("count", new BasicDBObject( "$sum", 1));
DBObject group = new BasicDBObject("$group", groupFields);
// run aggregation
List<DBObject> pipeline = Arrays.asList(match, unwind, group);
AggregationOutput output = factories.aggregate(pipeline);

Nice, now let's change the $match criteria to get all factories within 10 kilometres of a certain point (we use Mongo legacy coordinates here for simplicity):
// the coordinate pair is passed as a list; a bare [ ... ] literal is not valid Java
DBObject nearPoint = new BasicDBObject("$near", Arrays.asList(5.1945978, 52.9950905));
nearPoint.put("$maxDistance", 10000);
DBObject match = new BasicDBObject("$match", new BasicDBObject("location", nearPoint));

We run the new aggregation and... BANG!

IllegalArgumentException: "result undefined"... That doesn't tell us much. Now what?
(The exact Mongo error message is actually present in the aggregation result, but when the Mongo Java driver tries to construct an AggregationOutput object, it throws the IllegalArgumentException with this general error instead of passing on the error message from the database. That's why the error information gets lost here.)
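
The way the message gets lost can be illustrated with a small plain-Java sketch. This is not the driver's actual code: the class, the method names and the map-based reply are hypothetical. The only assumption taken from MongoDB itself is that a failed command reply carries an "ok" flag and an "errmsg" field.

```java
import java.util.HashMap;
import java.util.Map;

public class ErrorReportingSketch {

    // Naive wrapper: only looks for "result" and hides the server's own message.
    static Object naiveResult(Map<String, Object> commandResult) {
        Object result = commandResult.get("result");
        if (result == null) {
            throw new IllegalArgumentException("result undefined");
        }
        return result;
    }

    // Careful wrapper: checks the "ok" flag first and surfaces "errmsg" on failure.
    static Object checkedResult(Map<String, Object> commandResult) {
        Object ok = commandResult.get("ok");
        boolean succeeded = ok instanceof Number && ((Number) ok).doubleValue() == 1.0;
        if (!succeeded) {
            throw new IllegalStateException(
                    "Command execution failed: " + commandResult.get("errmsg"));
        }
        return naiveResult(commandResult);
    }

    // Hypothetical failed reply, shaped like a MongoDB command response.
    static Map<String, Object> failedReply() {
        Map<String, Object> reply = new HashMap<>();
        reply.put("ok", 0.0);
        reply.put("errmsg", "$near is not allowed inside of a $match aggregation expression");
        return reply;
    }

    public static void main(String[] args) {
        try {
            checkedResult(failedReply());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the same failed reply, naiveResult reports only "result undefined", while checkedResult reports the database's own explanation.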

Spring Data to the rescue

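// match(...) and newAggregation(...) are static factory methods of Spring Data's Aggregation class
List<AggregationOperation> aggregationOperations = new ArrayList<AggregationOperation>();
aggregationOperations.add(Aggregation.match(Criteria.where("location").near(new Point(5.1945978, 52.9950905))));
AggregationResults<AggregateFactoryResult> result = mongoTemplate.aggregate(newAggregation(aggregationOperations), "factories", AggregateFactoryResult.class);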

Now we get a Spring Data exception with an understandable message that comes directly from MongoDB:

org.springframework.dao.InvalidDataAccessApiUsageException: Command execution failed: Error [exception: $near is not allowed inside of a $match aggregation expression]

(Spring Data extracts the Mongo error message from the Mongo driver result the right way, and does it better than the aggregate function of the driver itself!)
Unfortunately, the fix is to not apply a $near clause in the $match expression (MongoDB offers a separate $geoNear pipeline stage for distance queries), but at least we know WHY our aggregation fails!

Benefits of Spring Data usage

- Typesafe AggregationResults: write your own result bean class (AggregateFactoryResult in the example) and Spring Data does the mapping.
- No typos in Mongo operation names like "$match" and "$unwind", because Spring Data provides builders for all Mongo operations.
- Understandable exceptions (although version 3 of the Mongo Java driver also introduces more understandable exceptions).
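
The result bean mentioned in the first point is never shown above, so here is a minimal sketch of what it could look like for the count-per-part aggregation. The field names are assumptions: "count" matches the field produced by $sum, and "id" relies on Spring Data's convention of mapping the "_id" field onto a property named "id".

```java
public class AggregateFactoryResult {

    // Holds the group key; assumption: "_id" is mapped onto a property named "id".
    private String id;

    // Holds the value of the "count" field computed by $sum.
    private long count;

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public long getCount() {
        return count;
    }

    public void setCount(long count) {
        this.count = count;
    }

    @Override
    public String toString() {
        return "AggregateFactoryResult{id='" + id + "', count=" + count + "}";
    }
}
```

With a class like this, calling getMappedResults() on the AggregationResults returned by mongoTemplate.aggregate should give a list of these beans, one per produced part.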