MongoDB Aggregation Pipeline
What is Aggregation?
When working with databases in real-world applications, you often need more than simple CRUD operations. You need to analyze, transform, and summarize data across documents and collections. This is where aggregation comes in.
MongoDB provides three mechanisms for aggregating data:
| Mechanism | Scope | Complexity | Use Case |
|---|---|---|---|
| Query Methods | Single collection | Low | Simple counts, distinct values |
| Aggregation Framework | Multiple collections | Medium–High | Complex transformations, joins, analytics |
| MapReduce | Multiple collections | Very High | Deprecated since MongoDB 5.0; use the Aggregation Framework instead |
This guide focuses on the Aggregation Framework, which is the recommended approach for all aggregation tasks. Simple query methods are covered briefly as a starting point.
Note: MapReduce has been deprecated since MongoDB 5.0 and should no longer be used. The db.collection.group() method was removed in MongoDB 4.2. All aggregation tasks should use the Aggregation Pipeline instead.
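As a rough illustration of how simple query-method tasks map onto the Aggregation Pipeline, the sketch below defines three pipelines as plain JavaScript arrays. The collection name `orders` and the fields `status` and `customerId` are assumptions made up for this example; they are not part of the guide's sample data.

```javascript
// Hypothetical collection "orders" with fields "status" and "customerId".

// Roughly comparable to db.orders.countDocuments({ status: "shipped" }),
// except the result arrives as a document: { shippedOrders: <n> }.
// Unlike countDocuments, $count emits nothing when no documents match.
const countShipped = [
  { $match: { status: "shipped" } },
  { $count: "shippedOrders" },
];

// Roughly comparable to db.orders.distinct("status"),
// but yields one document per distinct value instead of an array.
const distinctStatuses = [
  { $group: { _id: "$status" } },
];

// Per-customer counts: the kind of task the removed group() method
// was used for, expressed with the $group stage instead.
const ordersPerCustomer = [
  { $group: { _id: "$customerId", orderCount: { $sum: 1 } } },
];

// In mongosh, each pipeline would be passed to aggregate(), for example:
//   db.orders.aggregate(ordersPerCustomer)
```

Keeping pipelines in named constants like this is only one possible style; it makes them easy to reuse and inspect before running them against a real collection.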
How to Read This Guide
This guide is organized from simple to complex. Follow the chapters in order:
Part 1: Foundations
- 01 - Query Methods: Simple aggregation with count(), distinct(), and group()
- 02 - The Aggregation Pipeline: The core concept and how pipelines work
- 03 - Sample Data: The example dataset used throughout this guide
Part 2: Pipeline Stages
- 04 - Document Stages: Control which documents flow through ($match, $sort, $skip, $limit, $unwind, $out)
- 05 - Structure Stages: Control what each document looks like ($addFields, $project, $replaceRoot)
- 06 - Relationship Stages: Join data from other collections ($lookup)
- 07 - Aggregation Stages: Group and summarize ($group, $bucket)
Part 3: Expressions
- 08 - Expressions Overview: What expressions are and how they work
- 09 - Arithmetic Operators: $add, $subtract, $multiply, $divide, $abs
- 10 - Comparison Operators: $eq, $ne, $gt, $lt, $gte, $lte
- 11 - Boolean and Control Flow Operators: $and, $or, $not, $cond, $switch, $cmp
- 12 - Array Operators: $arrayElemAt, $concatArrays, $in, $map, $filter, $reduce, $isArray
- 13 - Accumulator Operators: $sum, $min, $max, $avg, $addToSet, $push
- 14 - Variable and Reference Expressions: $$CURRENT, $expr, $let
- 15 - Set Operators: $setDifference, $setIntersection, $setUnion, $setIsSubset
Quick Reference: All Pipeline Stages
| Stage | Purpose |
|---|---|
| $match | Filter documents (like find()) |
| $sort | Sort documents |
| $skip | Skip first n documents |
| $limit | Limit to first n documents |
| $unwind | Deconstruct arrays into separate documents |
| $out | Write results to a collection (replaces entire collection) |
| $merge | Write results to a collection (can insert, update, or replace individual documents; more flexible than $out) |
| $addFields | Add new fields to documents |
| $project | Select/rename/compute fields |
| $replaceRoot | Replace the document root |
| $lookup | Join with another collection |
| $group | Group by key and aggregate |
| $bucket | Group by value ranges |
| $count | Count documents in the stream |
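To show how several of these stages can be chained, here is a minimal sketch of one pipeline. It assumes a hypothetical `orders` collection whose documents have `status`, `customerId`, and an `items` array of `{ price, quantity }` objects, plus a hypothetical `customers` collection keyed by `_id`. None of these names come from the guide's sample data.

```javascript
// Each array element is one stage; documents flow through them in order.
const topCustomers = [
  // Keep only shipped orders (filtering early reduces later work)
  { $match: { status: "shipped" } },
  // Emit one document per element of the "items" array
  { $unwind: "$items" },
  // Sum price * quantity per customer
  {
    $group: {
      _id: "$customerId",
      revenue: { $sum: { $multiply: ["$items.price", "$items.quantity"] } },
    },
  },
  // Highest revenue first
  { $sort: { revenue: -1 } },
  // Keep the first five documents
  { $limit: 5 },
  // Attach matching customer documents as an array field "customer"
  {
    $lookup: {
      from: "customers",
      localField: "_id",
      foreignField: "_id",
      as: "customer",
    },
  },
];

// In mongosh this would be run as:
//   db.orders.aggregate(topCustomers)
```

The sketch leaves out $out and $merge because they write to a collection; if used, either one has to be the last stage of the pipeline.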
Quick Reference: All Expression Operators
| Category | Operators |
|---|---|
| Arithmetic | $add, $subtract, $multiply, $divide, $abs |
| Comparison | $eq, $ne, $gt, $gte, $lt, $lte, $cmp |
| Boolean | $and, $or, $not |
| Control Flow | $cond, $switch |
| Array | $arrayElemAt, $concatArrays, $in, $map, $filter, $reduce, $isArray |
| Accumulators | $sum, $min, $max, $avg, $addToSet, $push |
| Variables | $$CURRENT, $expr, $let |
| Set | $setDifference, $setIntersection, $setUnion, $setIsSubset |
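Expression operators are used inside stages rather than on their own. The sketch below combines one operator from several of the categories above in a single $addFields stage. The fields `price`, `quantity`, and `tags` are hypothetical, and the example assumes `tags` is an array.

```javascript
const enrichOrders = [
  {
    $addFields: {
      // Arithmetic: multiply two fields of the same document
      total: { $multiply: ["$price", "$quantity"] },
      // Control flow with a comparison: label documents by price
      tier: {
        $cond: {
          if: { $gte: ["$price", 100] },
          then: "premium",
          else: "standard",
        },
      },
      // Array: element at index 0 of "tags"
      firstTag: { $arrayElemAt: ["$tags", 0] },
      // Set: tags that are not in the fixed list ["sale", "new"]
      extraTags: { $setDifference: ["$tags", ["sale", "new"]] },
    },
  },
];

// In mongosh this would be run against a collection, for example:
//   db.orders.aggregate(enrichOrders)
```

A field path such as "$price" refers to the value of that field in the current document, while a plain string such as "premium" is treated as a literal.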