Exercise 3 — Grouping and Joining

Topics: $group, $bucket, $lookup, accumulator operators ($sum, $avg, $min, $max, $push, $addToSet)

Reading before you start: 06 - Relationship Stages, 07 - Aggregation Stages, 13 - Accumulator Operators

Database: aggregation_lab — see Setup - Database

Part A — Grouping with `$group`

Task A1

Group all employees by their role. For each role, count how many employees have that role.

Expected output format:

{ _id: "DEVELOPER", count: 6 }
{ _id: "LEAD", count: 2 }
// ...

Task A2

Group employees by department_id. For each department, compute:

employeeCount: the number of employees
avgSalary: the average salary
maxSalary: the highest salary

Task A3

Group tasks by status. For each status, compute:

taskCount: total number of tasks
totalHours: sum of all hours

Sort by totalHours descending.

Task A4

Group orders by paymentMethod. For each method, count the number of orders and collect all unique status values using $addToSet.

Task A5

Group tasks by project_id. For each project, collect all task titles into an array using $push. Also compute totalHours.

Sort by totalHours descending.

Part B — Bucketing with `$bucket`

Task B1

Bucket employees by salary into these ranges:

0 – 40,000
40,000 – 55,000
55,000 – 70,000
70,000 – 100,000

For each bucket, output the count and the employee names (using $push).

Task B2

Bucket employees by age into:

20 – 30
30 – 40
40 – 60

For each bucket, compute the average salary and count.

Part C — Joining with `$lookup`

Task C1

Start from the employees collection. Use $lookup to join each employee with their department from the departments collection.

from: "departments"
localField: "department_id"
foreignField: "_id"
as: "department"

Then use $unwind to turn the department array into a single object.

Finally, $project only the employee name, role, salary, and the department name (as departmentName).

Task C2

Start from the tasks collection. Use $lookup to join each task with its project.

from: "projects"
localField: "project_id"
foreignField: "_id"
as: "project"

Unwind the project. Then $project the task title, status, hours, and the project’s title (as projectTitle).

Task C3

Start from the orders collection. Use $lookup to join each order with its customer.

from: "customers"
localField: "customer_id"
foreignField: "_id"
as: "customer"

Unwind the customer. Project the order date, status, paymentMethod, the customer name (as customerName), and the customer loyaltyTier.

Sort by date ascending.

Part D — Combining Grouping and Joining

These tasks require you to combine $lookup with $group and other stages you’ve learned previously.

Task D1

Which department has the highest total salary cost?

Group employees by department_id, summing their salaries as totalSalary
Sort by totalSalary descending
Limit to 1

Task D2

How many tasks does each project have, and what is the project title?

Group tasks by project_id, counting as taskCount
Use $lookup to join with projects to get the project title
Unwind the project
Project projectTitle, taskCount
Sort by taskCount descending

Task D3

What is the total revenue per customer?

Unwind the items array in orders
Add a lineTotal field (price * quantity)
Group by customer_id, summing lineTotal as totalSpent
Use $lookup to join with customers
Unwind the customer
Project customerName, totalSpent
Sort by totalSpent descending

Task D4

Create a department overview that shows for each department:

Department name (from departments collection)
Number of employees
Average employee salary
Total employee salary

Start from employees, group by department_id, then use $lookup to get the department name.

Part E — Creative Challenge

Task E1

Log analysis: Group the log_entries by level (INFO, WARNING, ERROR). For each level, count the entries and collect all unique service names. Sort by count descending.

Task E2

Order analytics dashboard: Design a pipeline that answers the question: “Which products are ordered most frequently, and what is their total revenue?”

You will need to: unwind items, group by product, compute both count and total revenue, then sort.

Task E3

Cross-collection report: Create a comprehensive “team productivity report” that shows, for each employee:

Their name and role
Their department name
How many tasks are assigned to them
Total hours of their assigned tasks

This requires you to think about which collection to start from and how to join and group the data. There are multiple valid approaches — pick one and implement it.

Save the result to a collection called team_productivity_report.

Reflection:

What is the difference between $push and $addToSet in a $group stage?

After $lookup, why do you usually need $unwind?

What happens in a $lookup when there is no matching document in the foreign collection?

Why is _id: null useful in a $group stage?

Deep Thought

Explorer

Exercise 3 - Grouping and Joining

Exercise 3 — Grouping and Joining

Part A — Grouping with `$group`

Task A1

Task A2

Task A3

Task A4

Task A5

Part B — Bucketing with `$bucket`

Task B1

Task B2

Part C — Joining with `$lookup`

Task C1

Task C2

Task C3

Part D — Combining Grouping and Joining

Task D1

Task D2

Task D3

Task D4

Part E — Creative Challenge

Task E1

Task E2

Task E3

Graph View

Table of Contents

Deep Thought

Explorer

Exercise 3 - Grouping and Joining

Exercise 3 — Grouping and Joining

Part A — Grouping with $group

Task A1

Task A2

Task A3

Task A4

Task A5

Part B — Bucketing with $bucket

Task B1

Task B2

Part C — Joining with $lookup

Task C1

Task C2

Task C3

Part D — Combining Grouping and Joining

Task D1

Task D2

Task D3

Task D4

Part E — Creative Challenge

Task E1

Task E2

Task E3

Graph View

Table of Contents

Part A — Grouping with `$group`

Part B — Bucketing with `$bucket`

Part C — Joining with `$lookup`