Exercise 2 — Reshaping Documents
Topics:
$addFields,$project,$unwind,$out,$replaceRootReading before you start: 04 - Document Stages (sections on
$unwindand$out), 05 - Structure StagesDatabase:
aggregation_lab— see Setup - Database
Part A — Adding Fields with $addFields
Task A1
Add a field ratingCount to each employee that contains the number of ratings they have. Use $size.
Task A2
Add a field skillCount to each employee that counts how many skills they have. Also add a field seniorEmployee set to the constant value false.
Task A3
For each employee, add a field emailDomain that copies the value of the email field.
Note: We’re just copying the full email here — extracting the domain part requires string expressions we haven’t learned yet. That’s fine.
Task A4
For each project, add a field teamSize that counts how many members are in the team array. Also add a field milestoneCount with the number of milestones.
Task A5
For each project, add the following computed fields:
costDifference: the difference betweenestimatedCostandactualCost(use$subtract)totalReviewScores: the sum of all review ratings for the first employee in theteamarray — this is tricky, skip if stuck and come back later.
Part B — Projecting Fields with $project
Task B1
From the employees collection, create a pipeline that outputs only name, role, and salary for each employee. The _id field should be excluded.
Task B2
From the projects collection, project only the title and status. Additionally, rename the field estimatedCost to budget.
Task B3
Create an employee “summary card” by projecting:
name(keep as-is)role(keep as-is)departmentrenamed fromdepartment_idnumberOfSkillscomputed using$sizeonskillstopRatingcomputed using$maxonratingsaverageRatingcomputed using$avgonratings
Exclude _id.
Task B4
From the orders collection, project:
datestatusitemCount: the number of items in theitemsarraycustomerrenamed fromcustomer_id
Part C — Unwinding Arrays
Task C1
Unwind the skills array in the employees collection. How many documents do you get? Why is this number different from the original 12?
Task C2
Unwind the items array in the orders collection. Project only the date, status, and the unwound items field. How many documents are in the result?
Task C3
Unwind the milestones array in the projects collection. Then filter ($match) to keep only milestones where completed is true. Project title (from the project) and the milestone name and deadline.
Task C4
Unwind the ratings array of employees. Then sort the unwound documents by the individual ratings value in descending order. Limit to 10 results. Project name and ratings.
Part D — Combining Structure Stages
Task D1
Find all active developers. Add a field salaryLevel that is the string "HIGH" if salary is greater than 50000, or "STANDARD" otherwise.
Hint: You haven’t learned
$condyet — instead, solve this by running two separate pipelines: one that matchessalary > 50000and one that matchessalary <= 50000, and observe the results.
Task D2
Build a pipeline on projects:
- Match only projects with status
"IN_PROGRESS" - Add a field
budgetUsedPercentby dividingactualCostbyestimatedCostand multiplying by 100 — use$multiplyand$divide - Project only
title,budgetUsedPercent, andpriority - Sort by
budgetUsedPercentdescending
Task D3
Build a pipeline on tasks:
- Match only tasks with status
"DONE" - Project
title,hours, andproject_id - Sort by
hoursdescending - Save the result to a new collection called
completed_task_reportusing$out
After running it, verify by querying: db.completed_task_report.find()
Task D4
Unwind the items array in orders. Then add a field lineTotal computed as items.price * items.quantity (use $multiply with "$items.price" and "$items.quantity"). Project date, items.product, items.quantity, items.price, and lineTotal. Sort by lineTotal descending.
Part E — Creative Challenge
Task E1
Using $replaceRoot, take the log_entries collection and create a pipeline that replaces each document’s root with just the metadata sub-object. What happens for entries where metadata contains different fields?
Task E2
Design a “project health report”. Using any combination of stages you’ve learned so far ($match, $sort, $limit, $skip, $addFields, $project, $unwind, $out), create a pipeline on the projects collection that:
- Filters to only in-progress or planning projects
- Computes useful metrics (team size, milestone count, budget usage)
- Presents only the most relevant fields
- Saves the output to a new collection
There is no single correct answer — design the report the way you think would be most useful. Write a brief comment at the top explaining what your report shows.
Reflection:
- What is the difference between
$addFieldsand$project? When would you choose one over the other?- What happens to the total document count after
$unwind? Why?- Why must
$outalways be the last stage?