Exercise 1 — Foundations and First Pipeline

Topics: $match, $sort, $skip, $limit, basic pipeline construction

Reading before you start: 01 - Query Methods, 02 - The Aggregation Pipeline, 03 - Sample Data, 04 - Document Stages (sections on $match, $sort, $skip, $limit)

Database: aggregation_lab — see Setup - Database


Part A — Simple Filtering with $match

Write an aggregation pipeline for each task. Use only $match.

Task A1

Find all employees who have the role "DEVELOPER".

Expected result: 5 documents (Anna, Markus, Elena, David, Patrick, Michael)

Task A2

Find all employees who are not active (active: false).

Task A3

Find all employees who earn more than 50,000.

Task A4

Find all projects with priority "CRITICAL".

Task A5

Find all tasks that are currently "IN_PROGRESS".

Task A6

Find all orders that were paid via "PAYPAL" and have the status "DELIVERED".


Part B — Sorting and Pagination

Task B1

Find all employees, sorted by salary in descending order (highest first).

Task B2

Find all employees, sorted by age in ascending order.

Task B3

Find all tasks, sorted by hours descending. Return only the top 5 tasks with the most hours.

Task B4

Find all employees, sorted by salary descending. Skip the first 3 and return the next 4. This simulates page 2 of a paginated list with page size 4.


Part C — Combining Stages

Build pipelines with multiple stages chained together.

Task C1

Find all active employees with the role "DEVELOPER", sorted by salary descending.

Task C2

Find all tasks with status "TODO", sorted by hours descending, and return only the first 3.

Task C3

Find all orders with status "DELIVERED", sorted by date ascending. Skip the first 2 results and return the next 3.

Task C4

Find all projects that are "IN_PROGRESS" and have priority "HIGH" or "CRITICAL". Sort them by estimatedCost descending.

Hint: You can use $in inside $match to check if a field’s value is one of several options.

Task C5

Find all employees hired after January 1st, 2020 (hireDate: { $gt: ISODate("2020-01-01") }) who are active and have a salary below 50,000. Sort by hireDate ascending.


Part D — Exploration Tasks

These tasks are open-ended. Think about what information you want to extract, then build a pipeline to get it.

Task D1

You are preparing a presentation about the company’s workforce. Using only the stages you’ve learned this week ($match, $sort, $skip, $limit), create three different queries that each answer a different interesting question about the employees collection.

Write down your questions and the pipelines you used to answer them.

Task D2

Explore the log_entries collection. Write a pipeline that finds the most recent error entries, sorted with the newest first, limited to 5 results.

Task D3

Imagine you are building a task management dashboard. Write pipelines that would provide:

  1. The 3 most time-intensive TODO tasks
  2. All tasks for a specific project (pick one of the project IDs from the data)
  3. All DONE tasks, sorted by hours ascending

Reflection: After completing the exercises, think about the following:

  • Why does the order of stages matter?
  • What happens if you put $limit before $sort? Try it and observe the difference.
  • When would you use $skip + $limit together in a real application?