MongoDB Indexing for Performance
Last month I had to solve a critical performance issue in our production MongoDB database. What started as routine query optimization turned into an adventure with compound indexes, background indexing, and unexpected traffic interruptions. Here's what I learned.
The Warning Signs
It began with random slowdowns in our user dashboard. Response times would jump from 200ms to over 5 seconds during peak hours. At first I suspected network issues, but our monitoring showed the bottleneck was in database query time.
Looking at our MongoDB logs, I saw these warnings multiplying:
2023-06-01T15:32:45.123+0000 I COMMAND [conn3] command users.profiles command: find { find: "profiles", filter: { "lastActive": { $gt: ISODate("2023-05-01") }, "region": "us-west" }, sort: { lastActive: -1 } } planSummary: COLLSCAN keysExamined:0 docsExamined:1250000 cursorExhausted:1 numYields:9813 nreturned:215 reslen:1421 locks:{ Global: { acquireCount: { r: 19628 } }, Database: { acquireCount: { r: 9814 } }, Collection: { acquireCount: { r: 9814 } } } storage:{ data: { bytesRead: 650MB, timeReadingMicros: 15250000 } } protocol:op_msg 5250msThe key red flag was planSummary: COLLSCAN - a full collection scan. MongoDB was reading through our entire profiles collection (1.25 million documents) to find 215 matching records. No wonder it was taking 5 seconds!
First Attempt: Simple Index
My first instinct was to add a simple index on the filtered field:
db.profiles.createIndex({ "lastActive": 1 })After running this command, I checked if queries were using the index:
db.profiles.find({
"lastActive": { $gt: ISODate("2023-05-01") },
"region": "us-west"
}).sort({ lastActive: -1 }).explain("executionStats")The explain output was disappointing:
{
"executionStats": {
"executionSuccess": true,
"nReturned": 215,
"executionTimeMillis": 2100,
"totalKeysExamined": 450000,
"totalDocsExamined": 450000,
"executionStages": {
"stage": "FETCH",
"inputStage": {
"stage": "IXSCAN",
"keyPattern": { "lastActive": 1 },
// ...additional details omitted
}
}
}
}Better, but still 2.1 seconds - far from ideal. MongoDB was using the index, but still examining 450,000 documentsbecause it had to filter the "region" field after retrieving documents from the index.
The Problem with Sorting Direction
I realized another issue - my index was ascending (1), but my query was sorting in descending order (-1). This meant MongoDB had to read the entire result set into memory and then sort it in reverse.
I created a new index to match the sort direction:
db.profiles.createIndex({ "lastActive": -1 })This improved things to about 1.8 seconds - better, but still not good enough for our dashboard.
Compound Index: The Right Approach
After more research, I realized what we needed was a compound index that covered both filter fields and matched the sort order:
db.profiles.createIndex({
"region": 1,
"lastActive": -1
})With this index, MongoDB could filter by region first (an exact match), and then filter and sort by lastActive within that subset in a single operation.
I ran explain again and saw dramatic improvements:
{
"executionStats": {
"executionSuccess": true,
"nReturned": 215,
"executionTimeMillis": 42,
"totalKeysExamined": 215,
"totalDocsExamined": 215,
// ...
}
}From 5 seconds down to 42ms - a 100x speed improvement! MongoDB was now examining only the 215 documents it needed to return.
The Background Indexing Disaster
So I had a solution, but I made a crucial mistake in deploying it. I ran this index creation directly on our production database during business hours, without setting the background option:
// DON'T DO THIS in production without background:true
db.profiles.createIndex({ "region": 1, "lastActive": -1 })Within seconds, our Slack channels exploded with alerts. The indexing operation had locked the collection, blocking ALL write operations to the profiles collection. Our users couldn't update their profiles, and new user registrations were failing.
Warning: In MongoDB 4.2 and earlier, creating an index without the background option will lock the collection for the duration of the index build. Always use background indexing in production!
I had to kill the indexing operation and restart from scratch with background indexing enabled:
db.profiles.createIndex(
{ "region": 1, "lastActive": -1 },
{ background: true }
)This time, users didn't notice any interruption. The index took longer to build (about 15 minutes instead of 5), but it was a fair trade-off for not disrupting the service.
Tuning the Index to Match Query Patterns
With the immediate problem solved, I analyzed our query patterns more carefully. I discovered we had several variations of queries against the profiles collection:
// Query 1: Filter by region, sort by lastActive
db.profiles.find({ "region": "us-west" }).sort({ "lastActive": -1 })
// Query 2: Filter by region and status, sort by lastActive
db.profiles.find({
"region": "us-west",
"status": "active"
}).sort({ "lastActive": -1 })
// Query 3: Filter by region and date range, sort by lastActive
db.profiles.find({
"region": "us-west",
"lastActive": { $gt: ISODate("2023-05-01") }
}).sort({ "lastActive": -1 })To support all these query patterns efficiently, I created a more comprehensive compound index:
db.profiles.createIndex(
{
"region": 1,
"status": 1,
"lastActive": -1
},
{ background: true }
)With this index, MongoDB can efficiently handle all three query patterns:
- Query 1 uses the first and third fields (region, lastActive)
- Query 2 uses all three fields
- Query 3 uses the first and third fields with a range condition on lastActive
The Key Insight: Field Order Matters
The key insight was understanding the order of fields in a compound index matters significantly:
- Equality fields first (exact matches like region, status)
- Sort fields next (in the correct sort direction)
- Range fields last (like date ranges)
Results and Lessons Learned
After implementing the optimized compound index, our dashboard query times dropped from 5 seconds to consistently under 50ms, even during peak traffic. User complaints about slowness disappeared, and our server load decreased significantly.
The key lessons I learned from this experience:
- Always use explain() to understand query performance - don't guess what's happening
- Match index order and direction to your queries - consider both filtering and sorting
- Use compound indexes for complex queries - with equality conditions first, then sort fields, then ranges
- Always build indexes in the background in production - unless you enjoy late-night emergency calls
- Monitor index size - each additional index increases storage requirements and slows writes
After our dashboard fix, we conducted a comprehensive review of all our MongoDB queries and indexes. We found and optimized several other slow queries and removed unused indexes that were slowing down write operations.
The final result was a more responsive application and happier users - well worth the indexing adventure!