3 MongoDB Multi-Tenant SaaS Mistakes (Production Lessons)

I spent the last ten months as the lead engineer on a multi-tenant booking SaaS. Node.js, Mongoose, MongoDB Atlas, deployed on Cloud Run. The product is live with real tenants and real bookings.

It also shipped with three database mistakes I made along the way. This post is the honest version of what I’d tell myself on day one — and what every developer building a multi-tenant SaaS in MongoDB should think about before model number three.

Quick context

The project: a multi-tenant booking platform built for a client. Modeled loosely on field-service SaaS like fieldd.co.
Stack: Node.js + Express + Mongoose + MongoDB Atlas, deployed on Cloud Run.
Why MongoDB and not MySQL: the product needed real-time features — live booking updates, Socket.io for worker dashboards, instant push notifications — and the client’s brief was explicit about building on the latest technology stack. Node + Mongo fit both.
Scale at the time of writing: roughly ten months in production, multiple tenants, active bookings.
My role: lead engineer. Schema, indexes, query patterns, the entire database layer was mine to design.

I came into this project from a Laravel and MySQL background. That matters — because the mistakes you’re about to read are the ones I made because my Laravel instincts didn’t translate cleanly to a framework that doesn’t hold your hand. MongoDB and Mongoose ask you to make every safety decision yourself. Laravel makes most of them for you by default.

Here are the three I got wrong.

Mistake #1: I trusted myself to remember the tenant filter on every query

In a multi-tenant app, every read and every write has to be scoped to the current tenant. In MongoDB, that means every find, findOne, update, delete, and aggregate call needs a { vendorId: <currentVendor> } clause. Always. Without exception.

There is no row-level security in MongoDB. Mongoose has no built-in tenant scope. The rule lives in your head and your code review process.

What went wrong

The rule leaked. More than once.

A new endpoint gets shipped. Someone writes Model.findById(req.params.id) without an additional tenant filter. The route goes live. Customer A’s request pulls up customer B’s record because the IDs are guessable, scraped from a stale URL, or simply iterated through.

Looking back at the project’s commit history, multiple “fix multi-tenant data leak” patches landed across different models over the first several months. Different developers. Same mistake. Same shape.
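To make the shape of the leak concrete, here is a minimal sketch that uses an in-memory array as a stand-in for the bookings collection. The documents and both helper names are made up for illustration; the two helpers only mirror what Model.findById(id) and Model.findOne({ _id: id, vendorId }) match on.

```javascript
// In-memory stand-in for a bookings collection; the documents are invented.
const bookings = [
  { _id: 'b1', vendorId: 'vendorA', customer: 'Alice' },
  { _id: 'b2', vendorId: 'vendorB', customer: 'Bob' },
];

// Mirrors Model.findById(id): matches on _id alone.
function findByIdUnscoped(id) {
  return bookings.find((doc) => doc._id === id) || null;
}

// Mirrors Model.findOne({ _id: id, vendorId }): both fields must match.
function findByIdForTenant(id, vendorId) {
  return (
    bookings.find((doc) => doc._id === id && doc.vendorId === vendorId) || null
  );
}

// A vendorA user requests booking b2, which belongs to vendorB.
const leaked = findByIdUnscoped('b2');             // vendorB's booking comes back
const scoped = findByIdForTenant('b2', 'vendorA'); // null, as it should be
```

The unscoped lookup has no way to know the ID belongs to someone else. The scoped one treats "wrong tenant" exactly like "not found", which is usually what the API should report as well.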

Why it kept happening

“Always include the tenant filter” is not a security control. It’s a hope.

The codebase had no schema-level scope, no Mongoose middleware, no test that fails when the filter is missing. The rule lived in a CLAUDE.md-style internal doc. New developers don’t read internal docs before writing their first query — they read the existing code, copy the closest-looking pattern, and ship.

Every time the rule leaked, the answer was “we’ll be more careful.” That answer doesn’t scale. The fifth time the same shape of bug shows up, you have to admit that “be more careful” was never the fix.

The fix: a Mongoose query plugin that fails loudly

The cleanest defense is a plugin that throws when a tenant-scoped model is queried without a tenant filter. The escape hatch is explicit, not implicit.

// plugins/tenantScope.js
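function tenantScopePlugin(schema) {
  // Query middleware to guard. findById, findByIdAndUpdate, and
  // findByIdAndDelete run through findOne, findOneAndUpdate, and
  // findOneAndDelete, so they are covered too. aggregate() is separate
  // middleware and is not handled here.
  const tenantOps = [
    'find', 'findOne',
    'findOneAndUpdate', 'findOneAndDelete', 'findOneAndReplace',
    'updateOne', 'updateMany', 'replaceOne',
    'deleteOne', 'deleteMany',
    'countDocuments'
  ];

  schema.pre(tenantOps, function () {
    // Explicit escape hatch, e.g. Model.find({}).setOptions({ bypassTenant: true })
    if (this.options.bypassTenant) return;

    // Accept either field name while both conventions exist in the codebase.
    const filter = this.getFilter();
    const hasTenant = filter.vendorId || filter.vendor;

    if (!hasTenant) {
      throw new Error(
        `Tenant filter required on ${this.model.modelName}. ` +
        `Pass { vendorId } or set { bypassTenant: true } explicitly.`
      );
    }
  });
}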

module.exports = tenantScopePlugin;

Apply it only to tenant-scoped models; platform-level models like Industry, Module, and Square opt out by design.

// models/booking.model.js
const tenantScopePlugin = require('../plugins/tenantScope');

BookingSchema.plugin(tenantScopePlugin);

This fails the first time the unscoped query runs in development, before the code reaches code review, let alone production. The escape hatch (bypassTenant: true) is loud enough that anyone using it has to consciously justify it.

If your project has been running for months without this, don’t try to bolt it onto every model in one sprint. Wire it into new models as you create them, and migrate existing models opportunistically when you touch them for other reasons. The bleeding stops fastest where new code is being written.
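One gap worth knowing about: aggregate() does not go through query middleware, so a plugin built on find-style hooks never sees a pipeline. Below is a sketch of a companion hook, not code from the project. The file name and function names are hypothetical, and it assumes the convention that a tenant-scoped pipeline opens with a $match on vendorId (or vendor). Pipelines that must start with another stage, such as $geoNear, would need the bypass or a looser check.

```javascript
// plugins/tenantScopeAggregate.js (hypothetical companion file)

// True when the first pipeline stage is a $match on vendorId or vendor.
function startsWithTenantMatch(pipeline) {
  const first = pipeline[0];
  if (!first || !first.$match) return false;
  return first.$match.vendorId != null || first.$match.vendor != null;
}

function tenantScopeAggregatePlugin(schema) {
  // In aggregate middleware, `this` is the Aggregate object.
  schema.pre('aggregate', function () {
    // Same explicit escape hatch, set with .option({ bypassTenant: true }).
    if (this.options && this.options.bypassTenant) return;

    if (!startsWithTenantMatch(this.pipeline())) {
      throw new Error(
        'Tenant filter required: start the pipeline with ' +
        '{ $match: { vendorId } } or set { bypassTenant: true } explicitly.'
      );
    }
  });
}

module.exports = { tenantScopeAggregatePlugin, startsWithTenantMatch };
```

With this applied, Booking.aggregate([{ $match: { vendorId } }, ...]) passes, while a pipeline that opens with $group throws before it reaches the database.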

The Laravel parallel

This is what global scopes are for. A BelongsToTenant trait that calls addGlobalScope on every tenant-scoped model enforces the same rule at the model layer, except that it adds the tenant filter automatically instead of throwing when it is missing.

The Laravel default is still “remember to use the trait” — which is also a hope. The improvement is that traits and global scopes are normal Laravel idioms, so engineers reach for them automatically. In Mongoose, nothing automatic exists. You have to build the safety net yourself.

Mistake #2: I named the tenant column two different things

This is the one that bit hardest, and the one I’d warn every team about first.

Older models in the codebase declared the tenant reference as vendor: ObjectId. Newer models declared it as vendorId: ObjectId. Both were valid Mongoose. Both shipped. Neither was wrong on its own.

The problem only shows up when you write a query.

Why it became a real problem

Querying { vendor: id } against a vendorId model returns zero rows. No error. No warning. The query succeeded — there’s just nothing in it.

This is a silent failure mode that’s especially nasty in MongoDB. SQL would surface a column-not-found error and you’d see a stack trace. MongoDB cheerfully runs the query, finds nothing matching vendor because that field doesn’t exist on these documents, and returns an empty result set.

It bit me hardest in the services module and the service-area module. Both got built in the older naming era — they used vendor. Then a refactor on adjacent models switched to vendorId, and a new endpoint that joined data across services and the newer models accidentally used vendorId everywhere. The query returned empty. The frontend showed “no services found.” The bug wasn’t in the controller or the route — it was in the column name. It took longer than it should have to find, because nothing in the response said “you’re querying the wrong field.” It just said nothing.

That single afternoon of debugging cost more than the hour-long day-one meeting that would have locked the naming convention.
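Here is a small sketch of why the failure is silent, and of one way to make it loud. It runs against an in-memory array, not MongoDB: matchAll is a stand-in for an equality filter, and assertKnownPaths is a hypothetical guard that is not part of Mongoose. In a real project the set of known paths could be derived from the schema (for example from Object.keys(schema.paths)). Mongoose also has a strictQuery option that governs how unknown filter fields are treated. Its default has changed between major versions, so check what your version does before relying on it.

```javascript
// Field names declared by a hypothetical Service schema from the "vendor" era.
const servicePaths = new Set(['_id', 'name', 'vendor']);

// Stand-in for an equality filter: a field that is absent from a document
// never equals the requested value, so the document is skipped without error.
function matchAll(docs, filter) {
  return docs.filter((doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value)
  );
}

// Hypothetical guard: reject filter keys the schema does not declare.
// Operators such as $and / $or are skipped, not inspected, in this sketch.
function assertKnownPaths(filter, knownPaths, modelName) {
  for (const key of Object.keys(filter)) {
    if (key.startsWith('$')) continue;
    const root = key.split('.')[0];
    if (!knownPaths.has(root)) {
      throw new Error(`Unknown field "${key}" in filter for ${modelName}`);
    }
  }
}

const services = [{ _id: 's1', name: 'Deep clean', vendor: 'v1' }];

// Wrong field name: an empty result and no error of any kind.
const silent = matchAll(services, { vendorId: 'v1' });

// Same filter through the guard: the mistake is named explicitly.
let caught = null;
try {
  assertKnownPaths({ vendorId: 'v1' }, servicePaths, 'Service');
} catch (err) {
  caught = err.message;
}
```

An error message that names the field would have turned the afternoon of debugging into a minute.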

The cost, beyond the bug

Even after the immediate bug was fixed, the cost compounded:

Every developer joining the project had to grep the model definition before writing any query against an unfamiliar collection.
Code reviews carried an extra “is this the vendor model or the vendorId one?” check on every PR that touched the database.
A few production bugs traced directly to the wrong column name.

This is what a Laravel/MySQL engineer might call schema drift, but it’s worse in a document store because there’s no information_schema you can query to remind yourself of the truth.

The fix

Pick one. Document it once. Apply the rule to new models from the next commit forward.

Renaming the field across existing collections is pure churn: a data migration on a live database, every query in the codebase touched, risk of breaking webhooks and integrations. Not worth it.

Stop the bleeding by deciding the rule before model number twelve.

The right move would have been a coding-standards entry on day one:

All tenant references on a Mongoose schema use vendorId: ObjectId. Indexed. Required. No exceptions, no abbreviations, no vendor shortcut on “this one model that’s special.”

If you’re starting a multi-tenant project today, write that line into your README before you write your first model. One line pinned in the repo would have saved me an afternoon of debugging and a paragraph of this blog post.
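A README line still depends on people reading it. One way to make the convention mechanical is to declare the tenant field in a single shared helper and spread it into every tenant-scoped schema. This is a sketch under assumptions: the file name, the tenantField helper, and the 'Vendor' model name are all hypothetical, and the ObjectId type is passed in as a parameter only so the snippet has no dependencies.

```javascript
// schemas/tenantField.js (hypothetical file name)

// Returns the one approved definition of the tenant reference.
// In a real project ObjectIdType would be mongoose.Schema.Types.ObjectId;
// 'Vendor' is an assumed model name.
function tenantField(ObjectIdType) {
  return {
    vendorId: {
      type: ObjectIdType,
      ref: 'Vendor',
      required: true,
      index: true,
    },
  };
}

// Intended use inside a model file:
//   new mongoose.Schema({
//     ...tenantField(mongoose.Schema.Types.ObjectId),
//     startsAt: Date,
//   });

module.exports = tenantField;
```

With the helper in place, a reviewer only has to check that a new model uses it, not how the field is spelled.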

The Laravel parallel

Same lesson, different flavor. tenant_id vs team_id vs account_id is the Laravel/MySQL version. You see it most often in projects that started as single-tenant and grew into multi-tenant: early models reference user_id directly, later models add a team_id, and queries that join across the two get confusing fast.

Pick one in week one. Write it down. Apply forever.

Mistake #3: I thought transactions were “for inventory”

The inventory service in this project uses MongoDB transactions correctly. startSession, withTransaction, abort on error. Stock decrement, stock increment, transfers — all wrapped properly. Textbook.

I skipped them everywhere else.

Where it bit

Booking creation writes to four collections in sequence:

Booking — the booking record itself
Customer — created if the customer is new
NotificationLog — the entry that drives confirmation SMS and email
VendorPlan — decrement the tenant’s plan usage counter

None of it was wrapped in a session. Four independent writes, each succeeding or failing on its own.

When everything works, you don’t see the problem. The Cloud Run instance handles the request, all four writes commit, the customer gets their confirmation, and the day moves on.

The problem shows up when the instance dies mid-request. Cloud Run recycles containers. Network blips. Functions time out. Any of these can drop the request between write number one and write number four.

The booking exists. The notification entry doesn’t. The customer’s confirmation never sends. The audit log doesn’t reflect the failure. A support ticket arrives a day later: “I made a booking and never got a confirmation.” Nobody can reproduce it because by the time anyone investigates, the failure-case state has already drifted out of memory.

The same shape repeats on the payment flow:

Transaction created — the payment record
Booking.payment.status updated to paid
Customer.totalSpend incremented

If the request dies after the first write and before the rest, your books and your bookings disagree about whether the customer paid. The reconciliation conversation you have with the client a month later is not fun.

Why I skipped transactions

Three reasons, in roughly the order they were wrong:

“Transactions are slow.” For short writes against indexed collections in the same replica set, the overhead is real but small — a few milliseconds. Not a reason to skip them.
“MongoDB doesn’t really do transactions.” It does. Multi-document ACID transactions have been generally available since MongoDB 4.0 on replica sets (4.2 added sharded clusters), and the session’s withTransaction helper, which Mongoose exposes from the underlying driver, retries on transient errors automatically. The mental model from older Mongo content (“this is a document store, transactions aren’t really a thing”) is out of date.
“It worked when I tested it.” This is the honest one. Transactions don’t matter when nothing fails. They matter the one time in a thousand when something does, and that’s exactly the case I never tested.

The fix

Wrap every multi-collection write that has to land together in a Mongoose session:

const session = await mongoose.startSession();

try {
  await session.withTransaction(async () => {
    await Booking.create([bookingData], { session });

    await Customer.updateOne(
      { _id: customerId, vendorId },
      { $inc: { totalBookings: 1 } },
      { session }
    );

    await NotificationLog.create([logData], { session });

    await VendorPlan.updateOne(
      { vendorId },
      { $inc: { 'usage.bookings': 1 } },
      { session }
    );
  });
} finally {
  await session.endSession();
}

Three rules to internalize alongside the syntax:

External API calls stay outside the session. Twilio, SendGrid, Stripe webhook responses: none of these go inside withTransaction. Holding a Mongo session open across a network call to a third party is its own antipattern, and because withTransaction can re-run the callback after a transient error, a side effect inside it can fire more than once. Send the SMS after the transaction commits, not during it.
Use withTransaction, not raw startTransaction / commitTransaction. The helper handles transient retry automatically, which saves you a hand-written retry loop around TransientTransactionError.
endSession always goes in finally. Sessions leak otherwise.
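The same three rules applied to the payment flow might look like the sketch below. It is not the project's code. The dependencies are injected (mongoose, the three models, and a hypothetical sendReceipt function) so the ordering can be exercised with fakes, and the fields on the Transaction document are assumptions. Only Booking.payment.status and Customer.totalSpend come from the flow described earlier.

```javascript
// Sketch only: dependencies are injected so the flow can run without a database.
async function recordPayment(deps, { bookingId, customerId, vendorId, amount }) {
  const { mongoose, Transaction, Booking, Customer, sendReceipt } = deps;
  const session = await mongoose.startSession();
  let paymentId;

  try {
    await session.withTransaction(async () => {
      // create() takes an array when options such as session are passed.
      const [payment] = await Transaction.create(
        [{ bookingId, customerId, vendorId, amount }], // assumed field names
        { session }
      );
      paymentId = payment._id;

      await Booking.updateOne(
        { _id: bookingId, vendorId },
        { $set: { 'payment.status': 'paid' } },
        { session }
      );

      await Customer.updateOne(
        { _id: customerId, vendorId },
        { $inc: { totalSpend: amount } },
        { session }
      );
    });
  } finally {
    // Runs on success and on failure, so the session never leaks.
    await session.endSession();
  }

  // Reached only if the transaction committed; a throw above skips this line.
  await sendReceipt({ paymentId, customerId });
  return paymentId;
}
```

If any of the three writes throws, none of them is committed, the session is still closed, and the receipt is never sent. The receipt call sits after the finally block on purpose, so a retry of the transaction callback cannot send it twice.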

The Laravel parallel

DB::transaction(function () { ... }). Same idea, fewer lines.

If two writes have to succeed together for the data to make sense, they go in the closure. Laravel does the wrapping for you, deadlock retry is available through the optional attempts argument, and keeping external API calls outside the closure is the same rule. The decision is identical; the syntax is shorter.

What I’d do differently on day one

If I were starting this project over with what I know now, three small commits in the first week would have prevented every mistake in this post:

A tenant-scope plugin written before the second model is created. It’s an afternoon of work and it makes the tenant filter rule mechanical instead of cultural.
A short coding-standards file pinned in the repo before model number three. Two paragraphs is enough. The tenant column name, the soft-delete strategy, the unique-constraint pattern.
Transactions wrapped around any flow that touches more than one collection, from the very first booking-creation route. The cost is tiny. The cost of not having them is invisible until it isn’t.

None of these are large changes. Each is an afternoon. The cost of skipping them compounds for the entire life of the project.

What this taught me about stack choice

I came into this project from Laravel and MySQL, and I expected MongoDB to be the source of new problems.

It wasn’t. Every mistake in this post is fully possible in Laravel and MySQL. A developer can forget a tenant filter in Eloquent. Two engineers can name the tenant column differently across two migrations. A multi-table write can ship without DB::transaction wrapping it.

The difference is what the framework does about it. Laravel’s global scopes, Eloquent’s conventions, and DB::transaction make the right path the obvious one. MongoDB and Mongoose mostly leave it to you. That’s not a flaw — it’s a deliberate philosophical choice. The cost of that choice is what this post is about.

Knowing that tradeoff before you pick the stack is the actual lesson. Real-time features pulled this project toward Node and Mongo, and I’d make the same call again. But I’d open the README on day one with the three rules above, before I wrote a single schema.

Things I got right (a teaser for the follow-up)

The same project has a few things worth showing off — composite indexes that lead with the tenant column on every hot collection, money fields typed as Number and never String, unique constraints scoped per tenant at the database layer, and webhook routes mounted before the JSON body parser so HMAC verification gets raw bytes.

That’s the next post.

If you’re shipping a multi-tenant SaaS — in any stack — and want a second pair of eyes on the schema before it becomes irreversible, hiremuneeb@gmail.com or DM on LinkedIn. I review schema designs before they ship and after they break — same playbook, lower stakes when it’s the first one.
