Auto-incrementing IDs on MongoDB

Posted on . Reading time: 4 mins. Tags: mongodb, python, mongoengine.

In relational databases, it is a common pattern to use auto-incremental integers as IDs for table rows. On MongoDB, if you don't set an _id field on document creation, the document will get an ObjectId value automatically assigned to it as _id.

This is fine in most cases because usually, you don't need to expose that ID to the users of the API (or on the UI presenting that data). But sometimes you might to create an ID that you want to share with the users, because they might need to reference a specific data object during conversations, and ObjectID values are not suited to that. For example, it's easier to say "Hey, invoice INV-204 is still due" than "Hey, invoice 507f1f77bcf86cd799439011 is still due".

One way to accomplish that is using triggers, but that's a feature only available on MongoDB Atlas and not for the self-hosted product.

The alternative is to use a collection as a counter for sequences. I found the concept explained on an old version of the official documentation, but I couldn't find it on the newest version. Anyway, the idea is to have a collection that will be dedicated to keep track of auto-incrementing IDs. For each collection where you want to use an ID of this kind, you will have a document on this collection.

What makes it work is to make sure that we are generating the new value and updating the collection storing it atomically. This is, that they happen on the same step. We want to avoid this scenario:

  1. Assume our current stored latest ID is 41.
  2. Thread A tries to generate a new ID. It checks the latest stored ID, and it increments it by 1 to get 42.
  3. Thread B tries to generate a new ID. It also checks the latest stored ID, which is still 41, so it decides the next one is 42.
  4. Thread A stores 42 as the latest ID.
  5. Thread B stores 42, again, as latest ID. Problem #1: we didn't auto-increment properly.
  6. Thread A stores a document (using the same example as before, an invoice) using that ID, so invoice number 42.
  7. Thread B stores also tries to store a new invoice, using the same ID. This is going to create problem #2, a clash of IDs.

Instead, we want:

  1. Assume our current stored latest ID is 41.
  2. Thread A tries to generate a new ID. It checks the latest stored ID, it increments it by 1 to get 42 and it stores that new value in the database. All these 3 steps (fetch, increment, and store) happen as a single operation.
  3. Thread B tries to generate a new ID. It also checks the latest stored ID, which is now 42, so it decides the next one is 43. Like before, it will fetch, increment, and store the new value in a single operation.
  4. Thread A stores a document using that ID, so invoice number 42.
  5. Thread B stores also tries to store a new invoice, which will be invoice 43.

The following is how to implement this solution with Python using MongoEngine. Specifically, to create invoice IDs as INV-{invoice_number}. The solution can easily be ported to PyMongo, but I like to use MongoEngine to help with data validation and treat data as objects and not as dictionaries.

from mongoengine import Document, fields

# This is the collection that will store sequences and their values
class Sequence(Document):
    id = fields.StringField(primary_key=True)
    value = fields.IntField(required=True, default=0)

# This function is used to generate and store a new value for a given sequence
def get_next_id(sequence_name: str) -> int:
    sequence = Sequence.objects(id=sequence_name).modify(
        upsert=True, new=True, inc__value=1
    )
    return sequence.value

# Function to generate invoice IDs
def generate_invoice_id() -> str:
    invoice_number = get_next_id("invoice")
    return f"INV-{invoice_number}"

# Our example Invoice class leveraging the autogenerated invoice ID
class Invoice(Document):
    id = fields.StringField(primary_key=True, default=generate_invoice_id)
    description = fields.StringField(required=True)
    dollar_amount = fields.Decimal128Field(required=True)

# Sample code to create and save a new invoice    
new_invoice = Invoice(description="Writing a blog post", dollar_amount=1000)
new_invoice.save()

Happy coding!