ORMs have been a huge time saver for society. They line up pretty well with the "enterprise"-y software most of us write. They eliminate a good class of potential errors involved in marshalling data through SQL. The number of things expressible only in SQL has gone way down over the years.

There is a bit of a danger in how ORMs get used. The classy nature of ORMs makes us reach for seductively simple OOP tools. But these tools cut us sharply, especially when working in a language without much static checking.

I want to show you a safer way of working with ORMs: one that relies on more basic primitives, but that avoids many common mistakes I have run into with the standard approaches.

Let's say you're building a simple Django messaging app:

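from django.contrib.auth.models import User
from django.db import models
from django.utils import timezone

class Message(models.Model):
    date = models.DateTimeField(default=timezone.now)
    # two foreign keys to the same model need distinct related_names
    sender = models.ForeignKey(User, on_delete=models.CASCADE, related_name='sent_messages')
    receiver = models.ForeignKey(User, on_delete=models.CASCADE, related_name='received_messages')
    body = models.TextField()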

So you have this raw model that gives you a nice interface to the database to create messages, and query for messages.

You could write pages to send messages and view them:

from django.shortcuts import redirect, render

def inbox_view(request):
    messages = Message.objects.filter(receiver=request.user)
    return render(request, 'inbox.html', {'messages': messages})

def send_message_view(request, receiver_id):
    receiver = User.objects.get(username=receiver_id)
    # create the message (assuming the form field is named 'body')
    Message.objects.create(
        sender=request.user,
        receiver=receiver,
        body=request.POST['body'],
    )

    return redirect(inbox_view)

You release your app and it mostly works. But then the first feature request comes in: users want to be notified by email when someone sends them a message.

We'll just add a function call to the view that sends messages:

  def send_message_view(request, receiver_id):
      ...
      send_email(receiver.email, "You have a new message!")
      ...

Another feature request comes in, to bulk send messages to many users:

  def send_bulk_message_view(request):
      ...
      for receiver in receiver_list:
          Message.objects.create(
             sender=request.user,
             receiver=receiver,
             body=message_body
          )

That's the straightforward way to write this, but we missed a spot! We forgot to add notifications!

Clearly we should be sending notifications no matter what the sending method is.

The usual ORM approach is to override the saving mechanism in your model. That way, no matter where you create your messages, a notification gets sent:

class Message(models.Model):
    ...
    def save(self, *args, **kwargs):
        super(Message, self).save(*args, **kwargs)
        send_email(self.receiver.email, "You have a new message!")

Awesome, all creations should be caught, and you'll always send the notification.

You add a feature to edit the messages.

Editing the message calls the model save method, so now a user gets a notification each time the message is edited. Time to fix that as well:

class Message(models.Model):
    ...
    def save(self, *args, **kwargs):
        # no primary key yet => creation, not edit
        is_creation = self.pk is None
        super(Message, self).save(*args, **kwargs)
        if is_creation:
            send_email(self.receiver.email, "You have a new message!")

You decide to backfill some messages from a legacy system, by creating a bunch of new Messages. Your users get a bunch of spam notifications for messages from 3 years ago. Not super intended.

The problem with overriding save is that it breaks a good OOP principle: the Liskov substitution principle (the "L" in SOLID). If some code is working with Models in general, and happens to have Messages, it should be able to call save without a bunch of unexpected side effects. The standard advice of overriding to change behaviour has a high probability of unintended consequences.
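To make the substitution problem concrete, here is a minimal sketch in plain Python rather than Django. `Model`, `import_rows`, and the list-backed `send_email` are hypothetical stand-ins, not Django APIs; the only point is that code written against the base class triggers the subclass's side effect without asking for it.

```python
sent_emails = []  # stand-in for a real mail backend (assumption)


def send_email(address, subject):
    sent_emails.append((address, subject))


class Model:
    """Hypothetical stand-in for an ORM base class: save() only persists."""
    _next_pk = 1

    def __init__(self):
        self.pk = None

    def save(self):
        # Assign a primary key on first save, like an INSERT would.
        if self.pk is None:
            self.pk = Model._next_pk
            Model._next_pk += 1


class Message(Model):
    def __init__(self, receiver_email, body):
        super().__init__()
        self.receiver_email = receiver_email
        self.body = body

    def save(self):
        # The override adds a side effect the base class never promised.
        is_creation = self.pk is None
        super().save()
        if is_creation:
            send_email(self.receiver_email, "You have a new message!")


def import_rows(objects):
    """Generic code written against Model: it expects save() to just persist."""
    for obj in objects:
        obj.save()


legacy = [Message("old@example.com", "hello from 3 years ago")]
import_rows(legacy)
print(sent_emails)  # [('old@example.com', 'You have a new message!')]
```

Nothing in `import_rows` mentions email, yet running it sends one. That is the backfill surprise in miniature.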

Your top-level application code should almost never be directly calling ORM methods. A blunt guideline, but one that pays off quickly.


An alternative approach:

def send_message(sender, receiver, message):
    return Message.objects.create(sender=sender, receiver=receiver, body=message)

def lookup_user(user_id):
    # get() returns a single User; filter() would return a queryset
    return User.objects.get(username=user_id)

def send_message_view(request, receiver_id):
    receiver = lookup_user(receiver_id)
    send_message(request.user, receiver, request.POST['body'])
    return redirect(inbox_view)

Instead of directly accessing the ORM, we write a (thin) layer over it to represent our business logic. This seems obvious, but we're often tempted to skip it for simple things. Even for simple things, though, noting the intent is useful, so going the extra mile with these functions is worth it.

When message.save() is run, it's impossible for the code to know the intent, or the source of the request. But if send_message is called, then it's much clearer. And, perhaps more importantly, the range of inputs to deal with is much more restricted.

If you go with this, implementing bulk sending and notifications takes little time, and is far more likely to be right the first time:

def notify_about_message(msg):
    # TODO check if user disabled email notifications
    send_email(msg.receiver.email, "You have a message!")

def send_message(sender, receiver, message):
    msg = Message.objects.create(sender=sender, receiver=receiver, body=message)
    notify_about_message(msg)
    return msg

def bulk_send_message_view(request):
    ...
    for receiver in receiver_list:
        send_message(request.user, receiver, message_body)

It feels annoying, because you're writing a bunch of one-line functions. It feels like over-engineering.

But this is also great documentation. And you're giving yourself clean entry points for many more features. This model works especially well for the sort of CRUD enterprise apps a lot of us are building.

My recommendation is to resist the urge to skip this abstraction, and to try to make your business logic layer protect the top layer from the ORM. The same logic extends to calls into third-party libraries. You will end up with an application that is much more refactorable than the alternative.