Comprehensive guide to Serializers in Django REST Framework

This article takes a close look at Serializers in the Django REST Framework and explains how they work.

This article takes a close look at serializers in the Django REST Framework and explains how they work. It’s the second part of the comprehensive guides series. Here is a quick summary of the whole series:

  • The first part covers the APIView class – the foundation of API endpoints in the Django REST Framework.
  • The second part talks about the Serializer and ModelSerializer classes, and explains their purpose and use.
  • The third part covers GenericAPIView and its sibling classes designed for working with database models. These are the classes you’re going to use most commonly.
  • The fourth part talks about the ViewSet class and its subclasses, which allow you to create an entire CRUD set of endpoints with just a few lines of code.
  • The fifth part reverse-engineers the internal mechanics of Django REST Framework and explains what exactly happens during a request.

Pre-requisites and assumptions

  • You have a basic understanding of RESTful APIs.
  • You have a basic knowledge of the Django framework.
  • You have Python 3.6+ installed on your system.
  • You have pip installed and ready to use. Optionally, you have virtualenv, venv or pipenv (my favourite) on your system, and you run all commands within a virtual environment (it’s good practice).
  • You’re continuing the project from the previous part of the series or you created a new empty Django project (feel free to use this cheat sheet as help).

What is a Serializer in DRF?

Let’s begin with establishing the concept of serialization in programming. Here is what Wikipedia tells us about it:

In computing, serialization is the process of translating a data structure or object state into a format that can be stored (for example, in a file or memory data buffer) or transmitted (for example, over a computer network) and reconstructed later (possibly in a different computer environment).

To put it simply, to serialize data means to convert it into an intermediary format that can be easily understood by other programs and reconstructed into its original form.

In the context of the web, the most commonly used intermediary data formats are JSON and XML. So it’s easy to assume that DRF serializers convert data to and from these formats. However, this is not exactly the case.

Instead, serializers in DRF work with Python’s native data types. They transform model objects and querysets into a “simplified” representation consisting of strings, integers, booleans, dictionaries, and lists.

This approach allows serializers to remain independent of specific data formats and keep their implementation universal.

TIP: Django REST Framework has special families of classes for converting data to and from intermediary formats. Those are parsers and renderers. We will cover when and how exactly DRF transforms data to and from HTTP-friendly formats like JSON and XML in the last part of this series.
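
To make the split between the two steps concrete, here is a minimal plain-Python sketch. It does not use DRF at all: the `user_as_primitives` dictionary and its values are made up for illustration, and the standard `json` module merely stands in for what a renderer and a parser would do.

```python
import json

# The kind of "simplified" representation a serializer produces:
# nothing but native Python types (dict, list, str, int, bool).
user_as_primitives = {
    "id": 1,
    "email": "wizard@sourcery.blog",
    "is_active": True,
    "groups": ["editors", "admins"],
}

# Encoding into a wire format is a separate step (the renderer's job).
as_json = json.dumps(user_as_primitives)

# Decoding incoming text back into native types is the parser's job.
restored = json.loads(as_json)
```

Because the serializer only ever sees the dictionary, the same class can serve JSON, XML or any other format without changes.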

On top of converting data to native data types, the serializers in DRF have other responsibilities: validating user-submitted data and working with database models.

As you will see further in this article, serializers are essential to the Django REST Framework.

TAKEAWAY: Serializers in DRF are responsible for the following:
1. Converting model instances and querysets into Python's native data types.
2. Validating user-submitted data.
3. Creating and updating database model instances.

How to use DRF Serializers?

If you have ever worked with the Form class in Django, you will find DRF serializers familiar. In fact, their architecture is based on Django forms.

Let’s study how serializers work by building a simple API endpoint that allows users to register an account.

To define what kind of data our endpoint will accept and return, we will need to create a serializer class with respective fields. Create a new file called serializers.py and add the following code to it:

# We import the entire module to have easy access to multiple things
from rest_framework import serializers

# Note how we inherit from the Serializer class
class UserRegistrationSerializer(serializers.Serializer):
    # The fields here define what data our endpoint will work with
    email = serializers.EmailField(required=True)
    password = serializers.CharField(required=True)

Here we declared a couple of attributes on our class and assigned Field instances to them.

Note that we used different Field classes for each field. For example, for the email field, we used the EmailField class. As the name suggests, this field is designed to work specifically with email addresses and has built-in validation to determine whether a given string is an email address.

TIP: You can find the full list of available serializer fields in the official documentation.

We also specified the required=True parameter on our fields. Serializer fields are required by default, so spelling it out only makes the intent explicit: the user must send those fields, otherwise the endpoint will return an error.

When inspected closer, the code snippet above reveals an interesting fact – some part of the validation is done by the Field classes directly. This means that the Serializer class is not responsible for all validation, but delegates some of it to the fields.

But let’s continue. Our serializer by itself is not very useful. We need a way to forward user-submitted data to it. This is what APIViews are responsible for as described in the previous part of this series.

Let’s declare one. Open or create views.py and add the following lines to it:

from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework.permissions import AllowAny
# We import our serializer here
from .serializers import UserRegistrationSerializer

class UserRegistrationAPIView(APIView):
    # Note: we have to specify the following policy to allow 
    # anonymous users to call this endpoint
    permission_classes = [AllowAny]

    def post(self, request, format=None):
        # Pass user-submitted data to the serializer
        serializer = UserRegistrationSerializer(data=request.data)

        # Next, we trigger validation with `raise_exception=True`
        # which will abort the request and return user-friendly
        # error messages if the validation fails
        serializer.is_valid(raise_exception=True)

        # For now we skip any interactions with the database 
        # and simply show the validated data back to the user
        return Response(serializer.data)

Lastly, we are going to register our endpoint in the urls.py file. Here is how we do it:

from django.urls import path
# Import our view class here
from .views import UserRegistrationAPIView

urlpatterns = [
    # ...other routes are declared here  

    path('api/users/register/',
         UserRegistrationAPIView.as_view(),
         name='user-register'),
]

After restarting the Django server, we can open our endpoint in the Browsable API and submit a test request with the following data:

{
    "email": "wizard@sourcery.blog",
    "password": "TopSecret!"
}

If all is well, you should get a response with a 200 OK status code and the same data that we submitted.

But not all requests will contain valid data. Let’s now look into the validation process.

How does the validation work?

DRF provides a lot of flexibility when it comes to validating user-submitted data. There are three main layers of validation in DRF:

  1. Per-field validation run by the Field classes, which we briefly touched upon earlier.
  2. Per-field validation run by the Serializer class – more on that soon.
  3. Per-request validation run by the Serializer class. This one allows us to check multiple fields at once.

Let’s review each layer individually.

Per-field validation run by Field classes

Try sending an empty request to our endpoint via the Browsable API. You should see error messages stating that our email and password fields are mandatory:

{
    "email": [
        "This field is required."
    ],
    "password": [
        "This field is required."
    ]
}

Similarly, if you try to submit a faulty email address, you will get an error message detailing that too.

This is nice, but sometimes you want to add a custom validation rule that DRF doesn’t provide out of the box. For example, you might need to validate a phone number.

In this case you can create a custom validator function and pass it via the validators=[] parameter when declaring a field. Here’s how it would look:

class UserRegistrationSerializer(serializers.Serializer):
    # ...email and password fields are declared here
    phone = serializers.CharField(validators=[my_phone_validator_func])

The beauty of this approach is that you can reuse your validator function in other serializers as many times as you want.

We will leave out the implementation of the phone validator to keep this article concise. You can find more information on validators in the official documentation.

Per-field validation run by Serializer class

In some situations, you might want to have a field-specific validation on just one of your endpoints. For example, our user registration endpoint needs to check whether a user account with a submitted email already exists. Creating a custom validator for this might be overkill.

Luckily, we can declare a method on our serializer class that starts with validate_ followed by the name of the field. For example validate_email. DRF will pick up that method automatically and run it as part of validation.

Such methods always accept two parameters: self and value. The latter is the field value submitted during the request.

Here is what the implementation of our custom validation method for the email field looks like:

# Here we use a special method instead of importing the `User` model
# directly. This will ensure that our code works even if our project 
# were to switch to a custom user model.
from django.contrib.auth import get_user_model

User = get_user_model() # Get reference to the user model here

class UserRegistrationSerializer(serializers.Serializer):
    # ...fields are declared here, minus the phone field

    def validate_email(self, value):
        # It's safe to assume that the value will never
        # be `None`, because our field is required
        if User.objects.filter(email=value).exists():
            raise serializers.ValidationError(
                'A user with this email address already exists.')

        # Note: it's important to return the value at the end of this method
        return value

DRF is smart enough to know that the raised error relates to the email field, therefore the error message will be properly namespaced:

{
    "email": [
        "A user with this email address already exists."
    ]
}

Per-request validation run by Serializer class

Sometimes you need to check multiple fields at once to determine whether a request is valid.

For that, DRF lets us declare a method called validate on our serializer class. This method will always accept two parameters: self and attrs. The latter will contain all submitted (and pre-validated by per-field validators) data.

Let’s modify our user registration serializer to allow for a new scenario. We will let our users sign up with either an email or a username. They can optionally provide both, but at least one must always be provided.

The following snippets show the changes we’re making. First of all, we change the accepted fields:

class UserRegistrationSerializer(serializers.Serializer):
    # Fields are required by default, so we opt out explicitly
    email = serializers.EmailField(required=False) # Now optional
    username = serializers.CharField(required=False) # Added a new field
    # ...password field stays the same

Next, we will need to adjust our validate_email method since the email is no longer required:

    # ...fields are declared here

    def validate_email(self, value):
        # DRF skips this method when an optional field is omitted,
        # so this guard only matters if the field ever receives an
        # explicit `null` (for example, with `allow_null=True`)
        if value is None:
            return value

        # ...the rest of logic goes here

It’s also not a bad idea to add a custom validation method to check whether a user with the given username already exists:

    # ... `validate_email` declaration goes here

    def validate_username(self, value):
        # Skip validation if no value provided
        if value is None:
            return value

        if User.objects.filter(username=value).exists():
            raise serializers.ValidationError(
                'A user with this username already exists.')

        return value

NOTE: In a real-world Django application, the username field of the User model has a unique constraint. This means that the uniqueness of the username is enforced on the database level. Furthermore, you'd be more likely to use a ModelSerializer class, which would pick up this constraint and perform necessary validation automatically. We will come back to this later in the article.

Finally, let’s add our validate method to ensure that either an email or a username is provided.

It’s important to keep in mind that DRF has no way to determine which field we’re raising an error for when we do it inside the validate method. If we raise the error with a plain string, DRF reports it under the non-field errors key. To control the key explicitly, we raise the error with a dictionary argument, where the key is the field name or the non-field errors key, and the value is the error message.

# Note: we import `api_settings` for the non-field errors key
from rest_framework.settings import api_settings
# ...other imports go here

# ...serializer class declaration goes here

    def validate(self, attrs):
        # Here we don't need to check whether a user with the given
        # email or username exists, as this would have already
        # been done by one of our `validate_...` methods
        email, username = attrs.get('email', None), attrs.get('username', None)
        if email is None and username is None:
            # Here is how we raise an error with a dict value
            raise serializers.ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY: 
                    'Either an email or a username must be provided.'
            })

        # If we reached this line, then at least one field was provided.
        # Since username is a non-nullable model field, we use the email
        # as a value for it, and vice versa.
        if username is None:
            attrs['username'] = email
        if email is None:
            attrs['email'] = username

        # Note: it's important to return attrs at the end of this method
        return attrs

It’s also worth noting that when any per-field validation fails, the per-request validation will not run at all. This applies both to the validation run by Field classes and to the validate_<field_name> methods.

Creating model instances with Serializer

Now that we have our validation in place, we can move to the last step – creating the user account.

Serializers in DRF provide a special method for that purpose – create() – which we can override to specify the desired behaviour. It accepts two parameters: self and validated_data. As the name suggests, the latter will contain the data after the validation has been performed on it.

This method will only be run if the validation succeeded.

# ...our serializer declaration goes here

    def create(self, validated_data):
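        # Note: it's important to return the created instance here
        # as it will be used by `serializer.data` in our view.
        # We call `create_user()` rather than `create()` so that the
        # password is hashed instead of being stored as plain text.
        # This assumes the default user manager; a custom user model
        # may expose a different signature.
        return User.objects.create_user(**validated_data)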

Lastly, we need to slightly modify our view to accommodate the new logic. We will call the save() method on our serializer. This is a special method used for saving changes to the database. It is smart enough to know that we want to create a new instance, therefore it will call our create() method under the hood.

Make the following changes inside the views.py file:

# ...imports go here
# Note: we import HTTP status codes here
from rest_framework import status

class UserRegistrationAPIView(APIView):
    # ...policy declaration goes here

    def post(self, request, format=None):
        # ...serializer instantiated and is_valid() called here

        # Here we call our save() method
        serializer.save()

        # Let's update the response code to 201 to follow the standards
        return Response(serializer.data, status=status.HTTP_201_CREATED)

Bingo! Our endpoint should now not only validate data but also add new users to the database.

Updating model instances with Serializer

While not entirely relevant to our current scenario, it’s worth mentioning another method – update(). It is similar to create() but is used for updating existing records. It accepts three parameters: self, instance and validated_data, where the instance is a model instance.

With this method, we can expand our serializer to be more generic, so that it can be used, for example, for profile endpoints where the user can see and edit their details.

Let’s rename our serializer to just UserSerializer to make it sound more generic, and add the update() method to it:

# Note: we are renaming our serializer to `UserSerializer`
class UserSerializer(serializers.Serializer):
    # ...fields declaration, validation methods and create() go here

    def update(self, instance, validated_data):
        for field in validated_data:
            # This is a pythonic one-liner to update fields on the instance
            setattr(instance, field, validated_data[field])

        # Don't forget to save the instance to the database
        instance.save()
        # Like in `create()`, we must return the instance at the end
        return instance

We will also need to make a few changes to the views.py file. Firstly, we have to update the reference to our serializer class (since it’s been renamed).

# ...imports go here
# Update the name of our serializer
from .serializers import UserSerializer

class UserRegistrationAPIView(APIView):
    # ...policy declaration goes here

    def post(self, request, format=None):
        # Also update the serializer name during instantiation
        serializer = UserSerializer(data=request.data)
        # ...the rest of post() method goes here

And now let’s add the aforementioned profile endpoints. They will rely on a user being logged in. We will skip the implementation of the login endpoint to keep this article concise. You can look up the code for that in the series’ project repository on GitHub (link coming soon).

# ...user registration endpoint declaration goes here

class UserProfileAPIView(APIView):

    def get(self, request, format=None):
        serializer = UserSerializer(instance=request.user)
        return Response(serializer.data)

    def put(self, request, format=None):
        # Note how we pass the `instance` this time
        serializer = UserSerializer(instance=request.user, data=request.data)
        serializer.is_valid(raise_exception=True) # Validation

        # Note: we use the same `save()` method we used in the `post()` method
        # of the user registration view to create a new record. The `save()`
        # method is able to determine that this time we want to update an
        # existing record, because we passed the `instance` during the
        # serializer instantiation above
        serializer.save()

        return Response(serializer.data)

Now let’s see how we can make use of DRF’s tools to make our code shorter and simpler.

What is ModelSerializer in DRF?

We had to write a decent amount of code to enable our serializer to work with the user model. Of course, any real-world project would have many more models than just users. Imagine writing all that code for each model every time!

Luckily, DRF comes with the ModelSerializer class to save us the effort. Under the hood, it behaves very similarly to what we manually programmed so far. But it requires significantly less code to set one up.

Let’s modify our serializer class to make use of the ModelSerializer’s functionality. First of all, we need to update the inheritance:

class UserSerializer(serializers.ModelSerializer):
    # ...the rest of implementation goes here

Next, we need to redeclare our fields. This time, we don’t need to specify fields one by one. Instead, we will declare a nested class called Meta inside of our serializer. Inside the Meta class we can specify the model our serializer will work with, and which fields of that model it should handle. Here is how it’s done:

# ...imports go here

User = get_user_model()

class UserSerializer(serializers.ModelSerializer):
    # Note: we deleted all manual field declarations

    class Meta:
        model = User
        fields = ('email', 'username', 'password')

The ModelSerializer is intelligent enough to declare these fields automatically and connect them to the actual model fields when creating or updating model instances. Furthermore, it can even determine whether the field is required and what kind of validation it requires.

NOTE: Model fields declared without null=True, blank=True or a default value will be treated as required by the ModelSerializer. It will also inherit validators=[...] from the model field, including the custom ones you added manually.

If we wanted our serializer to accept all model fields, we could use the following special keyword instead:

class UserSerializer(serializers.ModelSerializer):

    class Meta:
        model = User
        fields = '__all__'

By default, the username field would be set as required, since it’s declared non-nullable on the model. To allow users to sign up with either an email or a username, we need to mark the username field as optional. It’s very easy to do with the extra_kwargs={...} attribute of the Meta class.

class UserSerializer(serializers.ModelSerializer):

    class Meta:
        # ...model and fields are specified here
        extra_kwargs = {
            'username': { 'required': False },
        }

We no longer need to manually validate whether a user with the given username exists, as this will be checked automatically due to the unique=True constraint on the model field. Hence we can safely delete our validate_username method.

However, the default error message for that will read something like “This field must be unique.” (the exact wording depends on the model field and the DRF version). This is OK, but we would like to keep our original error message.

Luckily, it’s easily customizable. We need to add the error_messages key to our username dictionary under extra_kwargs. The value must be a dictionary, where the key is the error code (in our case “unique”) and the value is the error message.

Here’s what the customized error message looks like:

class UserSerializer(serializers.ModelSerializer):

    class Meta:
        # ...model and fields are declared here
        extra_kwargs = {
            'username': {
                'required': False,
                'error_messages': {
                    'unique': 'A user with this username already exists.'
                }
            },
        }

We still want to keep the validate_email method because the email field doesn’t have a unique constraint.

Our validate method can also stay the same since it checks whether at least one of the required fields is provided.

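Lastly, we can remove the create() and update() methods, because ModelSerializer comes with built-in logic for those. Bear in mind that the built-in create() saves the password exactly as submitted, so a production registration endpoint would normally keep a create() override that hashes it (for example, via create_user()).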

Our complete user serializer class now looks like this:

from django.contrib.auth import get_user_model
from rest_framework.settings import api_settings
from rest_framework import serializers

User = get_user_model() # Get reference to the model

class UserSerializer(serializers.ModelSerializer):

    class Meta:
        model = User
        fields = ('email', 'username', 'password')
        extra_kwargs = {
            'username': {
                'required': False,
                'error_messages': {
                    'unique': 'A user with this username already exists.'
                }
            },
        }

    def validate_email(self, value):
        # Skip validation if no value provided
        if value is None:
            return value

        if User.objects.filter(email=value).exists():
            raise serializers.ValidationError(
                'A user with this email address already exists.')

        # Note: it's important to return the value at the end of this method
        return value

    def validate(self, attrs):
        # Here we don't need to check whether a user with the given
        # email or username exists, as this would have already
        # been done by one of our `validate_...` methods
        email, username = attrs.get('email', None), attrs.get('username', None)
        if email is None and username is None:
            # Here is how we raise an error with a dict value
            raise serializers.ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY: 
                    'Either an email or a username must be provided.'
            })

        # If we reached this line, then at least one field was provided.
        # Since username is a non-nullable model field, we use the email
        # as a value for it, and vice versa.
        if username is None:
            attrs['username'] = email
        if email is None:
            attrs['email'] = username

        # Note: it's important to return attrs at the end of this method
        return attrs

Pretty neat, isn’t it? And we don’t need to make any changes to the views.py file.

Conclusion

We’ve reviewed how the Serializer class works in Django REST Framework and how to make it work with a model. While it’s useful to know what’s happening under the hood, we wouldn’t want to write so much code every time we create a serializer. At the very least it’s not DRY.

For that matter, we also studied the ModelSerializer class which saves us a ton of effort when working with database models. We even looked into how to customize specific properties of fields without redeclaring them.

I hope that you found this article useful. I’d love to hear from you in the comments!
