This article takes a close look at serializers in the Django REST Framework and explains how they work. It’s the second part of the comprehensive guides series. Here is a quick summary of the whole series:
- The first part covers the APIView class, the foundation of API endpoints in the Django REST Framework.
- The second part talks about the Serializer and ModelSerializer classes, and explains their purpose and use.
- The third part covers GenericAPIView and its sibling classes designed for working with database models. These are the classes you’re going to use most commonly.
- The fourth part talks about the ViewSet class and its subclasses, which allow you to create an entire CRUD set of endpoints with just a few lines of code.
- The fifth part reverse-engineers the internal mechanics of Django REST Framework and explains what exactly happens during a request.
Pre-requisites and assumptions
- You have a basic understanding of RESTful APIs.
- You have a basic knowledge of the Django framework.
- You have Python 3.6+ installed on your system.
- You have pip installed and ready to use. Optionally, you have virtualenv, venv or pipenv (my favourite) on your system, and you run all commands within a virtual environment (it’s good practice).
- You’re continuing the project from the previous part of the series or you created a new empty Django project (feel free to use this cheat sheet as help).
What is Serializer in DRF?
Let’s begin with establishing the concept of serialization in programming. Here is what Wikipedia tells us about it:
In computing, serialization is the process of translating a data structure or object state into a format that can be stored (for example, in a file or memory data buffer) or transmitted (for example, over a computer network) and reconstructed later (possibly in a different computer environment).
To put it simply, to serialize data means to convert it into an intermediary format that can be easily understood by other programs and reconstructed into its original form.
In the context of the web, the most commonly used intermediary data formats are JSON and XML. So it’s easy to assume that DRF serializers convert data to and from these formats. However, this is not exactly the case.
Instead, serializers in DRF work with Python’s native data types. They transform model objects and querysets into a “simplified” representation consisting of strings, integers, booleans, dictionaries, and lists.
This approach allows serializers to remain independent of specific data formats and keep their implementation universal.
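To make this split concrete, here is a minimal standard-library sketch. It is not DRF code: the Account class and the to_primitives and render_json helpers are made up for illustration. The first helper plays the role of a serializer, the second plays the role of a renderer.

```python
import json
from dataclasses import dataclass
from datetime import date


@dataclass
class Account:
    """A made-up stand-in for a model instance."""
    email: str
    is_active: bool
    joined: date


def to_primitives(account: Account) -> dict:
    # The "serializer" step: reduce the object to plain Python types
    # (str, bool, dict) without committing to any wire format.
    return {
        "email": account.email,
        "is_active": account.is_active,
        "joined": account.joined.isoformat(),
    }


def render_json(data: dict) -> str:
    # The "renderer" step: only here do we pick a concrete format.
    return json.dumps(data, sort_keys=True)


account = Account("wizard@sourcery.blog", True, date(2021, 5, 1))
primitives = to_primitives(account)
rendered = render_json(primitives)
```

Because to_primitives knows nothing about JSON, the same dictionary could just as well be handed to an XML or YAML writer.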
TIP: Django REST Framework has special families of classes for converting data to and from intermediary formats. Those are parsers and renderers. We will cover when and how exactly DRF transforms data to and from HTTP-friendly formats like JSON and XML in the last part of this series.
On top of converting data to native data types, the serializers in DRF have other responsibilities: validating user-submitted data and working with database models.
As you will see further in this article, serializers are essential to the Django REST Framework.
TAKEAWAY: Serializers in DRF are responsible for the following:
1. Converting model instances and querysets into Python's native data types.
2. Validating user-submitted data.
3. Creating and updating database model instances.
How to use DRF Serializers?
If you have ever worked with the Form class in Django, you will find DRF serializers familiar. In fact, their design closely mirrors that of Django forms.
Let’s study how serializers work by building a simple API endpoint that allows users to register an account.
To define what kind of data our endpoint will accept and return, we will need to create a serializer class with the respective fields. Create a new file called serializers.py and add the following code to it:
# We import the entire module to have easy access to multiple things
from rest_framework import serializers
# Note how we inherit from the Serializer class
class UserRegistrationSerializer(serializers.Serializer):
    # The fields here define what data our endpoint will work with
    email = serializers.EmailField(required=True)
    password = serializers.CharField(required=True)
Here we declared a couple of class attributes and assigned Field instances to them.
Note that we used a different Field class for each field. For example, for the email field, we used the EmailField class. As the name suggests, this field is designed to work specifically with email addresses and has built-in validation to determine whether a given string is an email address.
TIP: You can find the full list of available serializer fields in the official documentation.
We also specified the required=True parameter on our fields. This means that the user must send those fields; otherwise, the endpoint will return an error. Serializer fields are required by default, so the parameter is spelled out here only for clarity.
On closer inspection, the code snippet above reveals an interesting fact: part of the validation is done by the Field classes directly. This means that the Serializer class is not responsible for all validation, but delegates some of it to the fields.
But let’s continue. Our serializer by itself is not very useful. We need a way to forward user-submitted data to it. This is what APIViews are responsible for as described in the previous part of this series.
Let’s declare one. Open or create views.py and add the following lines to it:
from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework.permissions import AllowAny
# We import our serializer here
from .serializers import UserRegistrationSerializer
class UserRegistrationAPIView(APIView):
    # Note: we have to specify the following policy to allow
    # anonymous users to call this endpoint
    permission_classes = [AllowAny]
    def post(self, request, format=None):
        # Pass user-submitted data to the serializer
        serializer = UserRegistrationSerializer(data=request.data)
        # Next, we trigger validation with `raise_exception=True`
        # which will abort the request and return user-friendly
        # error messages if the validation fails
        serializer.is_valid(raise_exception=True)
        # For now we skip any interactions with the database
        # and simply show the validated data back to the user
        return Response(serializer.data)
Lastly, we are going to register our endpoint in the urls.py file. Here is how we do it:
from django.urls import path
# Import our view class here
from .views import UserRegistrationAPIView
urlpatterns = [
    # ...other routes are declared here
    path('api/users/register/',
         UserRegistrationAPIView.as_view(),
         name='user-register'),
]
After restarting the Django server, we can open our endpoint in the Browsable API and submit a test request with the following data:
{
    "email": "wizard@sourcery.blog",
    "password": "TopSecret!"
}
If all is well, you should get a response with the 200 OK status code and the same data that we submitted.
But not all requests will contain valid data. Let’s now look into the validation process.
How does the validation work?
DRF provides a lot of flexibility when it comes to validating user-submitted data. There are three main layers of validation in DRF:
- Per-field validation run by the Field classes, which we briefly touched upon earlier.
- Per-field validation run by the Serializer class (more on that soon).
- Per-request validation run by the Serializer class. This one allows us to check multiple fields at once.
Let’s review each layer individually.
Per-field validation run by Field classes
Try sending an empty request to our endpoint via the Browsable API. You should see error messages stating that our email and password fields are mandatory:
{
    "email": [
        "This field is required."
    ],
    "password": [
        "This field is required."
    ]
}
Similarly, if you try to submit a faulty email address, you will get an error message detailing that too.
This is nice, but sometimes you want to add a custom validation rule that DRF doesn’t provide out of the box. For example, you might need to validate a phone number.
In this case, you can create a custom validator function and pass it via the validators=[] parameter when declaring a field. Here’s how it would look:
class UserRegistrationSerializer(serializers.Serializer):
    # ...email and password fields are declared here
    phone = serializers.CharField(validators=[my_phone_validator_func])
The beauty of this approach is that you can reuse your validator function in other serializers as many times as you want.
We will leave out the implementation of the phone validator to keep this article concise. You can find more information on validators in the official documentation.
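For the curious, here is a rough sketch of the kind of check such a validator might run. The rule itself (an optional leading plus sign followed by 7 to 15 digits) and the helper name looks_like_phone_number are assumptions made up for this example, not something DRF provides. A DRF validator is simply a callable that receives the value and raises serializers.ValidationError when the value is unacceptable, so my_phone_validator_func could call this helper and raise when it returns False.

```python
import re

# Hypothetical rule for illustration: an optional leading "+",
# followed by 7 to 15 digits once common separators are stripped.
_PHONE_RE = re.compile(r"\+?[0-9]{7,15}")
_SEPARATORS_RE = re.compile(r"[\s\-().]")


def looks_like_phone_number(value: str) -> bool:
    # Remove spaces, dashes, dots and parentheses before matching.
    cleaned = _SEPARATORS_RE.sub("", value)
    return _PHONE_RE.fullmatch(cleaned) is not None
```

Real-world phone validation is considerably messier than this, so treat the pattern as a placeholder for whatever rule your project needs.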
Per-field validation run by Serializer class
In some situations, you might want to have a field-specific validation just on one of your endpoints. For example, our user registration endpoint needs to check whether a user account with a submitted email already exists. Creating a custom validator for this might be overkill.
Luckily, we can declare a method on our serializer class whose name starts with validate_ followed by the name of the field, for example validate_email. DRF will pick up that method automatically and run it as part of validation.
Such methods always accept two parameters: self and value. The latter is the submitted field value, after it has passed the field’s own validation.
Here is what the implementation of our custom validation method for the email field looks like:
# Here we use a special function instead of importing the `User` model
# directly. This will ensure that our code works even if our project
# were to switch to a custom user model.
from django.contrib.auth import get_user_model
User = get_user_model()  # Get reference to the user model here
class UserRegistrationSerializer(serializers.Serializer):
    # ...fields are declared here, minus the phone field
    def validate_email(self, value):
        # It's safe to assume that the value will never
        # be `None`, because our field is required
        if User.objects.filter(email=value).exists():
            raise serializers.ValidationError(
                'A user with this email address already exists.')
        # Note: it's important to return the value at the end of this method
        return value
DRF is smart enough to know that the raised error relates to the email field, therefore the error message will be properly namespaced:
{
    "email": [
        "A user with this email address already exists."
    ]
}
Per-request validation run by Serializer class
Sometimes you need to check multiple fields at once to determine whether a request is valid.
For that, DRF lets us declare a method called validate on our serializer class. This method always accepts two parameters: self and attrs. The latter contains all submitted data that has already passed the per-field validation.
Let’s modify our user registration serializer to allow for a new scenario. We will let our users sign up with either an email or a username. They can optionally provide both, but at least one must always be provided.
The following snippets show the changes we’re making. First of all, we change the accepted fields:
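class UserRegistrationSerializer(serializers.Serializer):
    # Both fields are now optional. Serializer fields are required by
    # default, so we have to opt out explicitly. `allow_null=True`
    # additionally lets clients send an explicit `null`
    email = serializers.EmailField(required=False, allow_null=True)
    username = serializers.CharField(required=False, allow_null=True)  # Added a new field
    # ...password field stays the same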
Next, we will need to adjust our validate_email method since the email is no longer required:
    # ...fields are declared here
    def validate_email(self, value):
        # We can no longer assume that our value is never `None`.
        # If it is, then we simply skip the validation, assuming
        # the username was provided instead
        if value is None:
            return value
        # ...the rest of logic goes here
It’s also not a bad idea to add a custom validation method to check whether a user with the given username already exists:
    # ... `validate_email` declaration goes here
    def validate_username(self, value):
        # Skip validation if no value provided
        if value is None:
            return value
        if User.objects.filter(username=value).exists():
            raise serializers.ValidationError(
                'A user with this username already exists.')
        return value
NOTE: In a real-world Django application, the username field of the User model has a unique constraint. This means that the uniqueness of the username is enforced on the database level. Furthermore, you'd be more likely to use a ModelSerializer class, which would pick up this constraint and perform necessary validation automatically. We will come back to this later in the article.
Finally, let’s add our validate method to ensure that either an email or a username is provided.
Keep in mind that DRF has no way to determine which field we’re raising an error for when we do it inside the validate method. If we raise the error with a plain message, DRF files it under the non-field errors key. To be explicit, or to attach the error to a specific field, we raise it with a dictionary argument, where the key is the field name or the non-field errors key, and the value is the error message.
# Note: we import `api_settings` for the non-field errors key
from rest_framework.settings import api_settings
# ...other imports go here
# ...serializer class declaration goes here
    def validate(self, attrs):
        # Here we don't need to check whether a user with the given
        # email or username exists, as this would have already
        # been done by one of our `validate_...` methods
        email, username = attrs.get('email', None), attrs.get('username', None)
        if email is None and username is None:
            # Here is how we raise an error with a dict value
            raise serializers.ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY:
                    'Either an email or a username must be provided.'
            })
        # If we reached this line, then at least one field was provided.
        # Since username is a non-nullable model field, we use the email
        # as a value for it, and vice versa.
        if username is None:
            attrs['username'] = email
        if email is None:
            attrs['email'] = username
        # Note: it's important to return attrs at the end of this method
        return attrs
It’s also worth noting that when any per-field validation fails, the per-request validation will not be run at all. This applies both to the validation run by the Field classes and to the validate_<field_name> methods.
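The order of the three layers can be pictured with a small standard-library sketch. This is a simplified model written for this article, not DRF's actual implementation: FieldError, require_at_sign and MiniSerializer are all made-up names, and the sketch assumes every declared field is present in the submitted data.

```python
class FieldError(Exception):
    """Stand-in for a validation error in this sketch."""


def require_at_sign(value):
    # Layer 1: a field-level check, similar in spirit to EmailField's.
    if "@" not in value:
        raise FieldError("Enter a valid email address.")
    return value


class MiniSerializer:
    # Maps field names to the field-level checks that run first.
    fields = {"email": [require_at_sign]}

    def __init__(self, data):
        self.data = data
        self.calls = []  # records which serializer-level layers ran

    def validate_email(self, value):
        # Layer 2: looked up by name, like validate_<field_name>.
        self.calls.append("validate_email")
        if value.endswith("@example.com"):
            raise FieldError("This domain is not allowed.")
        return value

    def validate(self, attrs):
        # Layer 3: sees every validated field at once.
        self.calls.append("validate")
        return attrs

    def run(self):
        attrs, errors = {}, {}
        for name, checks in self.fields.items():
            try:
                value = self.data[name]
                for check in checks:
                    value = check(value)
                method = getattr(self, "validate_" + name, None)
                if method is not None:
                    value = method(value)
                attrs[name] = value
            except FieldError as exc:
                # Errors are keyed by the field that produced them.
                errors[name] = [str(exc)]
        if errors:
            # Any per-field failure stops here; validate() never runs.
            return None, errors
        return self.validate(attrs), {}


bad = MiniSerializer({"email": "wizard-at-sourcery"})
bad_result = bad.run()
good = MiniSerializer({"email": "wizard@sourcery.blog"})
good_result = good.run()
```

For the bad input, neither validate_email nor validate is called, because the field-level check fails first. For the good input, both run, in that order.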
Creating model instances with Serializer
Now that we have our validation in place, we can move to the last step – creating the user account.
Serializers in DRF provide a special method for that purpose, create(), which we can override to specify the desired behaviour. It accepts two parameters: self and validated_data. As the name suggests, the latter contains the data after validation has been performed on it.
This method will only run if the validation succeeded.
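    # ...our serializer declaration goes here
    def create(self, validated_data):
        # `create_user()` hashes the password before saving, whereas a
        # plain `create()` would store it exactly as submitted. This
        # assumes the user model's manager provides `create_user()`,
        # as Django's default one does.
        # Note: it's important to return the created instance here
        # as it will be used by `serializer.data` in our view
        return User.objects.create_user(**validated_data)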
Lastly, we need to slightly modify our view to accommodate the new logic. We will call the save() method on our serializer. This is a special method used for saving changes to the database. It is smart enough to know that we want to create a new instance, therefore it will call our create() method under the hood.
Make the following changes inside the views.py file:
# ...imports go here
# Note: we import HTTP status codes here
from rest_framework import status
class UserRegistrationAPIView(APIView):
    # ...policy declaration goes here
    def post(self, request, format=None):
        # ...serializer instantiated and is_valid() called here
        # Here we call our save() method
        serializer.save()
        # Let's update the response code to 201 to follow the standards
        return Response(serializer.data, status=status.HTTP_201_CREATED)
Bingo! Our endpoint should now not only validate data but also add new users to the database.
Updating model instances with Serializer
While not entirely relevant to our current scenario, it’s worth mentioning another method: update(). It is similar to create() but is used for updating existing records. It accepts three parameters: self, instance and validated_data, where instance is the model instance being updated.
With this method, we can expand our serializer to be more generic, so that it can be used, for example, for profile endpoints where the user can see and edit their details.
Let’s rename our serializer to just UserSerializer to make it sound more generic, and add the update() method to it:
# Note: we are renaming our serializer to `UserSerializer`
class UserSerializer(serializers.Serializer):
    # ...fields declaration, validation methods and create() go here
    def update(self, instance, validated_data):
        # The password needs hashing, so we handle it separately
        password = validated_data.pop('password', None)
        for field, value in validated_data.items():
            # Copy every other submitted value onto the instance
            setattr(instance, field, value)
        if password is not None:
            instance.set_password(password)
        # Don't forget to save the instance to the database
        instance.save()
        # Like in `create()`, we must return the instance at the end
        return instance
We will also need to make a few changes to the views.py file. Firstly, we have to update the reference to our serializer class (since it’s been renamed).
# ...imports go here
# Update the name of our serializer
from .serializers import UserSerializer
class UserRegistrationAPIView(APIView):
    # ...policy declaration goes here
    def post(self, request, format=None):
        # Also update the serializer name during instantiation
        serializer = UserSerializer(data=request.data)
        # ...the rest of post() method goes here
And now let’s add the aforementioned profile endpoints. They will rely on a user being logged in. We will skip the implementation of the login endpoint to keep this article concise. You can look up the code for that in the series’ project repository on GitHub (link coming soon).
# ...user registration endpoint declaration goes here
# Note: this view also needs the following import at the top of the file:
# from rest_framework.permissions import IsAuthenticated
class UserProfileAPIView(APIView):
    # Only logged-in users have a profile to view or edit
    permission_classes = [IsAuthenticated]
    def get(self, request, format=None):
        serializer = UserSerializer(instance=request.user)
        return Response(serializer.data)
    def put(self, request, format=None):
        # Note how we pass the `instance` this time
        serializer = UserSerializer(instance=request.user, data=request.data)
        serializer.is_valid(raise_exception=True)  # Validation
        # Note: we use the same `save()` method we used in the `post()` method
        # of the user registration view to create a new record. The `save()`
        # method is able to determine that this time we want to update an
        # existing record, because we passed the `instance` during the
        # serializer instantiation above
        serializer.save()
        return Response(serializer.data)
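The decision that save() makes between create() and update() can be pictured with a few lines of plain Python. This is a simplified sketch, not DRF's source: MiniSaveMixin and DictUserSerializer are made-up names, and plain dictionaries stand in for model instances. The real save() additionally insists that is_valid() has been called first.

```python
class MiniSaveMixin:
    """Hypothetical sketch of the create-or-update decision in save()."""

    def __init__(self, instance=None, validated_data=None):
        self.instance = instance
        self.validated_data = validated_data or {}

    def save(self):
        if self.instance is None:
            # No instance was passed in, so this must be a new record.
            self.instance = self.create(self.validated_data)
        else:
            # An instance was passed in, so we update it in place.
            self.instance = self.update(self.instance, self.validated_data)
        return self.instance


class DictUserSerializer(MiniSaveMixin):
    # Plain dicts stand in for model instances here.
    def create(self, validated_data):
        return dict(validated_data)

    def update(self, instance, validated_data):
        instance.update(validated_data)
        return instance


created = DictUserSerializer(validated_data={"email": "a@b.co"}).save()
existing = {"email": "old@b.co", "username": "wizard"}
updated = DictUserSerializer(
    instance=existing, validated_data={"email": "new@b.co"}
).save()
```

The view code never has to choose between the two methods; passing or omitting the instance at construction time is what makes the choice.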
Now let’s see how we can make use of DRF’s tools to make our code shorter and simpler.
What is ModelSerializer in DRF?
We had to write a decent amount of code to enable our serializer to work with the user model. Of course, any real-world project would have many more models than just users. Imagine writing all that code for each model every time!
Luckily, DRF comes with the ModelSerializer class to save us the effort. Under the hood, it behaves very similarly to what we have programmed manually so far, but it requires significantly less code to set up.
Let’s modify our serializer class to make use of the ModelSerializer’s functionality. First of all, we need to update the inheritance:
class UserSerializer(serializers.ModelSerializer):
    # ...the rest of implementation goes here
Next, we need to redeclare our fields. This time, we don’t need to specify fields one by one. Instead, we will declare a nested class called Meta inside of our serializer. Inside the Meta class we can specify the model our serializer will work with, and which fields of that model it should handle. Here is how it’s done:
# ...imports go here
User = get_user_model()
class UserSerializer(serializers.ModelSerializer):
    # Note: we deleted all manual field declarations
    class Meta:
        model = User
        fields = ('email', 'username', 'password')
The ModelSerializer is intelligent enough to declare these fields automatically and connect them to the actual model fields when creating or updating model instances. Furthermore, it can even determine whether the field is required and what kind of validation it requires.
NOTE: Model fields declared without null=True, blank=True or a default value will be treated as required by the ModelSerializer. It will also inherit validators=[...] from the model field, including the custom ones you added manually.
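The rule for required fields boils down to a single condition, which the following standard-library sketch spells out. It reflects my understanding of how ModelSerializer maps model fields rather than quoting DRF's code: ModelFieldInfo and is_required are made-up names for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelFieldInfo:
    """Made-up summary of the model field options that matter here."""
    null: bool = False
    blank: bool = False
    has_default: bool = False


def is_required(field: ModelFieldInfo) -> bool:
    # A field stops being required as soon as the model can cope
    # without a submitted value: a default, blank=True or null=True.
    return not (field.has_default or field.blank or field.null)


username = ModelFieldInfo()                   # no null, blank or default
email = ModelFieldInfo(blank=True)            # like the stock User.email
is_active = ModelFieldInfo(has_default=True)  # like the stock User.is_active
```

This is why, on Django's stock User model, the username ends up required while the email does not.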
If we wanted our serializer to accept all model fields, we could use the following special keyword instead:
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = '__all__'
By default, the username field would be set as required, since it’s declared non-nullable on the model. To allow users to sign up with either an email or a username, we need to mark the username field optional. It’s very easy to do with the extra_kwargs={...} property of the Meta class.
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        # ...model and fields are specified here
        extra_kwargs = {
            'username': {'required': False},
        }
We no longer need to manually validate whether a user with the given username exists, as this will be checked automatically due to the unique=True constraint on the model field. Hence we can safely delete our validate_username method.
However, the default error message for that will read something like “This field must be unique.” This is OK, but we would like to keep our original error message.
Luckily, it’s easily customizable. We need to add the error_messages key to our username dictionary under extra_kwargs. The value must be a dictionary, where the key is the error code (in our case “unique”) and the value is the error message.
Here’s what the customized error message looks like:
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        # ...model and fields are declared here
        extra_kwargs = {
            'username': {
                'required': False,
                'error_messages': {
                    'unique': 'A user with this username already exists.'
                }
            },
        }
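We still want to keep the validate_email method because the email field doesn’t have a unique constraint.
Our validate method can also stay the same since it checks whether at least one of the required fields is provided.
Lastly, we can remove the create() and update() methods, because ModelSerializer comes with built-in logic for those. One caveat: the built-in versions write validated_data to the model as-is, so the password would be saved without hashing. In a real project you would keep small create() and update() overrides that hash the password.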
Our complete user serializer class now looks like this:
from django.contrib.auth import get_user_model
from rest_framework.settings import api_settings
from rest_framework import serializers
User = get_user_model()  # Get reference to the model
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ('email', 'username', 'password')
        extra_kwargs = {
            'username': {
                'required': False,
                'error_messages': {
                    'unique': 'A user with this username already exists.'
                }
            },
        }
    def validate_email(self, value):
        # Skip validation if no value provided
        if value is None:
            return value
        if User.objects.filter(email=value).exists():
            raise serializers.ValidationError(
                'A user with this email address already exists.')
        # Note: it's important to return the value at the end of this method
        return value
    def validate(self, attrs):
        # Here we don't need to check whether a user with the given
        # email or username exists, as this would have already
        # been done by one of our `validate_...` methods
        email, username = attrs.get('email', None), attrs.get('username', None)
        if email is None and username is None:
            # Here is how we raise an error with a dict value
            raise serializers.ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY:
                    'Either an email or a username must be provided.'
            })
        # If we reached this line, then at least one field was provided.
        # Since username is a non-nullable model field, we use the email
        # as a value for it, and vice versa.
        if username is None:
            attrs['username'] = email
        if email is None:
            attrs['email'] = username
        # Note: it's important to return attrs at the end of this method
        return attrs
Pretty neat, isn’t it? And we don’t need to make any changes to the views.py file.
Conclusion
We’ve reviewed how the Serializer class works in Django REST Framework and how to make it work with a model. While it’s useful to know what’s happening under the hood, we wouldn’t want to write so much code every time we create a serializer. At the very least it’s not DRY.
That is why we also studied the ModelSerializer class, which saves us a ton of effort when working with database models. We even looked into how to customize specific properties of fields without redeclaring them.
I hope that you found this article useful. I’d love to hear from you in the comments!