Django REST Framework Serializer.

Content

This article covers:

Validating data at the field or object level
Customizing the serialization and deserialization output
Passing additional data at save
Passing context to serializers
Renaming serializer output fields
Attaching serializer function responses to data
Fetching data from one-to-one models
Attaching data to the serialized output
Creating separate read and write serializer
Setting read-only fields
Handling nested serialization

Data Validation

DRF enforces data validation during deserialization, which is why you need to call is_valid() before accessing the validated data. If the data is invalid, is_valid() returns False and the errors are stored in the serializer's errors property. A ValidationError is raised only if you call is_valid(raise_exception=True).

There are two types of custom data validators:

Custom field
Object-level

Let's look at an example. Suppose we have a Movie model:

from django.db import models

class Movie(models.Model):
    title = models.CharField(max_length=128)
    description = models.TextField(max_length=2048)
    release_date = models.DateField()
    rating = models.PositiveSmallIntegerField()

    us_gross = models.IntegerField(default=0)
    worldwide_gross = models.IntegerField(default=0)

    def __str__(self):
        return f'{self.title}'

Our model has a title, description, release_date, rating, us_gross and worldwide_gross.

We also have a simple ModelSerializer, which serializes all the fields:

from rest_framework import serializers
from examples.models import Movie


class MovieSerializer(serializers.ModelSerializer):
    class Meta:
        model = Movie
        fields = '__all__'

Let's say the model is only valid if both of these are true:

rating is between 1 and 10
us_gross is not greater than worldwide_gross

We can use custom data validators for this.

Custom field validation

Custom field validation allows us to validate a specific field. We can use it by adding the validate_<field_name> method to our serializer like so:

from rest_framework import serializers
from examples.models import Movie


class MovieSerializer(serializers.ModelSerializer):
    class Meta:
        model = Movie
        fields = '__all__'

    def validate_rating(self, value):
        if value < 1 or value > 10:
            raise serializers.ValidationError('Rating has to be between 1 and 10.')
        return value

Our validate_rating method will make sure the rating always stays between 1 and 10.

Object-level validation

Sometimes you'll have to compare fields with one another in order to validate them. This is when you should use the object-level validation approach.

Example:

from rest_framework import serializers
from examples.models import Movie


class MovieSerializer(serializers.ModelSerializer):
    class Meta:
        model = Movie
        fields = '__all__'

    def validate(self, data):
        if data['us_gross'] > data['worldwide_gross']:
            raise serializers.ValidationError('us_gross cannot be bigger than worldwide_gross')
        return data

The validate method will make sure us_gross is never bigger than worldwide_gross.

Functional validators

If we use the same validator in multiple serializers, we can create a function validator instead of writing the same code over and over again. Let's write a validator that checks if the number is between 1 and 10:

def is_rating(value):
    if value < 1:
        raise serializers.ValidationError('Value cannot be lower than 1.')
    elif value > 10:
        raise serializers.ValidationError('Value cannot be higher than 10')

We can now append it to our MovieSerializer like so:

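from rest_framework import serializers
from examples.models import Movie


class MovieSerializer(serializers.ModelSerializer):
    # is_rating is the function validator defined above
    rating = serializers.IntegerField(validators=[is_rating])

    class Meta:
        model = Movie
        fields = '__all__'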

Custom Outputs

Two of the most useful functions inside the BaseSerializer class that we can override are to_representation() and to_internal_value(). By overriding them, we can change the serialization and deserialization behavior, respectively, to append additional data, extract data, and handle relationships.

to_representation() allows us to change the serialization output
to_internal_value() allows us to change the deserialization output

Suppose you have the following model:

from django.contrib.auth.models import User
from django.db import models


class Resource(models.Model):
    title = models.CharField(max_length=256)
    content = models.TextField()
    liked_by = models.ManyToManyField(to=User)

    def __str__(self):
        return f'{self.title}'

Every resource has a title, content, and liked_by field. liked_by represents the users that liked the resource.

Our serializer is defined like so:

from rest_framework import serializers
from examples.models import Resource


class ResourceSerializer(serializers.ModelSerializer):
    class Meta:
        model = Resource
        fields = '__all__'

If we serialize a resource and access its data property, we'll get the following output:

{
   "id": 1,
   "title": "C++ with examples",
   "content": "This is the resource's content.",
   "liked_by": [
      2,
      3
   ]
}

to_representation()

Now, let's say we want to add a total likes count to the serialized data. The easiest way to achieve this is by implementing the to_representation method in our serializer class:

from rest_framework import serializers
from examples.models import Resource


class ResourceSerializer(serializers.ModelSerializer):
    class Meta:
        model = Resource
        fields = '__all__'

    def to_representation(self, instance):
        representation = super().to_representation(instance)
        representation['likes'] = instance.liked_by.count()

        return representation

This piece of code fetches the current representation, appends likes to it, and returns it.

If we serialize the resource again, we'll get the following result:

{
   "id": 1,
   "title": "C++ with examples",
   "content": "This is the resource's content.",
   "liked_by": [
      2,
      3
   ],
   "likes": 2
}

to_internal_value()

Suppose the services that use our API append unnecessary data to the payload when creating resources:

{
   "info": {
       "extra": "data",
       ...
   },
   "resource": {
      "id": 1,
      "title": "C++ with examples",
      "content": "This is the resource's content.",
      "liked_by": [
         2,
         3
      ],
      "likes": 2
   }
}

If we try to deserialize this data, our serializer will fail because the fields it expects are nested under the resource key.

We can override to_internal_value() to extract the resource data:

from rest_framework import serializers
from examples.models import Resource


class ResourceSerializer(serializers.ModelSerializer):
    class Meta:
        model = Resource
        fields = '__all__'

    def to_internal_value(self, data):
        resource_data = data['resource']

        return super().to_internal_value(resource_data)

Yay! Our serializer now works as expected.

Serializer Save

Calling save() will either create a new instance or update an existing instance, depending on whether an existing instance was passed when instantiating the serializer class:

# this creates a new instance
serializer = MySerializer(data=data)

# this updates an existing instance
serializer = MySerializer(instance, data=data)

Passing data directly to save

Sometimes you'll want to pass additional data at the point of saving the instance. This additional data might include information like the current user, the current time, or request data.

You can do so by including additional keyword arguments when calling save(). For example:

serializer.save(owner=request.user)

Keep in mind that values passed to save() won't be validated.

Serializer Context

There are some cases when you need to pass additional data to your serializers. You can do that with the serializer's context property. You can then use this data inside the serializer, for example in to_representation or when validating data.

You pass the data as a dictionary via the context keyword:

from rest_framework import serializers
from examples.models import Resource

resource = Resource.objects.get(id=1)
serializer = ResourceSerializer(resource, context={'key': 'value'})

Then you can fetch it inside the serializer class from the self.context dictionary like so:

from rest_framework import serializers
from examples.models import Resource


class ResourceSerializer(serializers.ModelSerializer):
    class Meta:
        model = Resource
        fields = '__all__'

    def to_representation(self, instance):
        representation = super().to_representation(instance)
        representation['key'] = self.context['key']

        return representation

Our serializer output will now contain key with value.

Source Keyword

The DRF serializer comes with the source keyword, which is extremely powerful and can be used in multiple scenarios. We can use it to:

Rename serializer output fields
Attach serializer function response to data
Fetch data from one-to-one models

Let's say you're building a social network and every user has their own UserProfile, which has a one-to-one relationship with the User model:

from django.contrib.auth.models import User
from django.db import models


class UserProfile(models.Model):
    user = models.OneToOneField(to=User, on_delete=models.CASCADE)
    bio = models.TextField()
    birth_date = models.DateField()

    def __str__(self):
        return f'{self.user.username} profile'

We're using a ModelSerializer for serializing our users:

class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['id', 'username', 'email', 'is_staff', 'is_active']

Let's serialize a user:

{
   "id": 1,
   "username": "admin",
   "email": "admin@admin.com",
   "is_staff": true,
   "is_active": true
}

Rename serializer output fields

To rename a serializer output field, we need to add a new field to our serializer and include it in the fields property.

class UserSerializer(serializers.ModelSerializer):
    active = serializers.BooleanField(source='is_active')

    class Meta:
        model = User
        fields = ['id', 'username', 'email', 'is_staff', 'active']

Our is_active field is now going to be named active instead of is_active.

Attach serializer function response to data

We can use source to add a field whose value is the return value of a method on the model.

class UserSerializer(serializers.ModelSerializer):
    full_name = serializers.CharField(source='get_full_name')

    class Meta:
        model = User
        fields = ['id', 'username', 'full_name', 'email', 'is_staff', 'is_active']

get_full_name() is a method from the Django user model that concatenates user.first_name and user.last_name. Our response will now contain full_name.

Append data from one-to-one models

Now let's suppose we also wanted to include our user's bio and birth_date in UserSerializer. We can do that by adding extra fields to our serializer with the source keyword.

Let's modify our serializer class:

class UserSerializer(serializers.ModelSerializer):
    bio = serializers.CharField(source='userprofile.bio')
    birth_date = serializers.DateField(source='userprofile.birth_date')

    class Meta:
        model = User
        fields = [
            'id', 'username', 'email', 'is_staff',
            'is_active', 'bio', 'birth_date'
        ]  # note we also added the new fields here

We can access userprofile.<field_name>, because it is a one-to-one relationship with our user.

This is our final JSON response:

{
   "id": 1,
   "username": "admin",
   "email": "admin@admin.com",
   "is_staff": true,
   "is_active": true,
   "bio": "This is my bio.",
   "birth_date": "1995-04-27"
}

SerializerMethodField

SerializerMethodField is a read-only field, which gets its value by calling a method on the serializer class that it is attached to. It can be used to attach any kind of data to the serialized representation of the object.

SerializerMethodField gets its data by calling get_<field_name>.

If we wanted to add a full_name attribute to our User serializer we could achieve that like this:

from django.contrib.auth.models import User
from rest_framework import serializers


class UserSerializer(serializers.ModelSerializer):
    full_name = serializers.SerializerMethodField()

    class Meta:
        model = User
        fields = '__all__'

    def get_full_name(self, obj):
        return f'{obj.first_name} {obj.last_name}'

This piece of code creates a user serializer that also contains full_name which is the result of the get_full_name() function.

Different Read and Write Serializers

If your serializers contain a lot of nested data, which is not required for write operations, you can boost your API performance by creating separate read and write serializers.

You do that by overriding the get_serializer_class() method in your ViewSet like so:

from rest_framework import viewsets

from .models import MyModel
from .serializers import MyModelWriteSerializer, MyModelReadSerializer


class MyViewSet(viewsets.ModelViewSet):
    queryset = MyModel.objects.all()

    def get_serializer_class(self):
        if self.action in ["create", "update", "partial_update", "destroy"]:
            return MyModelWriteSerializer

        return MyModelReadSerializer

This code checks what REST operation has been used and returns MyModelWriteSerializer for write operations and MyModelReadSerializer for read operations.

Read-only Fields

Serializer fields come with the read_only option. When it is set to True, DRF includes the field in the API output but ignores it during create and update operations:

from rest_framework import serializers


class AccountSerializer(serializers.Serializer):
    id = serializers.IntegerField(label='ID', read_only=True)
    username = serializers.CharField(max_length=32, required=True)

Setting fields like id, create_date, etc. to read only will give you a performance boost during write operations.

If you want to set multiple fields to read-only on a ModelSerializer, you can list them in read_only_fields in Meta. This option applies only to automatically generated model fields; fields declared explicitly on the serializer need read_only=True instead:

from django.contrib.auth.models import User
from rest_framework import serializers


class AccountSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['id', 'username']
        read_only_fields = ['id', 'username']

Nested Serializers

There are two different ways of handling nested serialization with ModelSerializer:

Explicit definition
Using the depth field

Explicit definition

The explicit definition works by passing an external Serializer as a field to our main serializer.

Let's take a look at an example. We have a Comment which is defined like so:

from django.contrib.auth.models import User
from django.db import models


class Comment(models.Model):
    author = models.ForeignKey(to=User, on_delete=models.CASCADE)
    datetime = models.DateTimeField(auto_now_add=True)
    content = models.TextField()

Say we then have the following serializer:

from rest_framework import serializers
from examples.models import Comment


class CommentSerializer(serializers.ModelSerializer):
    class Meta:
        model = Comment
        fields = '__all__'

If we serialize a Comment, we'll get the following output:

{
    "id": 1,
    "datetime": "2021-03-19T21:51:44.775609Z",
    "content": "This is an interesting message.",
    "author": 1
}

If we also want to serialize the user (instead of only showing their ID), we can add an author serializer field to our CommentSerializer:

from django.contrib.auth.models import User
from rest_framework import serializers
from examples.models import Comment


class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['id', 'username']


class CommentSerializer(serializers.ModelSerializer):
    author = UserSerializer()

    class Meta:
        model = Comment
        fields = '__all__'

Serialize again and you'll get this:

{
    "id": 1,
    "author": {
        "id": 1,
        "username": "admin"
    },
    "datetime": "2021-03-19T21:51:44.775609Z",
    "content": "This is an interesting message."
}

Using the depth field

When it comes to nested serialization, the depth field is one of the most powerful features. Let's suppose we have three models -- ModelA, ModelB, and ModelC. ModelA depends on ModelB while ModelB depends on ModelC. They are defined like so:

from django.db import models


class ModelC(models.Model):
    content = models.CharField(max_length=128)


class ModelB(models.Model):
    model_c = models.ForeignKey(to=ModelC, on_delete=models.CASCADE)
    content = models.CharField(max_length=128)


class ModelA(models.Model):
    model_b = models.ForeignKey(to=ModelB, on_delete=models.CASCADE)
    content = models.CharField(max_length=128)

Our ModelA serializer, which is the top-level object, looks like this:

from rest_framework import serializers


class ModelASerializer(serializers.ModelSerializer):
    class Meta:
        model = ModelA
        fields = '__all__'

If we serialize an example object we'll get the following output:

{
    "id": 1,
    "content": "A content",
    "model_b": 1
}

Now let's say we also want to include ModelB's content when serializing ModelA. We could add the explicit definition to our ModelASerializer or use the depth field.

When we change depth to 1 in our serializer like so:

from rest_framework import serializers


class ModelASerializer(serializers.ModelSerializer):
    class Meta:
        model = ModelA
        fields = '__all__'
        depth = 1

The output changes to the following:

{
    "id": 1,
    "content": "A content",
    "model_b": {
        "id": 1,
        "content": "B content",
        "model_c": 1
    }
}

If we change it to 2 our serializer will serialize a level deeper:

{
    "id": 1,
    "content": "A content",
    "model_b": {
        "id": 1,
        "content": "B content",
        "model_c": {
            "id": 1,
            "content": "C content"
        }
    }
}

The downside is that you have no control over a child's serialization: using depth includes all fields on the nested objects.

Conclusion

In this article, you learned a number of different methods for using DRF serializers.