Django Update 1.6 to 1.9 – 1.8 to 1.9

Upgrading Django from 1.8 to 1.9 was relatively easy, as the main pain of upgrading DRF had already been dealt with while upgrading Django from 1.6 to 1.8, which has been discussed here. A lot of libraries had to be updated, which I discovered as and when I tried running the application.

One of the main issues I had during the whole upgrade was creating the migrations from scratch. The project used South for managing migrations, but since Django now has built-in support, I removed South and all the existing migration files. There were some circular dependency issues while creating the new migrations. You may get an error like the one below.

django.db.migrations.graph.CircularDependencyError: partner.0001_initial, address.0001_initial, users.0001_initial

The reason was some of the foreign keys. Django internally builds a graph data structure to figure out the migration dependencies. You can get rid of the error by removing the offending foreign keys temporarily, running the migrations, and then adding the keys back again, as sketched below.
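
A minimal sketch of that workaround, using a hypothetical Address model (inside the address app's models.py) with a foreign key to users; your actual models will differ:

from django.db import models


class Address(models.Model):
    line1 = models.CharField(max_length=255)
    # user = models.ForeignKey('users.User')  # step 1: temporarily comment out the FK that closes the cycle

# step 2: create the initial migrations without the cycle
#   $ python manage.py makemigrations
# step 3: restore the foreign key on the model and generate a follow-up
# migration that just adds the column back:
#   $ python manage.py makemigrations address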

The following libraries had to be updated; the versions mentioned are the ones I am using with Django 1.9:

  • Django Reversion 1.10.0
  • Django Tables2 1.0.5
  • Django MPTT 0.8.0
  • Django Celery 3.2.1
  • Django Extensions 1.7.3
  • Django Haystack 2.5.1
  • Django Redis Cache 1.7.1
  • Django Redis Sessions 0.5.6

There are settings changes around the Pipeline library. You can find the changes to be made on the library's documentation page. The new settings look like this:

"STATICFILES_FINDERS - 'pipeline.finders.PipelineFinder', 
 PIPELINE = {
    'PIPELINE_ENABLED': True,
    'JAVASCRIPT': {
        'stats': {
            'source_filenames': (
              'js/jquery.js',
              'js/d3.js',
              'js/collections/*.js',
              'js/application.js',
            ),
            'output_filename': 'js/stats.js',
        }
    }
}"

Other than this, there were some code changes introduced in Django 1.9 for which I had to change some import statements. Some frequent Django and third-party library issues:

Issue:

from django.db.models.loading import get_models
ImportError: No module named loading

Solution:

from django.apps import apps
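
The old loading.get_models calls then go through the app registry; for example (the app label and model name below are placeholders):

from django.apps import apps

all_models = apps.get_models()               # replacement for the removed get_models()
Product = apps.get_model('shop', 'Product')  # look up a single model by app label and name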

Issue:

from django.utils.importlib import import_module
ImportError: No module named importlib

Solution:

from importlib import import_module

Issue:

class ProductAdmin(reversion.VersionAdmin):
AttributeError: 'module' object has no attribute 'VersionAdmin'

Solution:

from reversion.admin import VersionAdmin
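
The admin class then subclasses the imported VersionAdmin directly; a small sketch (Product and its import are placeholders):

from django.contrib import admin
from reversion.admin import VersionAdmin

from shop.models import Product  # placeholder import


class ProductAdmin(VersionAdmin):
    pass

admin.site.register(Product, ProductAdmin)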

Issue:

@reversion.register
NameError: name 'reversion' is not defined

Solution:

from reversion import revisions
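
The decorator is then applied from the revisions module; a sketch with a placeholder model, assuming the register decorator is invoked with parentheses:

from django.db import models
from reversion import revisions


@revisions.register()
class Product(models.Model):
    name = models.CharField(max_length=255)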

Issue:

django.utils.log.NullHandler class not found

Solution:

Use logging.NullHandler
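
In the LOGGING dict this is just a dotted-path change; for example:

LOGGING = {
    'version': 1,
    'handlers': {
        'null': {
            'class': 'logging.NullHandler',  # was 'django.utils.log.NullHandler'
        },
    },
}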

This is pretty much it. A Django upgrade is a task that needs patience more than anything else. Happy upgrading.

Django Update 1.6 to 1.9 – 1.6 to 1.8 and DRF Upgrade

This is going to be a series of posts briefly describing something that turned my life upside down for a few weeks. I am still in the process, as of now. A Django upgrade teaches you a lot of things, and when you are updating from a very old version to a relatively new one, it kills you. You are in for an extremely bumpy ride that you will remember for a very, very long time.

The first and most difficult part was upgrading Django REST Framework. I did not think this one was going to be such a biggie. We were on DRF 2.3.13 and, without thinking much, I took it straight to 3.4.0, and that was my first mistake. All the test cases failed miserably and I started fixing them one by one. Some of them seemed self-explanatory, and there were some issues with which even Google failed to help. I struggled with these errors and wasted two days. When nothing moved, I decided to go to 3.0.5 first and then take it through the later versions.

LESSON LEARNT: When upgrading a library, move up just one major version at a time, and read the changelog carefully.

Upgrading to DRF 3.0.5 was relatively easy. There were some serializer issues, which were mostly straightforward, and a few deprecations as well. Then I took it to 3.1.1, which did not throw any extra errors, then upgraded to 3.2.1, then 3.3.1, and finally 3.4.5. Meanwhile, I had upgraded Django to 1.8, which I shall discuss in the next post.

Some of the errors which frequently showed up are as below.

NotImplementedError: `request.QUERY_PARAMS` has been deprecated in favor of `request.query_params` since version 3.0, 
and has been fully removed as of version 3.2.

For the above, change QUERY_PARAMS to query_params. Simple googling can help here.
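
A before/after example (the query parameter name is just illustrative):

page = request.QUERY_PARAMS.get('page')   # older
page = request.query_params.get('page')   # newer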

NotImplementedError: Field.to_representation() must be implemented for field id. 
If you do not need to support write operations you probably want to subclass `ReadOnlyField` instead.

This is one of the basic changes you have to make. It is asking you to change:

order_id = serializers.Field(source='pk') # older
order_id = serializers.ReadOnlyField(source='pk') # newer

AssertionError: It is redundant to specify `source='discount_amount'` on field 'Field' in serializer 'OrderListSerializer', 
because it is the same as the field name. Remove the `source` keyword argument.

This error comes up when a serializer field's name is the same as its source name.

discount_amount = serializers.ReadOnlyField(source='discount_amount') # older
discounted_amount = serializers.ReadOnlyField(source='discount_amount') # newer

Also, pay close attention to serializers that are given a list or queryset to serialize. The parameter many=True solves a lot of issues around them.
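
For instance, with the OrderListSerializer from the error above (the model and queryset are placeholders):

serializer = OrderListSerializer(order)                           # single object
serializer = OrderListSerializer(Order.objects.all(), many=True)  # list/queryset of objects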

Some of the other issues which appeared (I am sorry to have forgotten how they were fixed) are below. I am sure, if you have come this far, you will find them easy to fix.

AttributeError: Got AttributeError when attempting to get a value for field `product` on serializer `ProductListSerializer`.
The serializer field might be named incorrectly and not match any attribute or key on the `RelatedManager` instance. 
Original exception text was: 'RelatedManager' object has no attribute 'product'
AttributeError: 'RelatedManager' object has no attribute
AttributeError: 'QuerySet' object has no attribute 'user'

Other changes include:

  • The earlier implementation, where you defined a custom to_native method, has changed to to_representation:
class FinalRecordField(serializers.WritableField):

    def to_native(self, value):
        return value

to

class FinalStockRecordField(serializers.Serializer):

    def to_representation(self, instance):
        return instance
  • The library djangorestframework-xml had to be included as the default XML renderer was not working. I used version 1.3.0.

This is what I mostly remember. Try to be patient with the update, as it is bound to be a disruptive affair. I am happily on DRF version 3.4.5. The next article will be on the Django upgrade from 1.6 to 1.8.

Django Cache Busting

Browsers cache images, stylesheets, and JavaScript; it's their default nature. They are smart and do not want to fetch the same file again and again from the server. But sometimes, when you change one of the JavaScript files in your app, this feature can bite you in the back. You have made some changes in your JS file but they are not reflected in the browser. Clear the browser cache and it works. Ask your clients to do this and they will be super furious. So, what are the options here?

Browsers are forced to fetch the latest file from the server when the source file's name changes. Versioning the file and tagging it with a new version every time a change is made works just fine. By adding a new version, you ask the browser to fetch the new file, something like appending this:

?version=0.0.1 # current version

But there is one pain here: changing the version each time you make a change in the JS file. You can easily forget to bump the version, and then the changes in the JS files do not show up. There are ways to automate this. I tried implementing it with help from one of the GitHub projects.

The idea is the following:

  • Write something that overrides the existing static tag, so that the files are served with names you control. Get an idea about custom template tags from here: Custom template tags and filters
  • The render function of the custom tag class is the place where you add the implementation. The most general approach is appending the file's last-modified timestamp to the filename, so that whenever (if) the file is changed, the URL changes with it.
  • Use the custom tag instead of the default staticfiles tag.

Here goes the implementation part. The custom tag class is as below.

from django import template
from django.conf import settings

import posixpath
import datetime
import urllib
import os

try:
    from django.contrib.staticfiles import finders
except ImportError:
    finders = None

register = template.Library()


@register.tag('static')
def do_static(parser, token):
    """
    overwriting the default tag here
    """
    return CacheBusterTag(token, False)


class CacheBusterTag(template.Node):
    def __init__(self, token, is_media):
        self.is_media = is_media

        try:
            tokens = token.split_contents()
        except ValueError:
            raise template.TemplateSyntaxError("'%r' tag must have one or two arguments" % token.contents.split()[0])

        self.path = tokens[1]
        self.force_timestamp = len(tokens) == 3 and tokens[2] or False

    def render(self, context):
        """
        rendering the url with modification
        """
        try:
            path = template.Variable(self.path).resolve(context)
        except template.VariableDoesNotExist:
            path = self.path

        path = posixpath.normpath(urllib.unquote(path)).lstrip('/')
        url_prepend = getattr(settings, "STATIC_URL", settings.MEDIA_URL)

        if settings.DEBUG and finders:
            absolute_path = finders.find(path)
        else:
            absolute_path = os.path.join(getattr(settings, 'STATIC_ROOT', settings.MEDIA_ROOT), path)

        unique_string = self.get_file_modified(absolute_path)
        return url_prepend + path + '?' + unique_string

    @staticmethod
    def get_file_modified(path):
        """
        get the last modified time of the file
        """
        try:
            return datetime.datetime.fromtimestamp(os.path.getmtime(os.path.abspath(path))).strftime('%S%M%H%d%m%y')
        except Exception:
            # file missing or unreadable - fall back to a constant string
            return '000000000000'

Now comes the view that serves the static files.

from django.http import Http404
from django.contrib.staticfiles.views import serve as django_staticfiles_serve

"""
Views and functions for serving static files.
"""

def static_serve(request, path, document_root=None):
    try:
        return django_staticfiles_serve(request, path, document_root)
    except Http404:
        unique_string, new_path = path.split("/", 1)
        return django_staticfiles_serve(request, new_path, document_root)

In the base URLconf, where you call Django's default serve method, change it to the following:

urlpatterns = patterns('',
                       url(r'^static/(?P<path>.*)$', 'cachebuster.views.static_serve', {'document_root': STATIC_URL}),
                       .....
)

Now load the custom tag library in the template so the static files go through it.

{% load cachebuster %}
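
With the library loaded, the {% static %} tags in the template go through the overridden implementation; for example:

{% load cachebuster %}
<script src="{% static 'js/application.js' %}"></script>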

That’s it. This works for me. So, cheers.

Django API Throttling

There are cases when you do not want your clients to bombard some APIs. Django REST Framework gives you out-of-the-box support for controlling how many times your APIs can be hit. It gives you options to control the number of hits per second, minute, hour, or day, beyond which the client gets a 429 status. For storing the counts, the framework uses the default cache configured for the application.

CACHES = {
    "default": {
        "BACKEND": "redis_cache.cache.RedisCache",
        "LOCATION": "redis.cache.amazonaws.com:6379",
        "OPTIONS": {
            "DB": 0,
            "CLIENT_CLASS": "redis_cache.client.DefaultClient",
        }
    }
}

Your MIDDLEWARE_CLASSES in settings.py looks like this:

MIDDLEWARE_CLASSES = (
    '.......'
    'custom.throttling.ThrottleMiddleWare', # the custom class to control throttling limits
)

In the REST_FRAMEWORK settings in settings.py, we need to mention the rates and the classes that handle throttling. DRF gives you default implementations, but you can write your own throttling classes as well. If you want to use the default classes, they are shown commented out below:

REST_FRAMEWORK = {
    'DEFAULT_THROTTLE_CLASSES': (
        'custom.throttling.PerMinuteThrottle', # custom throttle [implemented below]
        # 'rest_framework.throttling.AnonRateThrottle',
        # 'rest_framework.throttling.UserRateThrottle'
    ),
    'DEFAULT_THROTTLE_RATES': {
        'per_minute': '256/min',
    }
}

The throttle class implemented below does per-minute throttling. You can implement other, similar classes to fit your use case.

from rest_framework.settings import APISettings, USER_SETTINGS, DEFAULTS, IMPORT_STRINGS
from rest_framework.throttling import UserRateThrottle

api_settings = APISettings(USER_SETTINGS, DEFAULTS, IMPORT_STRINGS)

class ThrottleMiddleWare(object):
    def process_response(self, request, response):
        """
        Setting the standard rate limit headers
        :param request:
        :param response:
        :return:
        """
        response['X-RateLimit-Limit'] = api_settings.DEFAULT_THROTTLE_RATES.get('per_minute', "None")
        if 'HIT_COUNT' in request.META:
            response['X-RateLimit-Remaining'] = self.parse_rate(api_settings.DEFAULT_THROTTLE_RATES.get(
                'per_minute')) - request.META['HIT_COUNT']
        return response

    def parse_rate(self, rate):
        """
        Given the request rate string (e.g. '256/min'),
        return the allowed number of requests.
        """
        num_requests = 0
        try:
            if rate is None:
                return num_requests
            num, period = rate.split('/')
            num_requests = int(num)
        except Exception:
            pass
        return num_requests

REQUEST_METHOD_GET, REQUEST_METHOD_POST = 'GET', 'POST'

class PerMinuteThrottle(UserRateThrottle):
    scope = 'per_minute'

    def allow_request(self, request, view):
        """
        Custom implementation:
        Implement the check to see if the request should be throttled.
        On success calls `throttle_success`.
        On failure calls `throttle_failure`.
        """
        hit_count = 0

        try:
            if request.user.is_authenticated():
                user_id = request.user.pk
            else:
                user_id = self.get_ident(request)
            request.META['USER_ID'] = user_id

            if str(request.method).upper() == REQUEST_METHOD_POST:
                return True

            if self.rate is None:
                return True

            self.key = self.get_cache_key(request, view)
            if self.key is None:
                return True

            self.history = self.cache.get(self.key, [])
            self.now = self.timer()

            # Drop any requests from the history which have now passed the
            # throttle duration

            duration = self.now - self.duration
            while self.history and self.history[-1] <= duration:
                self.history.pop()
            
            hit_count = len(self.history)
            request.META['HIT_COUNT'] = hit_count + 1
            if len(self.history) >= self.num_requests:
                request.META['HIT_COUNT'] = hit_count
                return self.throttle_failure()
            return self.throttle_success()
        except Exception:
            pass

        # in case any exception occurs - we must allow the request to go through
        request.META['HIT_COUNT'] = hit_count
        return True

When you hit the limit, you get something like this:

INFO {'status': 429, 'path': '/api/order/history/', 'content': '{detail: Request was throttled.Expected available in 16 seconds.}\n', 'method': 'GET', 'user': 100}