How can I create an encrypted django field that converts data when it’s retrieved from the database?


I have a custom EncryptedCharField that should behave like a plain CharField in the UI, but encrypt its value before it is stored in the DB and decrypt it when it is retrieved.

The custom fields documentation says to:

  1. add __metaclass__ = models.SubfieldBase
  2. override to_python to convert the data from its raw storage into the desired format
  3. override get_prep_value to convert the value before storing it to the db.

You would think this would be easy enough: for 2, just decrypt the value, and for 3, just encrypt it.

Based loosely on a Django snippet and the documentation, the field looks like:

from django.db import models


class EncryptedCharField(models.CharField):
  """Just like a CharField, but encrypts the value before it enters the
  database and decrypts it when it is retrieved."""
  __metaclass__ = models.SubfieldBase

  def __init__(self, *args, **kwargs):
    # Pop the custom option first; CharField does not accept a 'cipher' kwarg.
    cipher_type = kwargs.pop('cipher', 'AES')
    super(EncryptedCharField, self).__init__(*args, **kwargs)
    self.encryptor = Encryptor(cipher_type)

  def get_prep_value(self, value):
    return encrypt_if_not_encrypted(value, self.encryptor)

  def to_python(self, value):
    return decrypt_if_not_decrypted(value, self.encryptor)


def encrypt_if_not_encrypted(value, encryptor):
  if isinstance(value, EncryptedString):
    return value
  else:
    encrypted = encryptor.encrypt(value)
    return EncryptedString(encrypted)


def decrypt_if_not_decrypted(value, encryptor):
  if isinstance(value, DecryptedString):
    return value
  else:
    decrypted = encryptor.decrypt(value)
    return DecryptedString(decrypted)


# Marker subclasses used only to tell encrypted and decrypted values apart.
class EncryptedString(str):
  pass


class DecryptedString(str):
  pass

and the Encryptor looks like:

import binascii
import random
import string

from django.conf import settings


class Encryptor(object):
  def __init__(self, cipher_type):
    imp = __import__('Crypto.Cipher', globals(), locals(), [cipher_type], -1)
    self.cipher = getattr(imp, cipher_type).new(settings.SECRET_KEY[:32])

  def decrypt(self, value):
    # Values should always be encrypted, no matter what!
    # Raise an error if things may have been tampered with.
    return self.cipher.decrypt(binascii.a2b_hex(str(value))).split('\0')[0]

  def encrypt(self, value):
    if value is not None and not isinstance(value, EncryptedString):
      # Pad to a multiple of the block size: a NUL separator, then random filler.
      padding = self.cipher.block_size - len(value) % self.cipher.block_size
      if padding and padding < self.cipher.block_size:
        value += "\0" + ''.join([random.choice(string.printable) for index in range(padding-1)])
      value = EncryptedString(binascii.b2a_hex(self.cipher.encrypt(value)))
    return value

When saving a model, an "Odd-length string" error occurs as a result of attempting to decrypt an already decrypted string. While debugging, it appears that to_python is called twice: first with the encrypted value, and then with the decrypted value, which arrives as a plain string rather than a DecryptedString, causing the error. Furthermore, get_prep_value is never called.
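
The error message itself comes from binascii.a2b_hex, which rejects any input with an odd number of characters. A minimal reproduction outside Django, as a sketch (try_unhex is a hypothetical helper, not part of the field code above):

```python
import binascii


def try_unhex(value):
    """Return the decoded bytes, or the error message if value is not valid hex."""
    try:
        return binascii.a2b_hex(value)
    except (binascii.Error, TypeError) as exc:  # Python 2 raises TypeError here
        return str(exc)


# What the database column holds: a hex string, always of even length.
stored = binascii.b2a_hex(b"secret value!")
print(try_unhex(stored))

# Plaintext that reaches decrypt() by mistake: 5 characters, so not valid hex.
print(try_unhex("hello"))
```

The second call reports the odd-length error, which matches what happens when to_python receives the already decrypted value as a plain string.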

What am I doing wrong?

This should not be that hard – does anyone else think this Django field code is very poorly written, especially when it comes to custom fields, and not that extensible? Simple overridable pre_save and post_fetch methods would easily solve this problem.


I think the issue is that to_python is also called when you assign a value to your custom field (possibly as part of validation, based on this link). So the problem is to distinguish between to_python calls in the following situations:

  1. When a value from the database is assigned to the field by Django (That’s when you want to decrypt the value)
  2. When you manually assign a value to the custom field, e.g. record.field = value

One hack you could use is to add a prefix or suffix to the encrypted string and check for that instead of doing an isinstance check.
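
A minimal sketch of the prefix idea, kept free of Django and PyCrypto so it runs on its own. The "enc:" marker and the HexCipher class are assumptions made for illustration; HexCipher only hex-encodes and stands in for a real cipher with encrypt/decrypt methods.

```python
import binascii

PREFIX = "enc:"  # hypothetical marker; choose one that real plaintext never starts with


class HexCipher(object):
    """Stand-in for a real cipher: hex-encodes only and provides NO secrecy."""

    def encrypt(self, text):
        return binascii.b2a_hex(text.encode("utf-8")).decode("ascii")

    def decrypt(self, token):
        return binascii.a2b_hex(token).decode("utf-8")


def encrypt_if_not_encrypted(value, cipher):
    # A marked value is already in its stored form, so leave it alone.
    if value is None or value.startswith(PREFIX):
        return value
    return PREFIX + cipher.encrypt(value)


def decrypt_if_not_decrypted(value, cipher):
    # No marker means the value was assigned in Python and is already plaintext.
    if value is None or not value.startswith(PREFIX):
        return value
    return cipher.decrypt(value[len(PREFIX):])


cipher = HexCipher()
stored = encrypt_if_not_encrypted("hello", cipher)
print(stored)                                     # enc:68656c6c6f
print(decrypt_if_not_decrypted(stored, cipher))   # hello (value from the database)
print(decrypt_if_not_decrypted("hello", cipher))  # hello (value assigned by hand)
```

Because the check looks at the string itself rather than its type, it still works when to_python receives a plain str. The trade-off is that plaintext which happens to start with the marker would be mistaken for an encrypted value.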

I was going to write an example, but I found this one (even better :)).

Check BaseEncryptedField: (link to an older version because the field was removed in 3.0.0; see Issue #1359 for reason of deprecation)
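
On Django 1.8 and later, where SubfieldBase is deprecated (it was removed in 1.10), the framework separates the two situations itself: from_db_value is called only for values loaded from the database. A rough sketch under that assumption follows. EncryptedFieldMixin and ReversingEncryptor are hypothetical names, and the mixin is written without importing Django so that it can be run standalone.

```python
class EncryptedFieldMixin(object):
    """Hypothetical mixin, meant to be combined with a field such as models.CharField."""

    encryptor = None  # assumed object exposing encrypt(text) and decrypt(text)

    def from_db_value(self, value, expression, connection, *args):
        # Django calls this only for values loaded from the database.
        # *args absorbs the extra `context` argument passed by Django 1.8 to 2.x.
        if value is None:
            return value
        return self.encryptor.decrypt(value)

    def get_prep_value(self, value):
        # Called when a value is written to, or used in a query against, the database.
        # This sketch does not chain to the parent field's get_prep_value.
        if value is None:
            return value
        return self.encryptor.encrypt(value)


class ReversingEncryptor(object):
    """Stand-in with the assumed interface; it reverses text and is NOT encryption."""

    def encrypt(self, text):
        return text[::-1]

    def decrypt(self, text):
        return text[::-1]


class DemoField(EncryptedFieldMixin):
    encryptor = ReversingEncryptor()


field = DemoField()
stored = field.get_prep_value("hello")
print(stored)                                   # olleh
print(field.from_db_value(stored, None, None))  # hello
```

In a real project the mixin would be listed before the Django field class, for example class EncryptedCharField(EncryptedFieldMixin, models.CharField), and to_python could then be left alone, since assignment no longer triggers decryption.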

Source: Django Custom Field: Only run to_python() on values from DB?

Source: stackoverflow