I have a custom EncryptedCharField that should behave like a plain CharField in the UI, but encrypt its value before storing it in the DB and decrypt it after retrieving it.
The custom fields documentation says to:
1. add __metaclass__ = models.SubfieldBase
2. override to_python to convert the data from its raw storage into the desired format
3. override get_prep_value to convert the value before storing it to the db.
So you would think this would be easy enough: for step 2 just decrypt the value, and for step 3 just encrypt it.
Based loosely on a Django snippet and the documentation, the field looks like this:
    from django.db import models


    class EncryptedCharField(models.CharField):
        """Just like a char field, but encrypts the value before it enters
        the database, and decrypts it when it retrieves it"""
        __metaclass__ = models.SubfieldBase

        def __init__(self, *args, **kwargs):
            # Pop the custom kwarg before calling the parent, which does not accept it.
            cipher_type = kwargs.pop('cipher', 'AES')
            super(EncryptedCharField, self).__init__(*args, **kwargs)
            self.encryptor = Encryptor(cipher_type)

        def get_prep_value(self, value):
            return encrypt_if_not_encrypted(value, self.encryptor)

        def to_python(self, value):
            return decrypt_if_not_decrypted(value, self.encryptor)


    def encrypt_if_not_encrypted(value, encryptor):
        if isinstance(value, EncryptedString):
            return value
        else:
            encrypted = encryptor.encrypt(value)
            return EncryptedString(encrypted)


    def decrypt_if_not_decrypted(value, encryptor):
        if isinstance(value, DecryptedString):
            return value
        else:
            decrypted = encryptor.decrypt(value)
            return DecryptedString(decrypted)


    class EncryptedString(str):
        pass


    class DecryptedString(str):
        pass
and the Encryptor looks like:
    import binascii
    import random
    import string

    from django.conf import settings


    class Encryptor(object):
        def __init__(self, cipher_type):
            # level=-1 is the Python 2 form of __import__
            imp = __import__('Crypto.Cipher', globals(), locals(), [cipher_type], -1)
            self.cipher = getattr(imp, cipher_type).new(settings.SECRET_KEY[:32])

        def decrypt(self, value):
            # values should always be encrypted no matter what!
            # raise an error if things may have been tampered with
            return self.cipher.decrypt(binascii.a2b_hex(str(value))).split('\0')[0]

        def encrypt(self, value):
            if value is not None and not isinstance(value, EncryptedString):
                # pad to a multiple of the block size: a NUL separator plus random filler
                padding = self.cipher.block_size - len(value) % self.cipher.block_size
                if padding and padding < self.cipher.block_size:
                    value += "\0" + ''.join([random.choice(string.printable) for index in range(padding-1)])
                value = EncryptedString(binascii.b2a_hex(self.cipher.encrypt(value)))
            return value
When saving a model, an "Odd-length string" error occurs as a result of attempting to decrypt an already decrypted string. When debugging, it appears that to_python ends up being called twice: the first time with the encrypted value, and the second time with the decrypted value, which arrives not as a DecryptedString but as a raw string, causing the error. Furthermore, get_prep_value is never called.
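The error message itself can be reproduced in isolation, which suggests it comes from the hex-decoding step rather than from the cipher. The following is only a minimal sketch in Python 3 using the standard binascii module; the helper name try_unhex is hypothetical and not part of the code above.

```python
import binascii

# A hex-encoded ciphertext always has an even number of digits and decodes cleanly.
ciphertext_hex = binascii.b2a_hex(b"secret").decode("ascii")
roundtrip = binascii.a2b_hex(ciphertext_hex)


def try_unhex(value):
    """Hypothetical helper: return decoded bytes, or the error text on failure."""
    try:
        return binascii.a2b_hex(value)
    except binascii.Error as exc:
        return str(exc)


# An already-decrypted plaintext of odd length is rejected before any cipher runs.
plaintext_error = try_unhex("abc")
print(plaintext_error)
```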
What am I doing wrong?
This should not be that hard – does anyone else think this Django field code is very poorly written, especially when it comes to custom fields, and not that extensible? Simple overridable pre_save and post_fetch methods would easily solve this problem.
Answer
I think the issue is that to_python is also called when you assign a value to your custom field (possibly as part of validation, based on this link). So the problem is to distinguish between to_python calls in the following two situations:
- When a value from the database is assigned to the field by Django (That’s when you want to decrypt the value)
- When you manually assign a value to the custom field, e.g. record.field = value
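To see why both situations reach to_python, here is a rough simulation in plain Python 3 with no Django import. It assumes SubfieldBase works by installing a descriptor on the model class that passes every assigned value through to_python; the RecordingField and Record classes are hypothetical stand-ins, not Django APIs.

```python
class Creator:
    """Descriptor that runs field.to_python on every assignment (simulated)."""

    def __init__(self, field):
        self.field = field

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.field.name]

    def __set__(self, obj, value):
        # Every assignment, whatever its origin, goes through to_python.
        obj.__dict__[self.field.name] = self.field.to_python(value)


class RecordingField:
    """Hypothetical stand-in for a custom field; it only records to_python calls."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def to_python(self, value):
        self.calls.append(value)
        return value


field = RecordingField("secret")


class Record:
    secret = Creator(field)


record = Record()
record.secret = "6162"        # as if loaded from the database (hex ciphertext)
record.secret = "plain text"  # manual assignment in application code

print(field.calls)
```

Both assignments land in to_python with nothing to tell them apart except the value itself, which is why the field has to inspect the value.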
One hack you could use is to add a prefix or suffix to the encrypted value string and check for that instead of doing an isinstance check.
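A minimal sketch of the prefix idea, assuming a marker string of "enc$" and using hex encoding as a stand-in for the real cipher (it is not encryption). The names PREFIX, fake_encrypt and fake_decrypt are hypothetical; in the actual field, the two module-level functions would be the bodies of get_prep_value and to_python.

```python
import binascii

PREFIX = "enc$"  # hypothetical marker; any string unlikely to start real data


def fake_encrypt(text):
    """Stand-in for the real cipher: hex-encode the text (NOT encryption)."""
    return binascii.b2a_hex(text.encode("utf-8")).decode("ascii")


def fake_decrypt(text):
    """Inverse of fake_encrypt."""
    return binascii.a2b_hex(text).decode("utf-8")


def get_prep_value(value):
    """Encrypt before storing, unless the value already carries the marker."""
    if value is None or value.startswith(PREFIX):
        return value
    return PREFIX + fake_encrypt(value)


def to_python(value):
    """Decrypt only values that carry the marker; pass anything else through."""
    if value is None or not value.startswith(PREFIX):
        return value
    return fake_decrypt(value[len(PREFIX):])


stored = get_prep_value("hello")
print(stored)
print(to_python(stored))
print(to_python("hello"))
```

Because the check looks at the string content rather than its Python type, it still works when the value arrives as a raw string. The trade-off is that a plaintext which happens to start with the marker would be mistaken for ciphertext.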
I was going to write an example, but I found this one (even better :)).
Check BaseEncryptedField: https://github.com/django-extensions/django-extensions/blob/2.2.9/django_extensions/db/fields/encrypted.py (link to an older version because the field was removed in 3.0.0; see Issue #1359 for reason of deprecation)
Source: Django Custom Field: Only run to_python() on values from DB?