I have a custom EncryptedCharField, which I want to basically appear as a CharField when interfacing UI, but before storing/retrieving in the DB it encrypts/decrypts it.
The custom fields documentation says to:
- add
__metaclass__ = models.SubfieldBase
- override to_python to convert the data from it’s raw storage into the desired format
- override get_prep_value to convert the value before storing ot the db.
So you think this would be easy enough – for 2. just decrypt the value, and 3. just encrypt it.
Based loosely on a django snippet, and the documentation this field looks like:
class EncryptedCharField(models.CharField):
"""Just like a char field, but encrypts the value before it enters the database, and decrypts it when it
retrieves it"""
__metaclass__ = models.SubfieldBase
def __init__(self, *args, **kwargs):
super(EncryptedCharField, self).__init__(*args, **kwargs)
cipher_type = kwargs.pop('cipher', 'AES')
self.encryptor = Encryptor(cipher_type)
def get_prep_value(self, value):
return encrypt_if_not_encrypted(value, self.encryptor)
def to_python(self, value):
return decrypt_if_not_decrypted(value, self.encryptor)
def encrypt_if_not_encrypted(value, encryptor):
if isinstance(value, EncryptedString):
return value
else:
encrypted = encryptor.encrypt(value)
return EncryptedString(encrypted)
def decrypt_if_not_decrypted(value, encryptor):
if isinstance(value, DecryptedString):
return value
else:
encrypted = encryptor.decrypt(value)
return DecryptedString(encrypted)
class EncryptedString(str):
pass
class DecryptedString(str):
pass
and the Encryptor looks like:
class Encryptor(object):
def __init__(self, cipher_type):
imp = __import__('Crypto.Cipher', globals(), locals(), [cipher_type], -1)
self.cipher = getattr(imp, cipher_type).new(settings.SECRET_KEY[:32])
def decrypt(self, value):
#values should always be encrypted no matter what!
#raise an error if tthings may have been tampered with
return self.cipher.decrypt(binascii.a2b_hex(str(value))).split('')[0]
def encrypt(self, value):
if value is not None and not isinstance(value, EncryptedString):
padding = self.cipher.block_size - len(value) % self.cipher.block_size
if padding and padding < self.cipher.block_size:
value += "" + ''.join([random.choice(string.printable) for index in range(padding-1)])
value = EncryptedString(binascii.b2a_hex(self.cipher.encrypt(value)))
return value
When saving a model, an error, Odd-length string, occurs, as a result of attempting to decrypt an already decrypted string. When debugging, it appears as to_python ends up being called twice, the first with the encrypted value, and the second time with the decrypted value, but not actually as a type Decrypted, but as a raw string, causing the error. Furthermore get_prep_value is never called.
What am I doing wrong?
This should not be that hard – does anyone else think this Django field code is very poorly written, especially when it comes to custom fields, and not that extensible? Simple overridable pre_save and post_fetch methods would easily solve this problem.
Advertisement
Answer
I think the issue is that to_python is also called when you assign a value to your custom field (as part of validation may be, based on this link). So the problem is to distinguish between to_python calls in the following situations:
- When a value from the database is assigned to the field by Django (That’s when you want to decrypt the value)
- When you manually assign a value to the custom field, e.g. record.field = value
One hack you could use is to add prefix or suffix to the value string and check for that instead of doing isinstance check.
I was going to write an example, but I found this one (even better :)).
Check BaseEncryptedField: https://github.com/django-extensions/django-extensions/blob/2.2.9/django_extensions/db/fields/encrypted.py (link to an older version because the field was removed in 3.0.0; see Issue #1359 for reason of deprecation)
Source: Django Custom Field: Only run to_python() on values from DB?