So, as the title of the question says, I have a problem with encoding/decoding strings.
I am using: python 2.7 | django 1.11 | jinja2 2.8
Basically, I retrieve some data from the database, serialize it, cache it, then read it back from the cache, deserialize it and render it in the template.
Problem:
I have first names and last names of people that contain characters like “ă”. I serialize using json.dumps.
The data that gets serialized is built like this (I have 10 like this):
    active_agents = User.region_objects.get_active_agents()
    agents_by_commission_last_month = active_agents.values(.... "first_name", "last_name").order_by(
        '-total_paid_transaction_value_last_month')
Then, when I set the cache I do it like:
    for key, value in context.items():
        ......
        value = json.dumps(list(value), default=str, ensure_ascii=False).encode('utf-8')
where value is the list of dictionaries returned by .values() in the code above and key is region_agents_by_commission_last_month (like the variable from the previous code).
Now, I have to get the cache. So I am doing the same process, but reversed.
    serialized_keys = ['agencies_by_commission_last_month',
                       'region_agents_by_commission_last_month',
                       'region_agents_by_commission_last_12_months',
                       'region_agents_by_commission_last_30_days',
                       'agencies_by_commission_last_year',
                       'agencies_by_commission_last_12_months',
                       'agencies_by_commission_last_30_days',
                       'region_agents_by_commission_last_year',
                       'agency',
                       'for_agent']
    context = {}
    for key, value in region_ranking_cache.items():
        if key in serialized_keys:
            objects = json.loads(value, object_hook=_decode_dict)
            for serilized_dict in objects:
                ....
                d['full_name'] = d['first_name'] + " " + d['last_name']
                full_name = d['full_name'].decode('utf-8').encode('utf-8')
                d['full_name'] = full_name
                print(d['full_name'])
                ....
where _decode_dict for object_hook looks like:
    def _decode_list(data):
        rv = []
        for item in data:
            if isinstance(item, unicode):
                item = item.encode('utf-8')
            elif isinstance(item, list):
                item = _decode_list(item)
            elif isinstance(item, dict):
                item = _decode_dict(item)
            rv.append(item)
        return rv

    def _decode_dict(data):
        rv = {}
        for key, value in data.items():
            if isinstance(key, unicode):
                key = key.encode('utf-8')
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            elif isinstance(value, list):
                value = _decode_list(value)
            elif isinstance(value, dict):
                value = _decode_dict(value)
            rv[key] = value
        return rv
The result from print: Cătălin Pintea, which is ok.
But in the dictionary I render: 'full_name': 'C\xc4\x83t\xc4\x83lin Pintea',
Basically, I use this object hook function to encode() all keys and values to utf-8 during json.loads.
This is how I avoided this error being thrown in views.py.
Error
Somewhere on template, I am using:
<td>{{ agent.full_name }}</td>
And agent.full_name comes from: 'full_name': 'C\xc4\x83t\xc4\x83lin Pintea',
Traceback
    Traceback:
    File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/exception.py" in inner
      41. response = get_response(request)
    File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _legacy_get_response
      249. response = self._get_response(request)
    File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _get_response
      187. response = self.process_exception_by_middleware(e, request)
    File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _get_response
      185. response = wrapped_callback(request, *callback_args, **callback_kwargs)
    File "/usr/local/lib/python2.7/dist-packages/django/utils/decorators.py" in inner
      185. return func(*args, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/django/contrib/auth/decorators.py" in _wrapped_view
      23. return view_func(request, *args, **kwargs)
    File "/app/crmrebs/utils/__init__.py" in wrapper
      255. return http_response_class(t.render(output, request))
    File "/usr/local/lib/python2.7/dist-packages/django_jinja/backend.py" in render
      106. return mark_safe(self.template.render(context))
    File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py" in render
      989. return self.environment.handle_exception(exc_info, True)
    File "/usr/local/lib/python2.7/dist-packages/jinja2/environment.py" in handle_exception
      754. reraise(exc_type, exc_value, tb)
    File "/app/crmrebs/jinja2/ranking/dashboard_ranking.html" in top-level template code
      1. {% extends "base.html" %}
    File "/app/crmrebs/jinja2/base.html" in top-level template code
      1. {% extends "base_stripped.html" %}
    File "/app/crmrebs/jinja2/base_stripped.html" in top-level template code
      94. {% block content %}
    File "/app/crmrebs/jinja2/ranking/dashboard_ranking.html" in block "content"
      83. {% include "dashboard/region_ranking.html" %}
    File "/app/crmrebs/jinja2/dashboard/region_ranking.html" in top-level template code
      41. {% include "dashboard/_agent_ranking_row_month.html" %}
    File "/app/crmrebs/jinja2/dashboard/_agent_ranking_row_month.html" in top-level template code
      2. <td>{{ agent.full_name }}</td>

    Exception Type: UnicodeDecodeError at /ranking
    Exception Value: 'ascii' codec can't decode byte 0xc4 in position 1: ordinal not in range(128)
This is where the error comes from. I tried other things, but I guess it is a limitation of python 2.7. I usually use python 3.9, but for this project I have to use 2.7. I tried other answers around here, but nothing really helped.
Can anybody help me serialize this dictionary properly, and tell me how I can avoid this mess?
I hope I made myself clear.
Have a nice day everyone !
Answer
So, I managed to solve my issue.
- I figured out that
    active_agents.values(...."first_name", "last_name").order_by('-total_paid_transaction_value_last_month')
  retrieved a dictionary whose keys and values were already unicode (because of the way it was configured in models.py, django 1.11 and python 2.7). So the serialization itself was just fine. It is indeed true that the final result that went to the template looked like 'C\xc4\x83t\xc4\x83lin'. The error came from \xc4.
- In order to fix it in the template, I just did this:
    {{ agent.full_name.decode("utf-8") }}
  which gave me the right result:
    Cătălin Pintea
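To illustrate where the 0xc4 error comes from, here is a minimal sketch outside Django (the variable names are mine, not from the project). As far as I understand, when Python 2 has to combine a byte string with unicode text, which is what happens when the template renders, it implicitly decodes the bytes with the ASCII codec. The explicit decode('ascii') below stands in for that implicit step, so the same failure can be seen on any Python version:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only; names are hypothetical, not from the project.

# A UTF-8 encoded byte string, like the value that reached the template.
full_name_bytes = b'C\xc4\x83t\xc4\x83lin Pintea'

# Stand-in for Python 2's implicit conversion of bytes to unicode:
# the ASCII codec fails on the first byte above 0x7f.
try:
    full_name_bytes.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding explicitly as UTF-8 yields the expected text instead.
full_name_text = full_name_bytes.decode('utf-8')
```

Here error.start is the position of the offending byte, which matches "byte 0xc4 in position 1" from the traceback, and full_name_text is what .decode("utf-8") produces in the template.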
Thanks @BoarGules. It was true that d['last_name'] and d['first_name'] were in unicode. So when I did the concatenation, I had to add u" ".
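Building on the u" " point, an alternative that avoids decoding in the template is to keep everything as unicode after json.loads and encode to UTF-8 only when writing to the cache. This is only a sketch under assumptions: rows is a hypothetical stand-in for the list returned by .values(), the cache is left out, and this is not the code I actually deployed.

```python
# -*- coding: utf-8 -*-
import json

# Hypothetical stand-in for list(queryset.values(...)); values are unicode.
rows = [{u'first_name': u'C\u0103t\u0103lin', u'last_name': u'Pintea'}]

# Writing to the cache: encode to UTF-8 bytes only at this boundary.
cached = json.dumps(rows, ensure_ascii=False).encode('utf-8')

# Reading from the cache: decode once and let json.loads return unicode
# strings, with no object_hook that re-encodes them to byte strings.
agents = json.loads(cached.decode('utf-8'))

for agent in agents:
    # A unicode separator keeps the whole concatenation unicode.
    agent[u'full_name'] = agent[u'first_name'] + u' ' + agent[u'last_name']
```

With this approach the template can use {{ agent.full_name }} directly, because the value is already unicode text and no implicit ASCII decoding is needed.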