
The dreaded UnicodeDecodeError in a Plone script!

by Darrell Kingsley last modified Mar 13, 2014 01:03 PM
My all-time favourite error... not. Here's what is usually happening and how to avoid it.

Sometimes a string field contains non-ASCII characters, such as umlauts, which are usually stored as UTF-8 byte sequences outside the 128-character ASCII range.

txt = member.getProperty('fullname')

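These values come back as standard byte strings. When Python then has to convert one to unicode (for example, because it is combined with a unicode string) and the default system codec is ascii, the bytes aren't recognised and you get this error: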

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 33: ordinal not in range(128)
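
The error can be reproduced outside Plone. This is a minimal sketch, assuming a made-up name value in place of the member property. In Python 2 the same decode happens implicitly whenever a byte string is mixed with a unicode string; here it is done explicitly so the snippet also runs on Python 3.

```python
# UTF-8 bytes for a name containing u-umlaut; 0xc3 is the first byte
# of the two-byte sequence that encodes that character.
raw = b'J\xc3\xbcrgen'  # hypothetical stand-in for member.getProperty('fullname')

try:
    raw.decode('ascii')  # ascii only understands bytes 0-127
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The byte value and position in the message tell you which character tripped the codec, which helps when tracking down the offending field.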

First off, try using the 'utf8' decoder:

 

txt = member.getProperty('fullname').decode('utf8')

 

If this is no good, try building a unicode string explicitly instead of keeping a byte string, and tell unicode() to expect UTF-8 (a superset of ASCII):

utxt = unicode(member.getProperty('fullname'),"utf-8")
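
Both approaches fail if the value is already unicode, so it can help to decode only when you actually have bytes. A small sketch of that idea follows; `to_unicode` is a hypothetical helper name, not part of Plone, and the sample values are made up.

```python
def to_unicode(value, encoding='utf-8'):
    """Return value as text, decoding it only if it is a byte string."""
    if isinstance(value, bytes):  # 'bytes' is an alias for 'str' on Python 2
        return value.decode(encoding)
    return value


# Hypothetical stand-in for member.getProperty('fullname')
fullname = to_unicode(b'J\xc3\xbcrgen')

# A value that is already text passes through unchanged
already_text = to_unicode(u'J\xfcrgen')
```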

 

However, when you need a normal Python byte string again (e.g. to write the content to a text/CSV file), you might come across an error telling you that it doesn't like unicode, such as:

AttributeError: 'unicode' object has no attribute 'seek'

 

In which case just convert it back to a normal byte string using the utf-8 encoding.

txt = utxt.encode("utf-8", "ignore")
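
To see what the second argument does, here is a short sketch with a made-up value. The "ignore" handler only matters when the target codec cannot represent a character; UTF-8 can represent practically everything, so it rarely has anything to drop, whereas ascii silently loses the umlaut.

```python
utxt = u'J\xfcrgen'  # hypothetical unicode value containing u-umlaut

as_utf8 = utxt.encode('utf-8')             # keeps the character, as two bytes
as_ascii = utxt.encode('ascii', 'ignore')  # silently drops the character

print(repr(as_utf8))
print(repr(as_ascii))
```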

or, in the case of creating a Plone File:

container.invokeFactory(type_name='File',
                        id=filename,
                        title=filename,
                        file=utxt.encode("utf-8", "ignore"))

Don't forget that when viewing the text file you'll have to select the Unicode (UTF-8) character set, otherwise you'll see squiggly gibberish wherever these characters are.
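
The gibberish comes from reading UTF-8 bytes with a different character set. A rough sketch, assuming the viewer falls back to Latin-1 (that fallback is an assumption; it depends on the program you open the file with):

```python
data = u'J\xfcrgen'.encode('utf-8')  # the bytes that end up in the file

right = data.decode('utf-8')    # read with the correct character set
wrong = data.decode('latin-1')  # each UTF-8 byte shown as its own character

print(len(right))  # 6 characters
print(len(wrong))  # 7 characters: the umlaut became two junk characters
```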

Go away, you pesky UnicodeDecodeErrors.