Converting character sets in an Oracle database

I recently had to correct text data stored in an Oracle database with a wrong character set. A bug in one of our applications caused text data to be stored in the ISO 8859-1 West European character encoding. The database, however, is configured to work with UTF-8 encoded strings. The bug was fixed, but the existing data had to be corrected afterwards. If you ever encounter the same problem you can use Oracle’s CONVERT() function to fix the data. The syntax is as follows:

CONVERT(text, to_charset, from_charset)

First you have to find out Oracle’s names of the desired source and target character sets. You can look them up in this table. In my case the the character set names are 'UTF8' and 'WE8ISO8859P1' (for Western European 8-bit ISO-8859-1). You can check the function by selecting from DUAL:

SELECT CONVERT('ä ö ü Ä Ö Ü ß', 'UTF8', 'WE8ISO8859P1') FROM DUAL;

If the result shows the same scrambled characters you see in the data you know that you have chosen the correct character set names:

The other direction should unscramble the characters again:

Finally, you can convert your data:

UPDATE example_table
  SET text_column=CONVERT(text_column, 'WE8ISO8859P1', 'UTF8');

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.