Tuesday, March 13, 2012

Incorrect string value: ‘\xCE\x94F508…’ for column ‘value’ at row 1

Problem

Last week I was working on importing data from a vendors database into our database and for, what seemed like, random records I’d receive the error message above. Upon further inspection it turned out that the records that weren’t able to be inserted had special characters, like 'Diagnostic, ΔF508 and R117H with 5T/9T'. I knew I had set our MySQL database to use UTF-8 for the collation of the server by default, I’d even posted about it here.

I was able to resolve this, but first just a little background.

Solution

After researching this and even exporting and importing my database, I found that my tables were being created specifically using the latin1 character set. I was able to determine this when I did MySQL dump of my database and examined the CREATE TABLE syntax, as shown below:

DROP TABLE IF EXISTS `addresstypes`; 
CREATE TABLE `addresstypes` 
( `id` int(11) NOT NULL AUTO_INCREMENT, 
`text` varchar(255) DEFAULT '', 
`createdat` datetime DEFAULT NULL, 
`updatedat` datetime DEFAULT NULL, 
`deletedat` datetime DEFAULT NULL, 
`createdby` varchar(25) DEFAULT NULL, 
`updatedby` varchar(25) DEFAULT NULL, 
`deletedby` varchar(25) DEFAULT NULL, 
`rixid` varchar(255) DEFAULT NULL, 
PRIMARY KEY (`id`) 
) ENGINE=MyISAM AUTO_INCREMENT=6 DEFAULT CHARSET=latin1;

I found three ways to correct this. Here are the first two ways as I found at this blog post.

Single Table

Use the following SQL code block and replace the “database_name” and “table_name” values with your actual database name and table name:

ALTER TABLE `database_name`.`table_name` 
CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

Multiple Tables

Use the following SQL code block and replace the “database_name” with your actual database name:

SELECT CONCAT('ALTER TABLE `', t.`TABLE_SCHEMA`, '`.`', t.`TABLE_NAME`, '` 
CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;') as stmt 
FROM `information_schema`.`TABLES` t 
WHERE 1 AND t.`TABLE_SCHEMA` = 'database_name' ORDER BY 1

I also found that if I performed a MySQL dump of the database, then replaced the latin1 character set command with utf8, deleted the database and then re-imported it this resolved the issue as well.

Use the following code block and replace the “database_name” with your actual database name. You’ll also need to DROP or DELETE the database after you do the MySQL dump in the first step:

mysqldump -u root -p database_name > database_name.sql 
sed 's/utf8_bin/utf8_unicode_ci/' database_name.sql > database_name_fix.sql 
sed 's/CHARSET=latin1;/CHARSET=utf8;/' database_name_fix.sql > database_name.sql 
mysql -u root -p database_name < database_name.sql

Interesting Note

Interestingly if I copied and pasted the SQL INSERT syntax into the MySQL command-line, before I changed the tables to be UTF-8, the command was successful at inserting the data.

Update: MySQL 5.5.3+ Server Options

These options have changed in MySQL 5.5.3 and newer, in particular the "default-collation" and "default-character-set" server options have been deprecated. Instead, the following server options replace these "collation-server" and "character-set-server"

No comments: