Problem
Last week I was working on importing data from a vendors database into our database and for, what seemed like, random records I’d receive the error message above. Upon further inspection it turned out that the records that weren’t able to be inserted had special characters, like 'Diagnostic, ΔF508 and R117H with 5T/9T'
. I knew I had set our MySQL database to use UTF-8 for the collation of the server by default, I’d even posted about it here.
I was able to resolve this, but first just a little background.
- The vendors database is an Oracle 10g system that uses UTF-8 as the character set.
- They were providing me data dumps of each table as Microsoft Excel files. I would open these using Numbers, then export them as CSV and specify using UTF-8 as the character set.
- The MySQL database I was working with had been set to use UTF-8 for it’s default collation
-
I was uploading the CSV files using ColdFusion Markup Language (CFML) running on the Railo application server and the ColdFusion on Wheels framework. I had set the character set in Railo to use UTF-8
- Railo Web Administrator > Datasource > mydsn > “Use Unicode” to true
- Railo Web Administrator > Datasource > mydsn > “Charset” = UTF-8
Solution
After researching this and even exporting and importing my database, I found that my tables were being created specifically using the latin1 character set. I was able to determine this when I did MySQL dump of my database and examined the CREATE TABLE syntax, as shown below:
DROP TABLE IF EXISTS `addresstypes`; CREATE TABLE `addresstypes` ( `id` int(11) NOT NULL AUTO_INCREMENT, `text` varchar(255) DEFAULT '', `createdat` datetime DEFAULT NULL, `updatedat` datetime DEFAULT NULL, `deletedat` datetime DEFAULT NULL, `createdby` varchar(25) DEFAULT NULL, `updatedby` varchar(25) DEFAULT NULL, `deletedby` varchar(25) DEFAULT NULL, `rixid` varchar(255) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM AUTO_INCREMENT=6 DEFAULT CHARSET=latin1;
I found three ways to correct this. Here are the first two ways as I found at this blog post.
Single Table
Use the following SQL code block and replace the “database_name” and “table_name” values with your actual database name and table name:
ALTER TABLE `database_name`.`table_name` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Multiple Tables
Use the following SQL code block and replace the “database_name” with your actual database name:
SELECT CONCAT('ALTER TABLE `', t.`TABLE_SCHEMA`, '`.`', t.`TABLE_NAME`, '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;') as stmt FROM `information_schema`.`TABLES` t WHERE 1 AND t.`TABLE_SCHEMA` = 'database_name' ORDER BY 1
I also found that if I performed a MySQL dump of the database, then replaced the latin1
character set command with utf8
, deleted the database and then re-imported it this resolved the issue as well.
Use the following code block and replace the “database_name” with your actual database name. You’ll also need to DROP or DELETE the database after you do the MySQL dump in the first step:
mysqldump -u root -p database_name > database_name.sql sed 's/utf8_bin/utf8_unicode_ci/' database_name.sql > database_name_fix.sql sed 's/CHARSET=latin1;/CHARSET=utf8;/' database_name_fix.sql > database_name.sql mysql -u root -p database_name < database_name.sql
Interesting Note
Interestingly if I copied and pasted the SQL INSERT syntax into the MySQL command-line, before I changed the tables to be UTF-8, the command was successful at inserting the data.
Update: MySQL 5.5.3+ Server Options
These options have changed in MySQL 5.5.3 and newer, in particular the "default-collation" and "default-character-set" server options have been deprecated. Instead, the following server options replace these "collation-server" and "character-set-server"
No comments:
Post a Comment