The Proper Way To Use UTF-8 (PHP/MySQL)

This post was written by Chris Latko on July 2, 2009
Posted Under: PHP

After living in Japan for six years and doing web programming for most of that time, you would think I would have this down by now. I have used many combos, from Lasso/FileMaker to PHP/MSSQL and even PHP/PostgreSQL, but never PHP/MySQL for any CJKV work. So I did some Googling and found four pages that claimed to have the answer:

  1. Save each page as UTF-8 with no BOM (byte-order mark). That does help in other languages like ColdFusion, but not for me in PHP. NOPE!
  2. Send a Content-Type header from PHP:
    header('Content-Type: text/html; charset=utf-8');

    and use an HTML meta tag:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

    NOPE!

  3. Use SET NAMES 'utf8'; when instantiating your database object. NOPE!
  4. Change the column character set to utf8 and the collation to utf8_general_ci. NOPE!

I saw that phpMyAdmin was displaying the characters correctly, so how was it doing it? I did a deep dive into the code and wound up at the MySQL DBI connector, where the following statements were issued for EVERY query:

$mysqliObj->query("SET CHARACTER SET 'utf8'");
$mysqliObj->query("SET collation_connection = 'utf8_general_ci'");

This, along with the column set to utf8_general_ci, did the trick. The processing pages were saved as Western (Mac OS Roman), which did not cause any problems inserting or displaying Japanese data.
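
As a rough sketch of how the two statements could be wrapped up with mysqli, something like the following might do. The function names and the connection details in the usage comment are placeholders made up for illustration; this is not phpMyAdmin's code.

```php
<?php
// Sketch only: function names are hypothetical, not from phpMyAdmin.

/** Returns the two session statements described in the post. */
function utf8SessionStatements(): array
{
    return [
        "SET CHARACTER SET 'utf8'",
        "SET collation_connection = 'utf8_general_ci'",
    ];
}

/**
 * Runs each statement on an open mysqli connection.
 * query() returns false on failure unless mysqli is configured
 * to throw exceptions instead.
 */
function applyUtf8Session(mysqli $db): bool
{
    foreach (utf8SessionStatements() as $sql) {
        if ($db->query($sql) === false) {
            return false;
        }
    }
    return true;
}

// Usage (host, user, password, and database name are placeholders):
// $db = new mysqli('localhost', 'user', 'secret', 'mydb');
// applyUtf8Session($db);
```

Keeping the statements in one place means they can be run right after connecting, or before each query as described above, without repeating the strings.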

This post is more for myself so I don’t ever forget how this is done. This can be black magic sometimes so I need some documentation.

Comments

You are my hero. I have been struggling with this for years. I have noticed that phpMyAdmin manages to display the data correctly

#1 
Written By Jonathan Stark on July 3rd, 2009 @ 12:01 am

The collation is not needed btw, that's just for sorting.

#2 
Written By Maarten on July 3rd, 2009 @ 1:09 pm

Good to know. I realized it worked without the collation, but included it for good measure. Thanks.

#3 
Written By Chris Latko on July 3rd, 2009 @ 2:03 pm

Could you please show full PHP code for using
$mysqliObj->query("SET CHARACTER SET 'utf8'");
$mysqliObj->query("SET collation_connection = 'utf8_general_ci'");
and passing data from the browser to MySQL?

I am just learning this stuff and need to use multilingual data on a website with MySQL, but after a couple of days of research I cannot seem to get it working. I could not find any help on mysqliObj-> either.

Thanks, this would be very appreciated.

#4 
Written By Ehab on August 26th, 2009 @ 4:42 am

Apologies for getting back to you so late. I've been on a national tour for the past 16 days.

Here is a code snippet from a DB class I frequently use:

public function read($query) {
    // Set the connection character set and collation before running the query.
    $this->mysqliObj_r->query("SET CHARACTER SET 'utf8'");
    $this->mysqliObj_r->query("SET collation_connection = 'utf8_general_ci'");
    if (!$resultObj = $this->mysqliObj_r->query($query)) {
        // Report the MySQL error together with the SQL that caused it.
        $error = 'Query Error: '.$this->mysqliObj_r->error."\n".'SQL: '.$query;
        echo $error;
    }
    return $resultObj;
}

The mysqliObj_r is an internal construct of my class. For more information, check out PHP’s docs on mysqli.
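
For the browser-to-MySQL part of the question, something along these lines might work. This is only a sketch: the notes table, its body column, and both function names are made up for illustration, and it assumes the two SET statements have already been run on the connection.

```php
<?php
// Sketch only: table `notes`, column `body`, and the function names are hypothetical.

/** True when $text is well-formed UTF-8 (needs the mbstring extension). */
function isValidUtf8(string $text): bool
{
    return mb_check_encoding($text, 'UTF-8');
}

/** Stores one submitted string using a prepared statement. */
function saveNote(mysqli $db, string $body): bool
{
    if (!isValidUtf8($body)) {
        return false; // reject bytes that are not valid UTF-8
    }
    $stmt = $db->prepare('INSERT INTO notes (body) VALUES (?)');
    if ($stmt === false) {
        return false; // the statement could not be prepared
    }
    $stmt->bind_param('s', $body); // 's' binds $body as a string
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

A form handler could then call saveNote($db, $_POST['body'] ?? ''), where body is whatever the form field happens to be named.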

#5 
Written By Chris Latko on September 1st, 2009 @ 7:47 pm

Good to know. I realized it worked without the collation, but included it for good measure. Thanks.

#6 
Written By DR on October 5th, 2009 @ 12:14 pm
