Wednesday, October 21, 2009

Character Encoding UTF-8 with JPA/Hibernate, MySql and Tomcat

I'm writing a little Java application using JPA (Hibernate implementation) and Spring. The application will run on Tomcat and uses MySql as the RDBMS.

The problem I had today was with good old character encoding: I was able to store German umlauts (üöä) properly in MySql, but whenever I retrieved them they came back scrambled, regardless of whether I displayed the result on a web page or just printed it to the console.
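To illustrate what "scrambled" typically looks like, here is a minimal sketch of one common way such garbling arises. It assumes that UTF-8 bytes get decoded as ISO-8859-1 somewhere along the way; that is only an assumption for illustration, not something verified for this particular stack. The class and method names are made up.

```java
import java.nio.charset.Charset;

public class MojibakeDemo {

    // Encodes the text as UTF-8 bytes, then (wrongly) decodes those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(Charset.forName("UTF-8"));
        return new String(utf8Bytes, Charset.forName("ISO-8859-1"));
    }

    public static void main(String[] args) {
        // The three umlauts from the example, written as escapes so that
        // the encoding of this source file does not matter.
        String umlauts = "\u00fc\u00f6\u00e4";
        // Each umlaut becomes two characters, because UTF-8 uses two bytes for it.
        System.out.println(misdecode(umlauts));
    }
}
```

If the output of your application looks like this (two odd characters per umlaut), a charset mismatch between two layers is a likely suspect.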

So, the problem is: how to consistently set UTF-8 as the character encoding of choice throughout the whole stack:

  • for MySql as well as for any session coming through the JDBC driver in order to ensure that any entity created by Hibernate/JPA uses UTF-8
  • for Tomcat to make sure that any data served uses UTF-8

I know that you'll find plenty of material across the net on the solution for each individual piece of software in my tech stack. However, I still think it's worth posting this solution, as I did not find all elements of it in one place (and don't want to search again the next time :-)

Here's what I did:

1. Ensure that MySql runs on UTF-8 as default: in the MySql configuration file my.cnf add the following in the section for mysqld:
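A minimal sketch of such a section, assuming the standard server options character-set-server and collation-server. The exact values are an assumption of typical settings, so adjust them to your installation:

```ini
[mysqld]
# Assumed typical settings: make UTF-8 the server-wide default
character-set-server=utf8
collation-server=utf8_general_ci
```

MySql needs a restart for changes in my.cnf to take effect.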


2. Configure your MySql JDBC driver connection as follows (hostname, port and schema will probably differ in your configuration :-):


When configuring the above driver URL in your Spring XML context definition, don't forget to escape the ampersand (& becomes &amp;amp;), otherwise you will get parsing errors.
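As a sketch of what step 2 and the escaping note amount to, the snippet below builds a driver URL and its XML-escaped form. It assumes the Connector/J connection properties useUnicode and characterEncoding; the host, port and schema (localhost, 3306, mydb) are placeholders, and the class and method names are made up for illustration.

```java
public class Utf8JdbcUrl {

    // Builds a MySql JDBC URL that asks the driver to talk UTF-8.
    static String buildUrl(String host, int port, String schema) {
        return "jdbc:mysql://" + host + ":" + port + "/" + schema
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Escapes the ampersand so the URL can be pasted into an XML attribute,
    // e.g. the url property of a data source bean in a Spring context file.
    static String escapeForXml(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String url = buildUrl("localhost", 3306, "mydb"); // placeholder values
        System.out.println("Plain:   " + url);
        System.out.println("For XML: " + escapeForXml(url));
    }
}
```

The plain form is what you would use in Java code or a properties file; only the XML form needs the escaped ampersand.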

3. Configure Tomcat for UTF-8 by adding the following line to your startup script (catalina.bat or its Unix shell counterpart):

JAVA_OPTS="$JAVA_OPTS -Djavax.servlet.request.encoding=UTF-8 -Dfile.encoding=UTF-8"
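To check whether the file.encoding option actually reached the JVM, a small sketch like the following can help. The class and helper names are made up for illustration; run it with the same JAVA_OPTS (or call the same two lookups from inside your web application).

```java
import java.nio.charset.Charset;

public class EncodingCheck {

    // True if the given charset is UTF-8 (compared by canonical name).
    static boolean isUtf8(Charset charset) {
        return "UTF-8".equals(charset.name());
    }

    public static void main(String[] args) {
        // The raw system property, as set via -Dfile.encoding=...
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        // The charset the JVM uses when no explicit charset is given.
        System.out.println("default charset = " + Charset.defaultCharset());
        System.out.println("UTF-8 default?  = " + isUtf8(Charset.defaultCharset()));
    }
}
```

If the last line prints false, the option did not take effect and any conversion that relies on the platform default may still garble the umlauts.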

Versions I am using: MySql 5.0, Tomcat 6.0.20, Spring 2.5.6, Java 6, MySql Connector 5.1.6

Happy hacking!