Thursday, May 11, 2006

UTF-8 Encoding fix for MySQL (Tomcat, JSP)

In my previous post, I talked about how to get international characters to display properly on your jsp pages.

This post is going to talk about how to make sure the international characters posted through an html form get saved in and retrieved from the MySQL database with UTF-8 encoding.

You know the case where you submit 'alımlı' in your form, but when you check the value stored in your database table, it becomes 'al?ml?'!
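To make the symptom concrete, here is a minimal Java sketch of one way those question marks can appear. It assumes the text gets squeezed through ISO-8859-1 somewhere between the form and the table; the class and method names are made up for illustration, and the exact spot where the loss happens in your own stack may differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any framework.
class LossyEncodingDemo {

    // Encodes the text to bytes with the given charset and decodes it back.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // 'alımlı' written with escapes so the source file encoding does not matter.
        String word = "al\u0131ml\u0131";

        // ISO-8859-1 has no dotless i, so each one is replaced with '?'.
        System.out.println(roundTrip(word, StandardCharsets.ISO_8859_1)); // prints al?ml?

        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(roundTrip(word, StandardCharsets.UTF_8).equals(word)); // prints true
    }
}
```

Once a character has been replaced with '?', the original is gone for good, which is why every hop between the browser and the database has to agree on UTF-8.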


For a great explanation of what's going on behind the scenes, read the 'CHARSET CONVERSION FROM BROWSER TO DATABASE' section on this page.

The required steps to overcome this problem are as follows:

  • Make sure you do everything explained here.
  • Make sure your database and/or table and/or field is defined with character set UTF-8. Collation plays a role when comparing values, so pick the one that fits your target language or pick the generic one.

  • In {tomcat dir}/conf/server.xml, the connector configuration should have 'URIEncoding=UTF-8'. For example:


    <Connector port="7000" maxHttpHeaderSize="8192"
    maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
    enableLookups="false" redirectPort="8443" acceptCount="100"
    URIEncoding="UTF-8"
    connectionTimeout="20000" disableUploadTimeout="true" />


    This step is required if you will use 'get' as a form submission method. But it doesn't hurt to set it in any case.


  • Your database connection string should follow the format:
    url="jdbc:mysql://localhost:3306/{database name}?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8"

  • THIS IS THE MOST IMPORTANT BIT OF INFO: Make sure to start your mysql server with the '--default-character-set=utf8' parameter. For example, on my system (MacOSX), I start the server with './safe_mysqld --default-character-set=utf8'.
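To see why the URIEncoding step in the list above matters for 'get' submissions, here is a small sketch using only the JDK. It is not what Tomcat does internally, just an illustration of the same percent-encoded value decoded with two different charsets; the class name and sample value are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it only mimics the decoding a container has to do.
class QueryDecodingDemo {

    // Decodes a percent-encoded query value with the named charset.
    static String decode(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // A browser sending 'alımlı' from a UTF-8 page puts these bytes in the URL:
        // the dotless i becomes the two bytes C4 B1.
        String raw = "al%C4%B1ml%C4%B1";

        // Decoded as UTF-8, the two bytes turn back into one character.
        String asUtf8 = decode(raw, "UTF-8");

        // Decoded as ISO-8859-1, each byte becomes its own character (mojibake).
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println(asUtf8.length());   // prints 6
        System.out.println(asLatin1.length()); // prints 8
    }
}
```

The same bytes give six characters in one case and eight in the other, so the connector has to be told which charset the browser used.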



And that's it! If you are still having problems, send me an email and I will try to assist you further.

Friday, May 05, 2006

UTF-8 Encoding fix (Tomcat, JSP, etc)

I spent a whole day trying to get non-ASCII characters to display properly in my JSP pages.
To save anyone from spending similarly frustrating hours, here's the solution for getting those characters to display in your JSP page.

First please read this so that you understand what the concept of encoding is.

While trying to solve my problem I collected a couple of links; you can browse them here.

So my setup is as follows:
  • Tomcat Application Server (5.5.17)
  • Stripes web framework.
  • Front-end implementation JSP (using Stripes' layout functionality).
  • OS: MacOSX
Two main problems:
  1. Getting non-ASCII characters (e.g. ç, ğ, ö, ş, ı) to display in the jsp file when they are typed directly inside the jsp.
  2. Getting these characters to display when they are read from an application resources file (for example StripesResources.properties for Stripes).

Ok let's begin...
First make sure you save all your files (jsps, application resources files) in UTF-8 encoding. In Dreamweaver for example, Ctrl-J (or Apple-J) will bring up the window to set that.
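If you are not sure whether a file really was saved as UTF-8, one way to check is to decode its bytes strictly and see if that fails. The sketch below is only an illustration with made-up names, and it assumes a reasonably modern JDK (java.nio.charset.StandardCharsets); for a real file you would pass in the bytes you read from disk.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for checking file contents.
class Utf8Check {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting characters.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u00e7anak"; // 'çanak'

        // Saved as UTF-8: passes the check.
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8))); // prints true

        // Saved as ISO-8859-1: the lone E7 byte is not valid UTF-8.
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.ISO_8859_1))); // prints false
    }
}
```

Note that a pure ASCII file passes this check whatever your editor says it was saved as, since ASCII is a subset of UTF-8.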

Solution to problem 1:

I may have overkilled here, but this setup works, so you may adopt the IIWDQ ('if it works don't question') approach.

  • Place '<%@ page language="java" pageEncoding="utf-8" contentType="text/html;charset=utf-8"%>' as the first line in 'ALL' the jsps.

    If you are using a layout manager, similar to Stripes layout, you may think 'hey I'll just put it in the layout page that way it will work for all my pages'..THINK AGAIN. IT WON'T.

    You may also say 'hey wait I have a great idea, I have this include.jsp where I declare all the taglibs, I'll place this directive in that file...All my jsps include that file, so it will work'. To that I'll say NOPE.


  • Place <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> under <head>. This is to give browsers an idea about the content of the page so they can display the contents properly.
  • Write an Encoding filter and make sure all your requests pass through it. Not difficult at all. Here it is:




import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

/**
 * Sets the character encoding of every request that passes through it.
 */
public class EncodingFilter implements Filter {
    private String encoding;
    private FilterConfig filterConfig;

    /**
     * Reads the "encoding" init parameter declared in web.xml.
     *
     * @see javax.servlet.Filter#init(javax.servlet.FilterConfig)
     */
    public void init(FilterConfig fc) throws ServletException {
        this.filterConfig = fc;
        this.encoding = filterConfig.getInitParameter("encoding");
    }

    /**
     * @see javax.servlet.Filter#doFilter(javax.servlet.ServletRequest, javax.servlet.ServletResponse, javax.servlet.FilterChain)
     */
    public void doFilter(ServletRequest req, ServletResponse resp,
            FilterChain chain) throws IOException, ServletException {
        // Has to be called before any request parameter is read.
        req.setCharacterEncoding(encoding);
        chain.doFilter(req, resp);
    }

    /**
     * @see javax.servlet.Filter#destroy()
     */
    public void destroy() {
    }

}


The way you let your web application know about this filter is via the web.xml file:

<filter>
    <filter-name>EncodingFilter</filter-name>
    <filter-class>com.yourpackagestructurehere.EncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>

<filter-mapping>
    <filter-name>EncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>




At this stage, if you type something along the lines of 'çanak çömlek patladı' in your jsp and run the web application, you should see it in your browser...

Are we done? Not yet. If you have something like <fmt:message key="username"/> in your jsp and your resource properties file contains username=Kullanıcı Adı, you will end up with something like 'Kullan?c? Ad?'... For that see:

Solution to problem 2:
I know you saved your ApplicationResources.properties (or StripesResources.properties, or xxx.properties) file in UTF-8. That should display fine, right? Well, wrong. It does not, because Java reads .properties files as ISO-8859-1 and expects any character outside that set to be written as a Unicode escape. But it will if you:
  1. Copy your ApplicationResources.properties file to something like ApplicationResources.properties.org.
  2. Run 'native2ascii -encoding UTF-8 ApplicationResources.properties.org ApplicationResources.properties'
  3. Deploy your files...
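If you are curious what the conversion in the steps above actually does, here is a rough sketch in plain Java. It is not the native2ascii tool itself, only an approximation of the idea with made-up names: every non-ASCII character is replaced by its backslash-u escape, and java.util.Properties turns the escapes back into real characters when it loads the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, only an approximation of what native2ascii does.
class PropertiesEscapeDemo {

    // Replaces every non-ASCII character with a backslash-u escape.
    static String toAsciiEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Loads the given file bytes the way Properties.load(InputStream) does,
    // that is, as ISO-8859-1 with escapes expanded, and returns one value.
    static String loadValue(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String label = "Kullan\u0131c\u0131 Ad\u0131"; // 'Kullanıcı Adı'
        String line = "username=" + toAsciiEscapes(label);

        // The escaped line is pure ASCII, so any single-byte charset will do here.
        String loaded = loadValue(line.getBytes(StandardCharsets.ISO_8859_1), "username");
        System.out.println(loaded.equals(label)); // prints true

        // The raw UTF-8 file, loaded the same way, comes out garbled.
        String garbled = loadValue(("username=" + label).getBytes(StandardCharsets.UTF_8), "username");
        System.out.println(garbled.equals(label)); // prints false
    }
}
```

The escaped file is ugly to read, which is why keeping the UTF-8 original around (the .org copy in step 1) is a good idea.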


And ta-da! (At least for me it was 'ta-da' at this stage)...

Special thanks to cleverpig, mj and Rick Smith from Stripes mailing list for their help on this subject.

Ha by the way, if you are not using Stripes yet, it's time you start using it.

Friday, April 14, 2006

Nice Java Web Framework

After long technology evaluation sessions, I decided to use Stripes as the web framework for my next project.

I went through so many iterations along the way: JSF, Wicket, Rife and even Ruby on Rails. But in the end Stripes won my heart for its simplicity and easy Spring integration. I must also mention that the lack of XML configuration files was a big point for Stripes as well.

At times I find myself thinking: 'why am I killing myself to learn yet another framework? Why not use 'xxx framework' all over again?'. I don't have a very good answer to these questions. I guess most of it is curiosity, some of it is the need to feel like I'm not falling behind on new trends, and some of it is just that doing a new project with a framework that I already know how to use is plain boring.

Hmmm...So doesn't that last point go against productivity? 'Perform your task with the tools you know how to use best. That way you'll produce good solutions in shorter time'. I guess it does a bit. But who cares? System development must be fun. And I'm having most fun when I'm learning new things.

Having said all of that, I think I would have killed Tomcat if it was a man rather than an application server. The damn thing refuses to accept that I want my pages processed with UTF-8!!! Listen to me: UTF-8 I said!

Back to that now...

Friday, March 10, 2006

DRY or WET

ServerSide is a great resource to get news on the latest trends in the Java world. It also has a .NET version (I'm not going to even provide the link to that) that covers the 'other' side of the moon.

In a recent posting about the Wicket framework, I got introduced to the new WET principle.

In response to a 'Does Wicket violate the DRY principle?' question, someone jokingly suggested the WET principle. For those of you who are not familiar with these two principles, DRY stands for 'Don't Repeat Yourself'; the newly suggested WET stands for 'Write Everything Twice'.

Seriously though, what in the world is going on with all these Java Web frameworks? It seems to me the quest for DRY is causing the creation of WAF ('Write Another Framework'). I recently counted fifty-five of them. FIFTY-FIVE! I would tend to think that it would be a bit confusing for newcomers to the Java world. 'Let me see. Should I write my new web application in Java, using one of the 55 frameworks available, thus spending 7 weeks investigating which one to use... Or wait... RUBY?'

I don't know whether the aim is to stay DRY or go WET, what I'm sure of is NAF!

Wednesday, March 08, 2006

What makes a good software system?

Today I read a very good article partly about good programming practices. The article's content is broader than that, but the programming practices are what I got out of it.

In my opinion, what makes a good system is the coherence of small components that only know how to perform their own task without knowing much, preferably nothing, about the big system. Similar to the famous saying attributed to Einstein, 'everything should be made as simple as possible, but not one bit simpler', maybe we can postulate 'components should be programmed to be as dumb as possible, but not one bit dumber'.

Another great concept to read about is John Conway's Game of Life. How amazing it is to see complex behaviour arising from a few simple rules.
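For the curious, the 'few simple rules' fit in a handful of lines. Here is a minimal sketch of one generation of the Game of Life in Java. The class name is made up, and the grid is bounded, with everything outside it counted as dead, which is a simplification of my own (Conway's original plays on an infinite grid).

```java
// Hypothetical demo class; bounded grid, cells outside it count as dead.
class Life {

    // Computes the next generation from the current one.
    static boolean[][] step(boolean[][] grid) {
        int rows = grid.length;
        int cols = grid[0].length;
        boolean[][] next = new boolean[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int neighbours = 0;
                for (int dr = -1; dr <= 1; dr++) {
                    for (int dc = -1; dc <= 1; dc++) {
                        if (dr == 0 && dc == 0) {
                            continue; // a cell is not its own neighbour
                        }
                        int rr = r + dr;
                        int cc = c + dc;
                        if (rr >= 0 && rr < rows && cc >= 0 && cc < cols && grid[rr][cc]) {
                            neighbours++;
                        }
                    }
                }
                // A live cell survives with 2 or 3 neighbours;
                // a dead cell comes alive with exactly 3.
                next[r][c] = grid[r][c] ? (neighbours == 2 || neighbours == 3)
                                        : (neighbours == 3);
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // A 'blinker': three live cells in a row flip between horizontal and vertical.
        boolean[][] grid = new boolean[5][5];
        grid[2][1] = grid[2][2] = grid[2][3] = true;

        boolean[][] next = step(grid);
        System.out.println(next[1][2] && next[2][2] && next[3][2]); // prints true
    }
}
```

Each component here (a cell) knows nothing beyond its eight neighbours, which is rather the point of the 'as dumb as possible' idea above.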

Along the same lines, you may have also heard of Stephen Wolfram's book A New Kind of Science.

Monday, March 06, 2006

Expertise

I read a perfect article about 'how to be an expert' today. And the most notable section of it was this quote from Dr. K. Anders Ericsson:

"For the superior performer the goal isn't just repeating the same thing again and again but achieving higher levels of control over every aspect of their performance. That's why they don't find practice boring. Each practice session they are working on doing something better than they did the last time."

I think what separates good coders from great ones is this little point: Great coders are never happy with what they know and the way they know how to do things.

Thursday, March 02, 2006

Spore

I just watched this video presentation of the video game Spore by Will Wright.

I was quite impressed by it. You can get more information about Spore at their website.

Monday, January 02, 2006

First One

This is my first blog entry. A close friend of mine has been suggesting that I maintain a blog. So here it goes...

I am a software engineer. I use the web mostly to read about emerging technologies, frameworks, and to search for answers to help my debugging sessions.

One of the sites I recently became addicted to is Reddit. I am addicted to it not simply because I am curious about what other people are reading, but because I always find an interesting list of articles there. I also think it's a geek-oriented list. So it suits me fine.

Lately I'm amazed by the Web 2.0 (or should I say WTF 2.0) hype. I see a lot of innovative solutions out there, but they are mostly showcasing the extent of Ajax technology. I think the real winner of this new movement is JavaScript. Great libraries (prototype, script.aculo.us, etc) have emerged to help this bad boy of programming languages get a facelift. Why do I call JavaScript a bad boy? Well, the answer is simple really. Ask any programmer friend you have what they think of JavaScript, and watch how many grimaces they make before answering. JavaScript is, or was, hell to write code in.

Other boosters of JavaScript among coders have been Konfabulator and of course Mac OSX widgets.

Which language do I write most of my code in? Well, I use Java...But lately I started getting very annoyed with some of the things that have been going on in the Java world. But that's the subject of another blog.