Collation

Posted on: Wed, 11/29/2006 - 20:12 By: dae

My day job relies heavily on MySQL. I am not fond of it, but it is the easiest choice of RDBMS out there. I use it together with Delphi. Naturally, many problems arise, especially when the Thai language is required.

A good Delphi component for connecting to MySQL is DAC for MySQL. I have tried other connection methods, such as BDE and ADO, but DAC works best in terms of speed.

One of my major problems is with character sets. The default character set for MySQL is latin1, and its default collation treats À, Á, Â, and Ã as the same letter. In the traditional Thai encoding, ภ is stored as the same byte as À, and ม as the same byte as Á. Hence these two Thai characters are considered equal. If I put a unique key on a field, I can't have two rows with 'ภา' and 'มา' in that field, since MySQL will treat the second one as a duplicate key.
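To make the collision concrete, here is a small Python sketch. It assumes the "traditional Thai encoding" is TIS-620, and the helper names as_latin1 and fold are made up for this illustration. The fold function is only a rough stand-in for an accent- and case-insensitive comparison; it does not use MySQL's actual collation tables.

```python
import unicodedata


def as_latin1(thai_text: str) -> str:
    """Encode Thai text as TIS-620 bytes, then read those bytes back as latin1."""
    return thai_text.encode("tis_620").decode("latin-1")


def fold(text: str) -> str:
    """Approximate an accent- and case-insensitive comparison key.

    Assumption: this only mimics the idea behind a collation such as
    latin1_swedish_ci; it is not MySQL's real weight table.
    """
    decomposed = unicodedata.normalize("NFD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold()


for word in ("ภา", "มา"):
    seen_by_server = as_latin1(word)
    print(word, "->", seen_by_server, "->", fold(seen_by_server))
```

The two words turn into different latin1 strings, but once the accents are ignored they end up with the same comparison key, which is why a unique key rejects the second one.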

An easy way to fix the problem is to use a different collation. The default collation is latin1_swedish_ci; simply changing it to latin1_general_cs is enough. I don't know why the default is Swedish, though.

This can be done with the following syntax:

CREATE TABLE asdf (
  blah blah blah  <<---- field description
  blah blah blah  <<---- field description
  blah blah blah  <<---- field description
  blah blah blah  <<---- field description
) DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs;

That's all.