I love being married to an IT guy who focuses on “the other stuff”. He’s truly awesome when it comes to system admin stuff, network admin stuff, and anything related to the hardware. When it comes to writing code and some database stuff, though, he relies on me for a little guidance sometimes.
Tonight, he was looking at phpMyAdmin and wondering why he had “100s of options” in there. When I asked him to elaborate, he mentioned “latin… utf…”. Ahh… collations…
The Anatomy of a Collation
A collation is a set of rules for how character data is stored, compared, and sorted. Some of the key parts to note include:
- Code Page – controls how to store the non-Unicode data
- Options – suffixes on the collation name that determine how data is compared and sorted:
  - _CS / _CI: case-sensitive / case-insensitive – determines whether the letters’ case makes a difference in comparing and sorting.
  - _AS / _AI: accent-sensitive / accent-insensitive – determines whether the letters’ accenting makes a difference in comparing and sorting. For example, in an _AI setup, é would be treated the same as e. In an _AS setup, they would be treated as different characters.
  - _KS / (omitted): Kana-sensitive – determines whether the two Japanese kana scripts, Hiragana and Katakana, are treated as different. If _KS is not present, the collation is Kana-insensitive.
  - _WS / (omitted): width-sensitive – determines whether full-width and half-width forms of the same character are treated as different. If _WS is not present, the collation is width-insensitive.
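
To make the case and accent options a little more concrete, here is a rough Python sketch of what "insensitive" means when comparing two strings. This is only an illustration, not how any database engine actually implements its collations. The function names (`strip_accents`, `collation_key`, `equal_under`) and the `case_sensitive` / `accent_sensitive` parameters are made up for this example.

```python
import unicodedata


def strip_accents(text):
    """Remove accent marks by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collation_key(text, case_sensitive=False, accent_sensitive=False):
    """Build a comparison key that mimics the _CS/_CI and _AS/_AI options (hypothetical helper)."""
    # Normalize first so composed and decomposed forms of the same letter match.
    text = unicodedata.normalize("NFC", text)
    if not accent_sensitive:
        text = strip_accents(text)   # _AI: é and e become the same
    if not case_sensitive:
        text = text.casefold()       # _CI: E and e become the same
    return text


def equal_under(a, b, **options):
    """Compare two strings under the chosen sensitivity options."""
    return collation_key(a, **options) == collation_key(b, **options)


# _CI_AI style: case and accents are both ignored
print(equal_under("résumé", "Resume"))
# _CI_AS style: accents now make a difference
print(equal_under("résumé", "Resume", accent_sensitive=True))
# _CS_AI style: case now makes a difference
print(equal_under("résumé", "Resume", case_sensitive=True))
```

The same key could be passed to `sorted(..., key=collation_key)` to see how the options change ordering, though real collations also apply language-specific ordering rules that this sketch ignores.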