Rapid Database Application Development
	Development	"Microsoft Access for Linux"

Home Download FAQ Support Features Handbook Screenshots Screencasts Compile Kexi Development Authors Contact License Sponsorship Translate This Site

wiki navigation:

Front Page
[info] [diff] [login]
[recent changes]
[most popular]

You can donate
to Kexi Project:
Via PayPal

Spread the word about Kexi!

Built on the KDE technology
KDE

Member of the Calligra Suite

KexiDBDesign:

Database Collation Issues

 started by: jstaniek
 status of ideas mentioned here: planed for 1.1

Table of Contents
   Introduction
     Types
     Affected SQL Operators
     Implications For Data Integrity
     Problem with Non-latin1 Character Sets
   Collation in SQLite Backend
   Collation in MySQL Backend
   Collation in PostgreSQL Backend
   Implications For Query Designer
   Competition

Introduction

^ toc

Types

Collation type is a rule used when comparing (or sorting) text strings in a database. Lets assume we have three database records in a name column of a table:

 Joe
 Adam
 joan

NOCASE collation means that a comparison like name > "adam" will return:

 Joe
 Adam
 joan

In other words, case-insensitive comparison is performed.

BINARY collation means that a comparison like name > "adam" will return:

 joan

Case-sensitive comparison is performed here, so neither "Joe" nor "Adam" are greater than "adam".

^ toc

Affected SQL Operators

Following SQL comparison operators are affected:

 < > <= >= <> != = == LIKE.

^ toc

Implications For Data Integrity

Example for MySQL:

 create table dictionary (
  word varchar(255) COLLATE latin2_bin NOT NULL,
 );
 ALTER TABLE dictionary ADD UNIQUE INDEX (word);

Having BINARY collation latin2_bin defined now it's possible to insert

 Bag
 bag

while, with NOCASE collation, the unique index would not allow this.

^ toc

Problem with Non-latin1 Character Sets

Daabase backend needs to offer built-in support for Non-latin1 character collations. Kexi could not implement a workaround at the client side if a query has to be executed at the server side.

^ toc

Collation in SQLite Backend

SQLite supports BINARY collation by default.

Non-latin1 Character Sets: we can patch SQLite to add such a support. See ser-defined Collation Sequences.

^ toc

Collation in MySQL Backend

'By default, MySQL searches are not case sensitive.

See http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html See http://dev.mysql.com/doc/refman/5.0/en/charset-mysql.html

To force BINARY collation, we will add COLLATE clause, e.g.:

 select * from table where name>='abc' COLLATE latin1_bin

Notes: latin1_bin collation name always works. Collation and character sets can be properly selected on db creation or connecting only for MySQL >= 4.1. See also Table Character Set and Collation (mysql manual). So for newer MySQL versions use specific collations, not just latin1_bin.

Good default for collation/character set seem to be: utf8_general_ci/utf8.

TODO

^ toc

Collation in PostgreSQL Backend

TODO

^ toc

Implications For Query Designer

Good default is a #1 priority here. In most cases users expect NOCASE collation to be the default, so we're:

adding NOCASE flag to appropriate KexiDB::BinaryExpr objects when SQL statement is generated/anayzed
supporting BINARY keyword in the Design View of the Query Designer

^ toc

Competition

MS Access does not support case-sensitive queries.

Kexi - "MS Access for Linux" ... and Windows

This content is available under GFDL

Last edited: January 11, 2006 by js, visited 0 times.