Direkt zum Hauptbereich

Lazarus IDE, Oracle 11g and German Umlaute

Working with a database can be so easy - as long as you don't need to care about localization or languages with more or other letters than english. One of these Languages is German. And here the problem starts...

Introduction

Let's look at a simple scenario: Inserting a text with German letters like ä, ö, ü and ß into the Oracle 11g database. And right now there already could be a problem when trying to select it again. So what is the problem here?

Actually the problem itself is very simple: The character encoding. However the solution is not as simple. I tried to connect to Oracle via an ODBC connection from Lazarus IDE for inserting german press releases with a lot of Umlauts (äöü). The result in the database was a mess. So how to solute this mess?

The "problematic chain"

First of all we have to consider that different operating systems have different character sets. Usually current Linux distributions have a Unicode character set in contrast to Windows, using CP-1252 for the graphics and codepage 850 for the command shell. Further, Lazarus Applications use different character sets (dependent of a specific compiler option) and at least the Lazarus Code Editor and Form Designer have an own character encoding (coonfigurable).

The solution

For a correct handling of Umlauts a system variable needs to be defined. I will describe this process for Windows 7 and later operating systems. The path variable for the Oracle Instant Client has the name
NLS_LANG
For German Windows installations it should have the value
GERMAN_GERMANY.WE8PC850

"(...) The NLS_LANG option specifies the current language, territory, and database character set, which are determined by session-wide globalization parameters." (Oracle; see http://docs.oracle.com/cd/B28359_01/olap.111/b28126/dml_options070.htm). So we specified the character set of codepage 850 because the Oracle Instant Client is a command line application and uses the command line's character set. Now you can insert and select Umlauts in Instant Client already correctly.
In a further step we will take a closer look at Lazarus IDE. After the NLS_LANG system variable is defined, a Lazarus application will still cause trouble when inserting o querying the Oracle Database. The reason is, once again, Windows. Since Lazarus uses AnsiStrings as String on Windows systems you need to convert these strings before inserting them into Oracle Insert-Statements. Usually you can Use the command
UTF8ToConsole(String)
to modify the input for the INSERT-Query. For the SELECT-Query you contrarily use
ConsoleToUTF8
again. But now there is still one more Lazarus problem since there is the option to create a Win 32 GUI application.

If this option is unchecked, there will always be a console window available beside the application. I use it for debugging and debugger output. But this inconsiderable and unseful windows has a terrible problem: it will cause the compiler the build the application using the console's (command line's) character set by default. Of if this option is unchecked, you need to use
UTF8ToSys
for inserting into a database and
SysToUTF8
for selecting textual data from the database.

Background: Lazarus

Internally Lazarus is using the Qt-Interface to display widgets of the used operating system. Qt itself is using UTF8-encoded textual data for the widgets. Everything in textual from you get from Windows will be system encoded in CP-1252. The Lazarus code Editor and Form Designer again will user UTF8. So when you design the GUI you can use Umlauts without problems. But when you read file names from system yontaining Umlauts you can come into trouble when displaying it in the GUI. SysToUTF8 can convert this.
Now it will look a little bit clearer. But there still is a strange thing: Why calling UTF8ToConsole instead of UTF8ToSys when you have set the compiler option "Win32 gui application"? This is a Windows specific problem. I guess on Linux you only need UTF8ToSys, but did never test it.

Kommentare

Beliebte Posts aus diesem Blog

Pi And More 11 - QMC5883 Magnetic Field Sensor Class

A little aside from the analytical topics of this blog, I also was occupied with a little ubiquitous computing project. It was about machine learning with a magnetic field sensor, the QMC5883. In the Arduino module GY-271, usually the chip HMC5883 is equipped. Unfortunately, in cheap modules from china, another chip is used: the QMC5883. And, as a matter of course, the software library used for the HMC5883 does not work with the QMC version, because the I2C adress and the usage is a little bit different. Another problem to me was, that I  didn't find any proper working source codes for that little magnetic field device, and so I had to debug a source code I found for Arduino at Github  (thanks to dthain ). Unfortunately it didn't work properly at this time, and to change it for the Raspberry Pi into Python. Below you can find the "driver" module for the GY-271 with the QMC5883 chip. Sorry for the bad documentation, but at least it will work on a Raspberry Pi 3. ...

Lazarus IDE and TOracleConnection - A How-To

Free programming IDEs are a great benefit for everybody who's interested in Programming and for little but ambitious companies. One of these free IDEs is the Lazarus IDE . It's a "clone" of the Delphi IDE by Embarcadero (originally by Borland). But actually Lazarus is much more than a clone: Using the Free Pascal-Compiler , it was platform-independent and cross-compiling since it was started. I am using Lazarus very often - especially for building GUIs easily because Java is still Stone-Age when a GUI is required (though there is a couple of GUI-building tools - they all are much less performant than Delphi / Lazarus). In defiance of all benefits of Lazarus there still is one Problem. Not all Components are designed for use on a 64 bit systems. Considering that 64 bit CPUs are common in ordinary PCs since at least 2008, this is very anpleasant. One of the components which will not be available on 64 bit installations is the TOracleConnection of Lazarus' SQLDB ...

How to use TOracleConnection under Lazarus for Win64

Lazarus Programmers have had no possibility to use TOracleConnection under 64 Bit Windows and Lazarus for years. Even if you tried to use the TOracleConnection with a correctly configured Oracle 11g client, you were not able to connect to the Oracle Database. The error message was always: ORA-12154: TNS:could not resolve the connect identifier specified Today I found a simple workaround to fix this problem. It seems like the OCI.DLL from Oracle Client 11g2 is buggy. All my attempts to find identify the error ended here. I could exclude problems with the TNS systems in Oracle - or the Free Pascal file oracleconnection.pp though the error messages suggestes those problems. After investigating the function calls with Process Monitor (Procmon) I found out, that even the file TNSNAMES.ORA was found and read correctly by the Lazarus Test applictaion. So trouble with files not found or wrong Registry keys could also be eliminated. Finally I installed the Oracle Instant Client 12.1c - aft...