The Dark Side of Data Analytics

I now work as a data analyst for an industrial company. One part of my job is to investigate production processes and reduce product defects. Unfortunately, analyses with decision trees and logistic regression models very often reveal a correlation where in many cases no causal relationship exists. And even if a causal relationship does exist, I have to consider its direction to avoid confusing cause and effect. All these considerations fall under the term "Business Understanding". If you lack the business understanding for your job, data mining, and analytics more generally, can become misleading very quickly.
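
To make the point about correlation without causality concrete, here is a minimal sketch in Python. It is not taken from my actual production data: the variable names (humid, speed, defect) and all numbers are made-up assumptions. A hidden factor drives both the machine speed and the defect rate, so speed and defects correlate overall, although speed has no effect on defects within either group.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=5000, seed=42):
    """Generate hypothetical production records (all values are assumptions)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Hidden confounder: is it a humid day?
        humid = rng.random() < 0.5
        # Assumption: the machine runs faster on humid days.
        speed = rng.gauss(12.0 if humid else 10.0, 0.5)
        # Assumption: defects depend on humidity only, never on speed.
        defect = 1 if rng.random() < (0.30 if humid else 0.05) else 0
        rows.append((humid, speed, defect))
    return rows


rows = simulate()

# Correlation between speed and defects over all records.
overall = pearson([r[1] for r in rows], [r[2] for r in rows])

# The same correlation, computed separately for humid and dry days.
within = {
    h: pearson(
        [r[1] for r in rows if r[0] == h],
        [r[2] for r in rows if r[0] == h],
    )
    for h in (True, False)
}

print("overall:", round(overall, 3))
print("humid days only:", round(within[True], 3))
print("dry days only:", round(within[False], 3))
```

The overall correlation is clearly positive, while the correlation inside each group stays close to zero. A model that only sees speed and defects would blame the speed. Only someone who knows the process would think of checking the humidity.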

Business Understanding and its spread

Nearly every company that uses analytics and data mining has a strong interest in using the insights properly, because wrongly used information can have bad consequences. This issue is so present in the minds of many managers that they attach great importance to business understanding.

The German novelist Elias Bohst wrote an eBook ("Jagd auf Agent Albatros", in English roughly "Hunting for Agent Albatros") about a spy who wants to escape from his job. In chapter 11 he illustrates the need for proper business understanding by telling a story about a cat.
This cat is called Mauzi (pronounced roughly "Mowtsee" in English). When his owner comes home, he finds Mauzi sitting on the doorstep next to a dead bird. The owner sizes up the situation in an instant and decides that his cat has killed the bird. Of course he is not amused and shouts at poor Mauzi, though the cat is not guilty. Still, this judgment seems to be the most obvious conclusion about what happened. What Mauzi's owner did not see were the little details, such as a greasy spot on the glass pane of the door, left by the bird when it flew into the glass and broke its neck. Only after that did Mauzi arrive and sit down beside the dead bird, just before his owner came home.

This story shows one big problem very well: even the most obvious conclusion is not automatically the right one. This is one of my daily problems. Very often I have to discuss correlations, causalities and nonsense correlations with process owners and the development department. As we have found out, the most obvious conclusion sometimes just fails badly.

About the book

I've read the eBook "Jagd auf Agent Albatros" myself and I like it. The author gave me permission to use parts of it for this post, on the condition that I link to amazon.com.
