Quantcast
Channel: Active questions tagged r - Stack Overflow
Viewing all articles
Browse latest Browse all 201945

MariaDB ODBC driver for R gives strange results

$
0
0

There is a strange behaviour when importing data from MariaDB into R when using MariaDB ODBC 3.1.3 Driver. The table in DB I want to import is:

+----------------------------+---------+-----+------+
|          analyse           | einheit | MIN | max  |
+----------------------------+---------+-----+------+
| ALAT/GPT                   | U/L     | \N  | 35   |
| ALAT/GPT                   | U/L     | 1   | 5000 |
| Albumin                    | g/L     | 35  | 55   |
| Albumin                    | g/L     | 1   | 200  |
| Albumin im Urin            | mg/L    | 0   | 20   |
| Albumin im Urin            | mg/L    | 0   | \N   |
| Albumin/Kreatinin-Quotient | mg/g    | 0   | 30   |
| Albumin/Kreatinin-Quotient | mg/g    | 0   | 7000 |
| Alkalische Phosphatase     | U/L     | 36  | 92   |
| Alkalische Phosphatase     | U/L     | 1   | 3000 |
+----------------------------+---------+-----+------+

If I this into R using MariaDB ODBC 3.1.3 Driver I get this in R:

+------+------------------------------+----------+------+-------+
|  row |                    analyse   |  einheit |  min |   max |
+------+------------------------------+----------+------+-------+
|    1 |                     ALAT/GPT |      U/L |   NA |    35 |
|    2 |                     ALAT/GPT |      U/L |    0 |     0 |  <-
|    3 |                      Albumin |      g/L |    1 |  5000 |
|    4 |                      Albumin |      g/L |    0 |     0 |  <-
|    5 |              Albumin im Urin |     mg/L |   35 |    55 |
|    6 |              Albumin im Urin |     mg/L |    0 |    NA |
|    7 |   Albumin/Kreatinin-Quotient |     mg/g |    1 |   200 |
|    8 |   Albumin/Kreatinin-Quotient |     mg/g |    0 |     0 |  <-
|    9 |       Alkalische Phosphatase |      U/L |    0 |    20 |
|   10 |      Alkalische Phosphatase  |      U/L |    0 |     0 |  <-
+------+------------------------------+----------+------+-------+

Obviously, in columns min and max couple (0,0) were inserted into rows 2,4,8 and 10 shifting the values of these columns downwards. This happens only with this specific table. If I import other tables from DB to R I got the correct tables.

If I use the MySQL ODBC 8.0 unicode driver it works correctly also with this specific table. Is there a reason for this behaviour and is it possible to avoid it when using this driver?

Environment:

  • Windows 10
  • R version 3.6.1 (2019-07-05))
  • RODBC 1.3-16
  • MariaDB 10.4.8
  • MariaDB ODBC 3.1 Driver

Update 1: New insights using MariaODBC (Version 3.1.6)

The follow script does compare maria-odbc and mysql-odbc.

library("RODBC")

odbcDataSources()

## -- Two connections
##    ma: maria-odbc
##    my: mysql-odbc
conn_ma <- odbcConnect("tabl_curr_add_mdo")
conn_my <- odbcConnect("tabl_curr_add")

odbcGetInfo(conn_ma)
odbcGetInfo(conn_my)

head(sqlTables(conn_ma),2)
## +--------------+---------------------+
## |  TABLE_SCHEM |     TABLE_NAME      |
## +--------------+---------------------+
## |  codes       |  ci5v09_ch3_00      |
## |  codes       |  ci5v09_ch3_01      |
## +--------------+---------------------+

head(sqlTables(conn_my),2)
## +---------------+----------------+
## | TABLE_SCHEM   |   TABLE_NAME   |
## +---------------+----------------+
## | tabl_curr_add |    arzt        |
## | tabl_curr_add |    arzt_ctrl_1 |
## +---------------+----------------+

length(unique(sqlTables(conn_ma)$TABLE_CAT))
## 35 databases
length(unique(sqlTables(conn_my)$TABLE_CAT))
## 1 database

Conclusion 1: maria-odbc does connect to all database of the MariaDB instance.

lim_ma <- sqlQuery(conn_ma, "SELECT analyse, min, max FROM tabl_curr_add.labor_limits")
head(lim_ma)

## +--------------------+------+-------+
## |      nalyse        |  min |  max  |
## +--------------------+------+-------+
## | ALAT/GPT           |   NA |    35 |
## | ALAT/GPT           |    0 |     0 |
## | Albumin            |    1 |  5000 |
## | Albumin            |    0 |     0 |
## | Albumin im Urin    |   35 |    55 |
## | Albumin im Urin    |    0 |    NA |
## +--------------------+------+-------+

lim_my <- sqlQuery(conn_my, "SELECT analyse, min, max FROM tabl_curr_add.labor_limits")
head(lim_my)

## +--------------------+------+-------+
## |      analyse       |  min |   max |
## +--------------------+------+-------+
## | ALAT/GPT           |  NA  |    35 |
## | ALAT/GPT           |   1  |  5000 |
## | Albumin            |  35  |    55 |
## | Albumin            |   1  |   200 |
## | Albumin im Urin    |   0  |    20 |
## | Albumin im Urin    |   0  |    NA |
## +--------------------+------+-------+

Conclusion 2: maria-odbc does corrupt data when imported into R.


Viewing all articles
Browse latest Browse all 201945

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>