I'm using dbplyr to get data from SQL Server into R, but Chinese, Japanese and other non-Latin characters are appearing as "?". I'm using a Windows machine.
I've read through the following threads:
These provide some useful ideas, but nothing has worked so far. I have tried:
- Setting encoding = 'UTF-8' within the dbConnect function. Characters still show as question marks.
- Setting encoding = 'UTF-16' within the dbConnect function. R returns an error:
  # Error in iconv(x[current], from = enc, to = to, ...)
- Changing the global character encoding to UTF-8 with Sys.setenv(LANG = "UTF-8") and options(encoding = "UTF-8").
- Checking if the characters display when plotting (which would indicate that they are being stored correctly). This wasn't the case.
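A more direct check than plotting would be to look at the raw bytes of a returned value. The following is only a minimal sketch using base R: `x` is a hypothetical stand-in for a single string pulled from the query result (hard-coded here, not real output from my database).

```r
# Hypothetical value standing in for one cell returned by the query.
x <- "???"

# Encoding R has marked the string with.
Encoding(x)

# Raw bytes of the string. 3f is a literal ASCII question mark, which would
# mean the original characters were replaced before the data reached R,
# rather than being stored correctly and merely displayed incorrectly.
bytes <- charToRaw(x)
print(bytes)

only_question_marks <- all(bytes == as.raw(0x3f))
print(only_question_marks)
```

If the bytes are all 3f, changing display or locale settings in R would not be expected to bring the characters back, since the information is already lost at that point.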
I was able to get the characters to display correctly by using RJDBC; however, this is not compatible with dbplyr, according to this GitHub issue.
Here is my session info:
> sessionInfo()
# R version 3.5.0 (2018-04-23)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows >= 8 x64 (build 9200)
# Matrix products: default
# locale:
# [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
# [5] LC_TIME=English_United Kingdom.1252
My code looks like this:
library(DBI)
library(odbc)

# Connect through the Windows "SQL Server" ODBC driver
con <- dbConnect(odbc(),
                 Driver = "SQL Server",
                 Server = "server name",
                 Database = "database name",
                 user = "my username",
                 password = "my password",
                 encoding = "UTF-8")
Surely odbc/dbplyr can handle these character types on Windows, so what am I missing here?
Any help would be much appreciated!