I am using https://ipstack.com to geocode IP addresses, and I am having a hard time geocoding all 1,200 addresses in a short amount of time.
With R, I've collected the URLs into a list (e.g. http://api.ipstack.com/[IP address]?access_key=[access key]) and can use read_json to read the JSON data of each URL. But I've not been able to develop a loop to extract the data from each URL.
library(RCurl)
library(jsonlite)
x <- c("http://api.ipstack.com/178.140.119.217?access_key=[access_key]", "http://api.ipstack.com/68.37.21.125?access_key=[access_key]", "http://api.ipstack.com/68.10.255.89?access_key=[access_key]")
read_json(x)
Error in file(path) : invalid 'description' argument
I'm looking for a solution that will be able to read multiple IP addresses and then attach the information to a dataframe.
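A minimal sketch of the vectorisation step: read_json accepts one path or URL per call, so one option is to call it once per element with lapply. The two temporary JSON files below are hand-written stand-ins for the API URLs (an assumption, so the example runs without network access or an access key), and the field names are only illustrative.

```r
library(jsonlite)

# Write two small JSON files as local stand-ins for the API URLs.
paths <- c(tempfile(fileext = ".json"), tempfile(fileext = ".json"))
writeLines('{"ip": "192.0.2.1", "latitude": 1.5, "longitude": 2.5}', paths[1])
writeLines('{"ip": "192.0.2.2", "latitude": 3.5, "longitude": 4.5}', paths[2])

# Call read_json() once per element instead of once on the whole vector.
parsed <- lapply(paths, read_json)

# Each element is a named list; convert each to a one-row data frame
# and stack the rows.
coords <- do.call(rbind, lapply(parsed, as.data.frame))
print(coords)
```

With real URLs, the same lapply call would presumably work on the character vector x, as long as every response converts cleanly to a single row.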
*Edit 1: Still stuck, but I'm making some progress with the loop:
library(RCurl)
library(jsonlite)
url_lst = as.character(df$URL)
output = NULL
for (i in url_lst) {
  x = as.data.frame(read_json(i))
  output = rbind(output, x)
}
However, this results in an error:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0
Also, the code produces only 8 observations rather than 1,200.
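A possible explanation, not confirmed by the output above: an error inside a for loop aborts the whole loop, so only the rows collected before the first failing response remain in output. The sketch below wraps each iteration in tryCatch so the loop continues and records which inputs failed. fetch_one is a hypothetical stand-in for as.data.frame(read_json(i)), and the inputs are made up so the example runs offline.

```r
# Hypothetical stand-in for as.data.frame(read_json(url)):
# it fails for one input, the way a response that cannot be converted might.
fetch_one <- function(url) {
  if (grepl("bad", url, fixed = TRUE)) {
    stop("could not convert response for ", url)
  }
  data.frame(url = url, latitude = 1.5, longitude = 2.5)
}

urls <- c("u1", "u2", "bad3", "u4")

results <- list()
failed <- character(0)

for (u in urls) {
  # Return NULL instead of stopping when one input fails.
  row <- tryCatch(fetch_one(u), error = function(e) NULL)
  if (is.null(row)) {
    failed <- c(failed, u)  # remember the failure and carry on
  } else {
    results[[length(results) + 1]] <- row
  }
}

# Bind once at the end instead of growing a data frame inside the loop.
output <- do.call(rbind, results)
print(nrow(output))
print(failed)
```

Inspecting the entries in failed afterwards should show which responses differ from the rest.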
*Edit 2: Bill Ash's answer got me further than I was, but it looks like some values in the JSON data are causing the code to fail.
Bill Ash's code:
library(httr)
library(tibble)
library(purrr)
library(jsonlite)
ip_addresses <- core_members$ip_address
# a simple function
ip_locate <- function(your_vector_of_ip_addresses, access_key) {
  ip <- your_vector_of_ip_addresses
  map_df(ip, ~{
    out <- httr::GET(url = paste0("http://api.ipstack.com/", .,
                                  "?access_key=", access_key))
    resp <- fromJSON(httr::content(out, "text"), flatten = TRUE)
    tibble::tibble(ip = resp$ip,
                   country = resp$country_name,
                   region = resp$region_name,
                   city = resp$city,
                   zip = resp$zip,
                   lat = resp$latitude,
                   lng = resp$longitude)
  })
}
ip_info <- ip_locate(your_vector_of_ip_addresses = ip_addresses,
access_key = "[access_key]")
# output
ip_info %>%
head()
This is where the error occurs:
ip_info <- ip_locate(your_vector_of_ip_addresses = ip_addresses,
access_key = "[access_key]")
Error: All columns in a tibble must be 1d or 2d objects:
* Column `zip` is NULL
9. stop(cnd)
8. abort(error_column_must_be_vector(names_x[is_xd], classes))
7. check_valid_cols(x)
6. lst_to_tibble(xlq$output, .rows, .name_repair, lengths = xlq$lengths)
5. tibble::tibble(ip = resp$ip, country = resp$country_name, region = resp$region_name,
   city = resp$city, zip = resp$zip, lat = resp$latitude, lng = resp$longitude)
4. .f(.x[[i]], ...)
3. map(.x, .f, ...)
2. map_df(ip, ~{
   out <- httr::GET(url = paste0("http://api.ipstack.com/",
   ., "?access_key=", access_key))
   resp <- fromJSON(httr::content(out, "text"), flatten = TRUE) ...
1. ip_locate(your_vector_of_ip_addresses = ip_addresses, access_key = "[access_key]")
Because I only need the coordinates from these IP addresses, I believe this has been resolved. Hopefully someone will be willing to continue advising on this issue, but I won't be updating this post any further.
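For anyone who does need the other fields: the "Column `zip` is NULL" message suggests that resp$zip was NULL for at least one address, presumably because the API returned a JSON null there. A minimal sketch of one workaround is to replace NULL with NA before building each row. null_to_na and parse_ip_response are hypothetical helper names, and the responses list is a hand-written stand-in for parsed API output, so this runs without an access key.

```r
# Hypothetical helper: turn a missing (NULL) field into NA so that it
# still occupies one cell in the resulting row.
null_to_na <- function(x) if (is.null(x)) NA else x

# Hypothetical parser for one already-decoded response (a named list).
parse_ip_response <- function(resp) {
  data.frame(
    ip  = null_to_na(resp$ip),
    zip = as.character(null_to_na(resp$zip)),
    lat = as.numeric(null_to_na(resp$latitude)),
    lng = as.numeric(null_to_na(resp$longitude)),
    stringsAsFactors = FALSE
  )
}

# Hand-written stand-ins for parsed responses; the second has no zip.
responses <- list(
  list(ip = "192.0.2.1", zip = "12345", latitude = 10.5, longitude = 20.25),
  list(ip = "192.0.2.2", zip = NULL, latitude = 30, longitude = 40)
)

ip_info <- do.call(rbind, lapply(responses, parse_ip_response))
print(ip_info)
```

The same null_to_na wrapper could be applied to each field inside the tibble::tibble() call in ip_locate, though that has not been tested against the live API here.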