Create an interactive map of the live music venues in your city with the Songkick API
A tutorial - including replicable code - for using R and the Songkick API to discover more about the live music venues in your city
 
    
I am currently working on a research project that is looking at live music in my home city of Birmingham, UK. As part of that work I’ve been exploring the API of Songkick in an attempt to generate an initial map of the music venues in the city.
Songkick is a service that provides discovery and ticket sales for live music events worldwide. Through their website and mobile app users can track touring artists, receive alerts for concerts in their area, and purchase tickets to shows. Their API provides access to data for over 6 million concerts. My aim with exploring their API was to see what information could be gathered that might help us begin to understand the landscape of live music in Birmingham.
Over the last week - and following quite a bit of trial and error! - I have managed to create a workflow using R that pulls data from that API and creates an interactive map of music venues. Before starting that process, I had looked around online to see if anyone else had tried something similar (and - I hoped - had then been inclined to create a walkthrough tutorial). Since I was unable to find much at all around Songkick and R, I have created a walkthrough tutorial of my own.
By following the steps outlined below you should be able to make a similar map for a city of your choice - either your own home city, or a city you are intending to visit soon. This post will help you create a map plotting all venues for a given city present in the Songkick API. The map will also provide pop-up information for each venue, including venue capacity, website address, and so on.
All the code required to create your map is provided below, along with explanations of what the various elements do. The code was working correctly as of 06/02/20, but please do let me know if you encounter errors. I can be contacted via email at craig.hamilton@bcu.ac.uk.
The workflow moves through 6 phases, each of which opens with a brief description of the process involved. At the end of the post I offer some thoughts on the process and about the data I have gathered, along with some suggestions about where I would like to take this process.
1. Getting started
This section includes links to the latest versions of R and R Studio, details on how to obtain your API key from Songkick, and a few basic lines of code that will help you when running the script.
Install R and R Studio, and obtain an API key
Before attempting to run the code contained within this post you will need the following:
- The latest installation of R - visit The R Project website for more information and download links.
- R Studio - visit the R Studio website for more information and download links.
- A Songkick API key - visit Songkick’s site for more information and to complete the online form for requesting a key.
R and R Studio are available as free, open source downloads and there are a huge number of resources online that will help you get started. The links above will explain those resources and the installation processes much better than I would be able to.
A non-commercial Songkick API key is also free, but there are certain terms of use that you should be aware of before applying for and using your API key. You can click here to read those terms and conditions in full. Songkick aim to respond to requests for API keys within 7 days. Without an API key you will not be able to proceed with this tutorial.
Load required packages
Once you have R, R Studio and an API key you will need to start a new R script and install a number of packages. You will also need some other packages later in the script, but I will introduce those as we go along.
#install.packages('httr')
#install.packages('jsonlite')
#install.packages('tidyverse')
#install.packages('magrittr')
#install.packages('varhandle')
library(httr)
library(jsonlite)
library(tidyverse)
library(magrittr)
library(varhandle)
Set API key
After applying for an API key from Songkick you will receive an email with a 16-character string. This is your unique key and you will need to include it in every call you make to the API.
You should not publish your API key, or share it with other users. For the purpose of creating this tutorial I have hidden my key, but to enter your own you can simply amend the code below, replacing “XXXXXXXXXXXX” with your own key. Once your API key is held within an object in the R environment you can begin building your first call.
api_key <- "XXXXXXXXXXXX"
Create this function - it will become important!
The function below - which comes from the excellent work of Colin Fay - will be crucial when it comes to organising the data from the Songkick API. I will explain in more detail when we come to use it in a short while.
`%||%` <- function(a, b) if (is.null(a)) b else a
Download Songkick’s attribution images
Part of the terms and conditions of use for Songkick’s API requires that you include one of their approved logos. You can download a ZIP file containing Songkick attribution images here.
Once you have the folder of images downloaded, select the image you would like to use and then copy and paste it into the folder for your R session.
2. Find ‘Metro Area Code’ and location data for your city
In order to get details for venues within a given city you will first need to find out which ‘metro area’ Songkick places that city within, along with the 5-digit metro_id code associated with that area. For mapping purposes, you will also need the latitude and longitude coordinates of that city. This section will show you how to find that information using the Songkick API. You will need the metro_id code to complete Step 3, and the latitude and longitude coordinates in Step 6.
You can find the metro code for the city of your choice by visiting the Songkick website and searching for a city. When you visit the page for a given city the 5-digit code will be visible in the URL. You can also find the latitude and longitude data for a city by visiting a website such as LatLong.Net and searching for your chosen city. In both cases, simply copy the required data and assign them respectively to the city_id, city_lat and city_lon objects introduced later in this step.
However, if you intend to make more than one map you can save some time by calling the API for the same information. You can visit the Location Search section of the API pages for more detail on this.
The manner in which we will retrieve the metro_id and location data from the API is similar to how we will retrieve the event and venue data later in the script, so it is worth briefly explaining how this works.
The location API call is structured as follows:
“https://api.songkick.com/api/3.0/search/locations.json?query={search_query}&apikey={your_api_key}”
To create your call the elements within curly brackets above need to be replaced with your search query and api_key. We can build this call and make it replicable by breaking down the elements required into chunks and then reassembling them - all you will then need to do to create a new map is change the contents of your search query object. In the example below my search term is ‘Birmingham’, but you can replace the contents of the search_city object below with your own choice to create your map.
First, we can break down the call into the following chunks:
- metro_call = the first part of the URL above (everything to the left of {search_query})
- search_city = replaces {search_query} element with your search term.
- metro_call_2 = is everything to the right of {search_query} and to the left of {your_api_key} in the URL above
- api_key = your unique api_key created earlier.
The paste() function then puts all of those elements back together, with the sep="" argument making sure there are no spaces between the elements. This creates the call and stores it in the metro_call object.
search_city <- "Birmingham"
metro_call <- "https://api.songkick.com/api/3.0/search/locations.json?query="
metro_call_2 <- "&apikey="
metro_call <- paste(metro_call, search_city, metro_call_2, api_key, sep="")
You can now use the metro_call object with the GET function to call the API, and then use the fromJSON function to make the data more easily readable in the R environment. Then, by extracting data from the get_metro_info_json object, we can create a dataframe of the results.
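One caveat with assembling the URL by hand is that a search term containing a space (for example ‘New York’) should be percent-encoded before it goes into the query string. The sketch below is one possible way to handle that with base R; build_location_call is a hypothetical helper name that is not part of the original script, and paste0() is simply shorthand for paste(..., sep="").

```r
# Hypothetical helper (not part of the original script): build the location
# search URL, percent-encoding the search term so that spaces are safe.
build_location_call <- function(search_city, api_key) {
  paste0(
    "https://api.songkick.com/api/3.0/search/locations.json?query=",
    utils::URLencode(search_city, reserved = TRUE),
    "&apikey=", api_key
  )
}

# "XXXXXXXXXXXX" is the placeholder key from Step 1, not a real key.
example_call <- build_location_call("New York", "XXXXXXXXXXXX")
example_call
```

For a single-word term such as ‘Birmingham’ the result should be the same URL that the paste() version produces.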
Note how in the code element below we are using Colin Fay’s %||% function created in Step 1. This helps deal with any empty end points in the API. Because we are ultimately assembling a data frame from the data we collect, each element must be of the same dimensions - in simple terms, you cannot add a row with 6 columns to a dataframe consisting of only 5. What Colin’s function does is create an NA value when an element of the API data is empty or NULL, and this means that every record is of the same dimension.
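Before the extraction code itself, the behaviour of the operator can be checked in isolation. The sketch below redefines %||% so that it runs on its own, and uses a made-up list (record) as a stand-in for parsed API data in which the ‘state’ field is absent.

```r
`%||%` <- function(a, b) if (is.null(a)) b else a

# A made-up record standing in for parsed API data; 'state' is absent.
record <- list(city = "Birmingham", country = "UK")

state <- record$state %||% NA  # field is missing (NULL), so the fallback NA is used
city  <- record$city  %||% NA  # field is present, so its value is kept

# Without the fallback, cbind() ignores the NULL and the row ends up one column short.
cols_without_fallback <- ncol(cbind(record$city, record$state))
cols_with_fallback    <- ncol(cbind(city, state))
```

Comparing cols_without_fallback with cols_with_fallback shows why the NA placeholder matters when rows are later bound together.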
get_metro_info <- GET(metro_call)
get_metro_info_text <- content(get_metro_info, "text")
get_metro_info_json <- fromJSON(get_metro_info_text, flatten = TRUE)
metro_id <- get_metro_info_json$resultsPage$results$location$metroArea.id %||% NA
metro_area_name <- get_metro_info_json$resultsPage$results$location$metroArea.displayName %||% NA
metro_country <- get_metro_info_json$resultsPage$results$location$metroArea.country.displayName %||% NA
metro_city_name <- get_metro_info_json$resultsPage$results$location$city.displayName %||% NA
metro_country_name <- get_metro_info_json$resultsPage$results$location$city.country.displayName %||% NA
metro_state_name <- get_metro_info_json$resultsPage$results$location$city.state.displayName %||% NA
metro_city_lat <- get_metro_info_json$resultsPage$results$location$city.lat %||% NA
metro_city_lon <- get_metro_info_json$resultsPage$results$location$city.lng %||% NA
metro_df <- as.data.frame(cbind(metro_id, metro_area_name, metro_country, metro_city_name, metro_country_name, metro_state_name, metro_city_lat, metro_city_lon))
rm(metro_id, metro_area_name, metro_country, metro_city_name, metro_country_name, metro_state_name,
   metro_city_lat, metro_city_lon)
When we look at the results of the metro_df dataframe, however, we can see a problem. The API has returned results for 5 cities. This is because there are several cities in the United States also called Birmingham.
(Note how the %||% function has produced an NA value in the ‘state’ column for Birmingham, UK, because that information is not present for UK cities. This demonstrates how the function enables us to assemble a dataframe.)
metro_df
##   metro_id metro_area_name metro_country metro_city_name
## 1    18073         Detroit            US      Birmingham
## 2    24542      Birmingham            UK      Birmingham
## 3     8474      Birmingham            US      Birmingham
## 4   103051      Birmingham            US      Birmingham
## 5    78641      Birmingham            US      Birmingham
##   metro_country_name metro_state_name metro_city_lat metro_city_lon
## 1                 US               MI           <NA>           <NA>
## 2                 UK             <NA>         52.478         -1.907
## 3                 US               AL        33.5248       -86.8127
## 4                 US               PA      39.900822     -75.601448
## 5                 US               IL      40.264851     -90.821747
However, thanks to the other information we have also pulled from the API we can see fairly clearly which record we want. By subsetting the data on the metro_id variable corresponding to the UK city of Birmingham we can create a new dataframe called metro that gives us the metro code, latitude and longitude information we need.
From there we can create three new objects that can be used in later steps: city_id, city_lat and city_lon.
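The filter that follows hard-codes the id read off the table. As a possible alternative when making several maps, the row could be selected on the country column instead, with a check that exactly one row matches. This is only a sketch: metro_df_sketch reproduces three rows of the output above as plain character columns, and the _sketch names are hypothetical so that nothing in the main script is overwritten.

```r
# Three rows copied from the metro_df output above, held as character columns.
metro_df_sketch <- data.frame(
  metro_id       = c("18073", "24542", "8474"),
  metro_country  = c("US", "UK", "US"),
  metro_city_lat = c(NA, "52.478", "33.5248"),
  metro_city_lon = c(NA, "-1.907", "-86.8127"),
  stringsAsFactors = FALSE
)

# Keep only the UK record and stop early if the match is ambiguous or empty.
metro_sketch <- metro_df_sketch[metro_df_sketch$metro_country == "UK", ]
stopifnot(nrow(metro_sketch) == 1)

city_id_sketch  <- metro_sketch$metro_id
city_lat_sketch <- as.numeric(metro_sketch$metro_city_lat)
city_lon_sketch <- as.numeric(metro_sketch$metro_city_lon)
```

For a country with several cities of the same name (the US rows above, for instance) a second condition on the state column would be needed.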
metro <- metro_df %>%
  filter(metro_id == "24542")
metro
##   metro_id metro_area_name metro_country metro_city_name
## 1    24542      Birmingham            UK      Birmingham
##   metro_country_name metro_state_name metro_city_lat metro_city_lon
## 1                 UK             <NA>         52.478         -1.907
city_id <- unfactor(metro$metro_id)
city_lat <- unfactor(metro$metro_city_lat)
city_lon <- unfactor(metro$metro_city_lon)
city_id
## [1] 24542
city_lat
## [1] 52.478
city_lon
## [1] -1.907
3. Get all up-coming concerts for your city
From Step 2 we now have the metro_id for Birmingham, UK (or the city you chose) and the next step to finding venue information is using the metro_id information to create another new dataframe - one that contains all events scheduled for a city. This dataframe will contain the crucial venue_id information, which we will use in Step 4.
To my knowledge there is no direct way of finding a list of venues for a given ‘metro area’ from the Songkick API. My workaround has been instead to pull in all available events (concerts, festivals) for a given ‘metro area’, and from there extract the venue_id codes for the venue hosting each event.
The Songkick API will allow you to pull in details for all events taking place over the three months following the date you run this script. Providing that a given music venue is present in the Songkick database (which may not always be the case), it is nevertheless a reasonable assumption that an active music venue will be staging an event at some point over the next 100 or so days. As such this method should provide a decent picture of active music venues in your city.
We can build this dataframe with an API call constructed in a similar manner to the previous step, but with the addition of a few important lines of code. As you can see below, the API call for metro area events is structured in a similar way to the API call for location information used in the previous step:
“https://api.songkick.com/api/3.0/metro_areas/{metro_area_id}/calendar.json?apikey={your_api_key}”
In this instance we need to create a call that replaces two elements. The good news is that we already have the information we need to populate those elements:
- {metro_area_id} - which we now have in the city_id object.
- {your_api_key} - which you should now have stored in your api_key object.
calendar_base <- "https://api.songkick.com/api/3.0/metro_areas/"
metro_code <- city_id
calendar_end <- "/calendar.json?apikey="
calendar_call <- paste(calendar_base, metro_code, calendar_end, api_key, sep="")
For the purposes of creating the map we only really need the $venue.id information, but I am collecting other elements for illustrative purposes as you may wish to use event information in other ways.
For more detail on all data available from this part of the Songkick API visit the Metro Area’s Upcoming Events page, where you will also find information on additional parameters you can add to the call. We will be using one such additional parameter - page - in one of the next steps.
As above, we first create a new dataframe (event_df) by gathering information from the get_calendar_info_json object the calendar_call object provided us with.
get_calendar_info <- GET(calendar_call)
get_calendar_info_text <- content(get_calendar_info, "text")
get_calendar_info_json <- fromJSON(get_calendar_info_text, flatten = TRUE)
event_id <- get_calendar_info_json$resultsPage$results$event$id %||% NA
event_type <- get_calendar_info_json$resultsPage$results$event$type %||% NA 
event_uri <- get_calendar_info_json$resultsPage$results$event$uri %||% NA
event_display_name <- get_calendar_info_json$resultsPage$results$event$displayName %||% NA
event_date <- get_calendar_info_json$resultsPage$results$event$start.date %||% NA
event_time <- get_calendar_info_json$resultsPage$results$event$start.time %||% NA
event_venue_id <- get_calendar_info_json$resultsPage$results$event$venue.id %||% NA
event_venue_name <- get_calendar_info_json$resultsPage$results$event$venue.displayName %||% NA
event_df <- as.data.frame(cbind(event_id, event_type, event_uri, event_display_name, event_date,
                                event_time, event_venue_id, event_venue_name))
nrow(event_df)
## [1] 50
We can see above that this initial result contains only 50 records, which does not sound like many events for a 3 month period. The reason for this low number is that by default the API returns only the first page of results and has a maximum setting of 50 results per page. We need to find out details for the missing events.
To do that we first need to find out how many results there are in total by looking at the $totalEntries element of the get_calendar_info_json object. Then, we divide that figure by 50 (the default and maximum per page) before rounding it up to the next whole number. This will tell us how many pages of results are available, which will form the basis of a for loop that will extract data from all pages.
total_events <- get_calendar_info_json$resultsPage$totalEntries 
pages <- total_events/50 
pages
## [1] 17.28
pages <- ceiling(pages) 
pages 
## [1] 18
We now know that we have to loop through 18 pages of results to get all events in the calendar. We already have the first page of results, so we can begin on page 2 and then loop through the remaining pages up to page 18.
The code below is exactly the same as that which we used to gather the data from page one, except for the addition of “&page=” and i. The for loop runs the code first with 2 in place of i, then with 3, and 4, and so on until it matches the value in the pages object, which we know is 18.
NOTE: The final line inside the loop below is a commented-out call to Sys.sleep(time = 5). Because the line begins with a # symbol R will not execute it. I have included it here in case your own pages variable is large (this will likely be the case if you choose a very large city, or one with lots of events), because a high value in the pages variable may mean you exceed the rate limits of the Songkick API. When active, the call simply puts a small pause (5 seconds) in the process before the next iteration of the loop begins. If you encounter issues with rate limits try removing the leading # symbol so that Sys.sleep(time = 5) runs (leave any explanatory text after the call commented out) and then run the loop again. You may also need to increase the pause value.
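One edge case the loop that follows does not cover: if a city has 50 or fewer events, pages is 1, and in R the expression 2:pages then counts down (2, 1) rather than producing an empty sequence. A minimal sketch of a guard is shown here, assuming 50 results per page as described above; remaining_pages is a hypothetical helper name, not part of the original script.

```r
# Hypothetical helper: which pages still need fetching after page 1?
remaining_pages <- function(total_entries, per_page = 50) {
  pages <- ceiling(total_entries / per_page)
  # With fewer than 2 pages there is nothing left to fetch.
  if (pages < 2) integer(0) else 2:pages
}

many_events <- remaining_pages(864)  # the Birmingham total: pages 2 to 18
few_events  <- remaining_pages(30)   # everything fitted on page 1: empty
```

Looping with for(i in remaining_pages(total_events)) would then simply skip the loop body when there is only one page.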
for(i in 2:pages){
  calendar_call_2 <- paste(calendar_base, metro_code, calendar_end, api_key,"&page=", i, sep="")
  get_calendar2 <- GET(calendar_call_2)
  get_calendar2_text <- content(get_calendar2, "text")
  get_calendar2_json <- fromJSON(get_calendar2_text, flatten = TRUE)
  event_id <- get_calendar2_json$resultsPage$results$event$id %||% NA
  event_type <- get_calendar2_json$resultsPage$results$event$type %||% NA 
  event_uri <- get_calendar2_json$resultsPage$results$event$uri %||% NA
  event_display_name <- get_calendar2_json$resultsPage$results$event$displayName %||% NA
  event_date <- get_calendar2_json$resultsPage$results$event$start.date %||% NA
  event_time <- get_calendar2_json$resultsPage$results$event$start.time %||% NA
  event_venue_id <- get_calendar2_json$resultsPage$results$event$venue.id %||% NA
  event_venue_name <- get_calendar2_json$resultsPage$results$event$venue.displayName %||% NA
  event_df_2 <- as.data.frame(cbind(event_id, event_type, event_uri, event_display_name, event_date,
                                    event_time, event_venue_id, event_venue_name))
  event_df <- rbind(event_df, event_df_2)
  #Sys.sleep(time = 5)  # use this if the pages variable is very large
}
Assuming the above loop runs without encountering issues you will now have a dataframe of 864 events and can begin the process of extracting the individual venue_id information. Before you do that, it may be useful to remove any empty or NA values.
event_df %>% count(event_venue_name)
## # A tibble: 125 x 2
##    event_venue_name        n
##    <fct>               <int>
##  1 Acapella                1
##  2 Bear Tavern             1
##  3 Bingley Hall            1
##  4 Birmingham the Mill     2
##  5 Blue Monkey Club        1
##  6 Castle & Falcon        28
##  7 Centrala                3
##  8 Dark Horse              3
##  9 Dead Wax Digbeth       31
## 10 Hare & Hounds          60
## # … with 115 more rows
venue_ids <- as.data.frame(unique(event_df$event_venue_id))
nrow(venue_ids)
## [1] 126
venue_ids <- na.omit(venue_ids)
nrow(venue_ids)
## [1] 125
venue_ids <- venue_ids %>%
  filter(`unique(event_df$event_venue_id)` != "")
nrow(venue_ids)
## [1] 125
4. Get Venue information for your city
From the dataset of events you created in Step 3 you will be able to extract the unique venue_id codes for your chosen city. Using those codes this step will show you how to call the Songkick API and retrieve information on each venue. This information will include things such as the venue address, phone number, website, capacity, and - crucially for the map we will create in Step 6 - the latitude and longitude co-ordinates.
As with the API calls in steps 2 and 3 we can now use the venue_id data we have collected to construct a new call that will query the API for venue data. To read about what information is available from the Venue Details element of the API and to explore additional parameters visit the relevant help page.
The process here mirrors exactly the process from Step 3.
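Because the same fields are extracted twice in the code that follows (once for the first venue and again inside the loop), the extraction could in principle be factored into a single helper. The sketch below illustrates the idea under stated assumptions: extract_venue is a hypothetical name, only a handful of fields are shown, and the venues list is made-up data standing in for parsed API responses rather than real Songkick records.

```r
`%||%` <- function(a, b) if (is.null(a)) b else a

# Hypothetical helper: turn one parsed venue record (a list) into a one-row data frame.
extract_venue <- function(v) {
  data.frame(
    venue_songkick_id = v$id %||% NA,
    venue_name        = v$displayName %||% NA,
    venue_lat         = v$lat %||% NA,
    venue_lon         = v$lng %||% NA,
    venue_capacity    = v$capacity %||% NA,
    stringsAsFactors  = FALSE
  )
}

# Made-up records; the second one has no capacity field.
venues <- list(
  list(id = 1, displayName = "Venue A", lat = 52.47, lng = -1.89, capacity = 600),
  list(id = 2, displayName = "Venue B", lat = 52.43, lng = -1.88)
)

# Every row has the same columns, so the rows bind together cleanly.
venue_df_sketch <- do.call(rbind, lapply(venues, extract_venue))
```

The design choice here is simply to keep the field list in one place, so that adding or renaming a field only needs one change.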
#Create API call 
venue_url1 <- "https://api.songkick.com/api/3.0/venues/"
venue <- venue_ids[1,]
venue_url2 <- ".json?apikey="
venue_call <- paste(venue_url1, venue, venue_url2, api_key, sep="")
#Collect data
venue_info <- GET(venue_call)
venue_info_text <- content(venue_info, "text")
venue_info_json <- fromJSON(venue_info_text, flatten = TRUE)
#Extract info for venue 1 and create dataframe
venue_songkick_id <- venue_info_json$resultsPage$results$venue$id %||% NA
venue_name <- venue_info_json$resultsPage$results$venue$displayName %||% NA
venue_songkick_city_name <- venue_info_json$resultsPage$results$venue$city$displayName %||% NA 
venue_songkick_city_id <- venue_info_json$resultsPage$results$venue$city$id %||% NA 
venue_songkick_country <- venue_info_json$resultsPage$results$venue$city$country$displayName %||% NA
venue_songkick_uri <- venue_info_json$resultsPage$results$venue$uri %||% NA
venue_address <- venue_info_json$resultsPage$results$venue$street %||% NA 
venue_address_two <- venue_info_json$resultsPage$results$venue$city$displayName %||% NA 
venue_p_code <- venue_info_json$resultsPage$results$venue$zip %||% NA
venue_lat <- venue_info_json$resultsPage$results$venue$lat %||% NA
venue_lon <- venue_info_json$resultsPage$results$venue$lng %||% NA
venue_phone <- venue_info_json$resultsPage$results$venue$phone %||% NA
venue_website <- venue_info_json$resultsPage$results$venue$website %||% NA
venue_capacity <- venue_info_json$resultsPage$results$venue$capacity %||% NA
venue_description <- venue_info_json$resultsPage$results$venue$description %||% NA
venue_df <- as.data.frame(cbind(venue_songkick_id, venue_name, venue_songkick_city_name, venue_songkick_city_id,
                                venue_songkick_country, venue_songkick_uri, venue_address, venue_address_two,
                                venue_p_code, venue_lat, venue_lon, venue_phone, venue_website, venue_capacity,
                                venue_description))
#Extract info for remaining venues and add to dataframe
for(i in 2:nrow(venue_ids)){
  venue_url1 <- "https://api.songkick.com/api/3.0/venues/"
  venue <- venue_ids[i, ]
  venue_url2 <- ".json?apikey="
  venue_call2 <- paste(venue_url1, venue, venue_url2, api_key, sep="")
  venue_info2 <- GET(venue_call2)
  venue_info2_text <- content(venue_info2, "text")
  venue_info2_json <- fromJSON(venue_info2_text, flatten = TRUE)
  venue_songkick_id <- venue_info2_json$resultsPage$results$venue$id %||% NA
  venue_name <- venue_info2_json$resultsPage$results$venue$displayName %||% NA
  venue_songkick_city_name <- venue_info2_json$resultsPage$results$venue$city$displayName %||% NA 
  venue_songkick_city_id <- venue_info2_json$resultsPage$results$venue$city$id %||% NA 
  venue_songkick_country <- venue_info2_json$resultsPage$results$venue$city$country$displayName %||% NA
  venue_songkick_uri <- venue_info2_json$resultsPage$results$venue$uri %||% NA
  venue_address <- venue_info2_json$resultsPage$results$venue$street %||% NA 
  venue_address_two <- venue_info2_json$resultsPage$results$venue$city$displayName %||% NA 
  venue_p_code <- venue_info2_json$resultsPage$results$venue$zip %||% NA
  venue_lat <- venue_info2_json$resultsPage$results$venue$lat %||% NA
  venue_lon <- venue_info2_json$resultsPage$results$venue$lng %||% NA
  venue_phone <- venue_info2_json$resultsPage$results$venue$phone %||% NA
  venue_website <- venue_info2_json$resultsPage$results$venue$website %||% NA
  venue_capacity <- venue_info2_json$resultsPage$results$venue$capacity %||% NA
  venue_description <- venue_info2_json$resultsPage$results$venue$description %||% NA
  venue_df_2 <- as.data.frame(cbind(venue_songkick_id, venue_name, venue_songkick_city_name, venue_songkick_city_id,
                                    venue_songkick_country, venue_songkick_uri, venue_address, venue_address_two,
                                    venue_p_code, venue_lat, venue_lon, venue_phone, venue_website, venue_capacity,
                                    venue_description))
  venue_df <- rbind(venue_df, venue_df_2)
  rm(venue_df_2, venue_songkick_id, venue_name, venue_songkick_city_name, venue_songkick_city_id,
     venue_songkick_country, venue_songkick_uri, venue_address, venue_address_two,
     venue_p_code, venue_lat, venue_lon, venue_phone, venue_website, venue_capacity,
     venue_description)
  #Sys.sleep(time = 5)  # use this if the number of venues is very large
}
We now have a dataframe containing 125 records, each of which contains information about an individual music venue in the city, including capacity, address, website, and so on. This dataframe contains everything we need to build our map.
head(venue_df)
##   venue_songkick_id              venue_name venue_songkick_city_name
## 1             54420            Bingley Hall               Birmingham
## 2             18229           Hare & Hounds               Birmingham
## 3             90238 O2 Institute Birmingham               Birmingham
## 4            501851 The Kitchen Garden Cafe               Birmingham
## 5           4047349     Birmingham the Mill               Birmingham
## 6           3851774       Joe Joe Jim's Bar               Birmingham
##   venue_songkick_city_id venue_songkick_country
## 1                  24542                     UK
## 2                  24542                     UK
## 3                  24542                     UK
## 4                  24542                     UK
## 5                  24542                     UK
## 6                  24542                     UK
##                                                                                 venue_songkick_uri
## 1            http://www.songkick.com/venues/54420-bingley-hall?utm_source=59524&utm_medium=partner
## 2         http://www.songkick.com/venues/18229-hare-and-hounds?utm_source=59524&utm_medium=partner
## 3 http://www.songkick.com/venues/90238-o2-institute-birmingham?utm_source=59524&utm_medium=partner
## 4    http://www.songkick.com/venues/501851-kitchen-garden-cafe?utm_source=59524&utm_medium=partner
## 5   http://www.songkick.com/venues/4047349-birmingham-the-mill?utm_source=59524&utm_medium=partner
## 6      http://www.songkick.com/venues/3851774-joe-joe-jims-bar?utm_source=59524&utm_medium=partner
##              venue_address venue_address_two venue_p_code  venue_lat
## 1                                 Birmingham      B18 5BE 52.4921752
## 2 High Street, Kings Heath        Birmingham      B14 7JZ   52.43598
## 3   78 Digbeth High Street        Birmingham        B56DY   52.47556
## 4             17 York Road        Birmingham      B14 7SA  52.434979
## 5                                 Birmingham       B9 4AG   52.47517
## 6              Lickey Road        Birmingham      B45 8UU 52.3832268
##    venue_lon   venue_phone                             venue_website
## 1 -1.9286691          <NA>                                      <NA>
## 2   -1.89266 0121 444 2081 http://www.hareandhoundskingsheath.co.uk/
## 3   -1.88745  0121 6430428       http://o2institutebirmingham.co.uk/
## 4  -1.893854 0121 443 4725            http://kitchengardencafe.co.uk
## 5   -1.88185          <NA>                                      <NA>
## 6 -2.0022114          <NA>              http://www.joejoejims.co.uk/
##   venue_capacity
## 1           <NA>
## 2            600
## 3           1500
## 4           <NA>
## 5           <NA>
## 6           <NA>
##                                                                             venue_description
## 1                                                                                            
## 2                                                                                            
## 3 Birmingham's most exciting and lively venue. Home to fantastic clubnights and awesome gigs!
## 4                                                                                            
## 5                                                                                            
## 6
5. Housekeeping
IMPORTANT: Before proceeding to the final stage and the creation of our map we need to perform two simple housekeeping tasks.
Firstly, we want to remove any venues that are missing latitude and longitude information because these will fail to show on our eventual map.
nrow(venue_df)
## [1] 125
venue_df_new <- venue_df %>% filter(!is.na(venue_lat) & !is.na(venue_lon))
nrow(venue_df_new)
## [1] 117
venue_df_new <- venue_df_new %>%
  filter(venue_lat != "" & venue_lon != "")
nrow(venue_df_new)
## [1] 117
Secondly, the latitude and longitude data are currently of the class ‘factor’, but they need to be numeric. We can resolve this using the unfactor function from the varhandle package.
class(venue_df_new$venue_lat)
## [1] "factor"
class(venue_df_new$venue_lon)
## [1] "factor"
venue_df_new$venue_lat <- unfactor(venue_df_new$venue_lat)
venue_df_new$venue_lon <- unfactor(venue_df_new$venue_lon)
class(venue_df_new$venue_lat)
## [1] "numeric"
class(venue_df_new$venue_lon)
## [1] "numeric"
The resulting dataset venue_df_new is now ready for mapping.
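One version caveat: from R 4.0.0 onwards, as.data.frame() no longer converts strings to factors by default, so on a newer installation these columns may be of class ‘character’ rather than ‘factor’. A base R conversion that should cope with either class is sketched below; to_numeric is a hypothetical helper name and the two coordinate values are taken from the venue output above.

```r
# Hypothetical helper: convert a factor or character vector of numbers to numeric.
to_numeric <- function(x) as.numeric(as.character(x))

lat_factor <- factor(c("52.478", "52.43598"))  # how the column looks in older R
lat_char   <- c("52.478", "52.43598")          # how it may look in R >= 4.0.0

from_factor <- to_numeric(lat_factor)
from_char   <- to_numeric(lat_char)

# Calling as.numeric() directly on a factor returns the internal level codes,
# not the coordinates, which is why the as.character() step matters.
level_codes <- as.numeric(lat_factor)
```

Both from_factor and from_char should hold the same coordinates, whereas level_codes holds small integers that would place every venue in the wrong location.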
6. Create a Music Venue Map for your city
This final step will show you how to take the data you have gathered from the Songkick API and use it to create an interactive map of music venues in your city using the leaflet package. It will first walk you through some basic steps to show you how to create a simple map in leaflet, before then using the dataset created in steps 1-5 to populate a music venue map of your city.
To create the music venue map we will be using the leaflet package. For more information on leaflet - and an excellent introductory tutorial - visit the Leaflet for R page. Everything I have learned in order to create this map came from that page, so I can recommend it. I have provided a potted version of that tutorial here to help you understand how leaflet will use the information from the venue_df_new dataframe.
First you will need to install the package.
#install.packages('leaflet')
library(leaflet)
Then, to create a basic map you will need the latitude and longitude info for a given location - in this example I have used the location of my office at Birmingham City University. The code below creates a leaflet object called m (but you can call it whatever you want). It then adds two things to that object:
- addTiles - is the map.
- addMarkers - is the point on the map.
Because we have not added any other arguments or instructions, both functions are using their default settings. OpenStreetMap will be used by the addTiles function, and a ‘pin’ marker will be used to show the specified location.
m <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  addMarkers(lng=-1.88405, lat=52.48356)
m  # Print the mapWe can now build on this simple map above by adding a few more elements:
- the setView() function sets the centre of the map (I have chosen the same location) and how far the map should zoom in to that point (the higher the number, the closer the zoom).
- the addTiles function has been replaced by the addProviderTiles function, which allows you to choose from a number of different map styles. I have selected the CartoDB.Positron tile. To see a full list of available tiles run names(providers) in your console.
- addMarkers has been replaced by addCircleMarkers, which has a number of additional arguments that can be used to style the location markers. You can see that I have set the color to red, the radius to 6, and so on.
- the popup argument of the addCircleMarkers function will show the text you provide if a user clicks on the location marker. The label argument behaves in a similar way, but shows its text when a user hovers over the marker.
You can experiment with these settings by selecting different provider tiles, changing the hover/pop-up text, and the styling of the dots. See the Leaflet for R page on Markers for more detailed information on Markers. The Leaflet for R page on Pop-Ups is also a useful resource and will explain how you can populate the contents of a pop-up window from a dataframe. This is exactly what we will be doing with our map.
m <- leaflet() %>% setView(lng = -1.88405, lat = 52.48356, zoom = 14)
m %>% addProviderTiles(providers$CartoDB.Positron) %>% 
  addCircleMarkers(lng=-1.88405, lat=52.48356, popup="My Office", label = "BCU",
                   stroke = FALSE, 
                   color = "red",
                   fillOpacity = 0.5,
                   radius = 6)
To create our music venue map we can call on the contents of the venue_df_new dataframe to populate the lng and lat arguments of the addCircleMarkers function, and to create the text that appears in the popup and label arguments.
To do this we first create a template for the popup windows. Here we are calling various elements of the venue_df_new dataframe, including those containing information related to a venue name, capacity, address, and so on. These elements are combined with some static text and other formatting, such as line breaks to create an object called content.
#Create the template for Pop-Up argument. 
content <- paste("<h3>", venue_df_new$venue_name, "</h3>",
                 "<b>", "Capacity: ", "</b>", venue_df_new$venue_capacity, "<br>",
                 "<b>", "Address:", "</b>", venue_df_new$venue_address, venue_df_new$venue_address_two, venue_df_new$venue_songkick_city_name, "<br>",
                 "<b>", "Postcode: ", "</b>", venue_df_new$venue_p_code, "<br>",
                 "<b>", "Website:", "</b>", "<a href='", venue_df_new$venue_website, "'>", venue_df_new$venue_website, "</a>", "<br>",
                 "<b>", "Phone: ", "</b>", venue_df_new$venue_phone, "<br>",
                 "<b>", "Description: ", "</b>", venue_df_new$venue_description, "<br>"
)
Finally, we can now begin building our map!
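One optional refinement before doing so: where Songkick has no value for a field, paste() will print the text "NA" in the pop-up. A minimal sketch of one way to tidy this up is below. The helper name fill_missing, the placeholder text, and the toy capacities are all my own assumptions for illustration, not part of the Songkick data or the leaflet package.

```r
# Hypothetical helper: swap NA or empty values for a readable placeholder
fill_missing <- function(x, placeholder = "Not listed") {
  x <- as.character(x)                   # also copes with factor columns
  x[is.na(x) | x == ""] <- placeholder   # NA and empty strings are both replaced
  x
}

# Toy capacities (made up); the second and third venues have no capacity recorded
toy_capacity <- c("3000", NA, "")

# The same pattern as one line of the pop-up template, with the helper applied
paste("<b>", "Capacity: ", "</b>", fill_missing(toy_capacity), "<br>")
```

In the template above, any of the venue_df_new columns could be wrapped in fill_missing() in the same way.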
To create the music venue map, the code below differs from the basic example above in the following ways:
- Firstly, we tell leaflet to look for the venue_df_new dataframe - leaflet(venue_df_new)
- Next, we can use the city_lon and city_lat objects we created back in Step 2 to set the centre of the map.
- For the addProviderTiles argument this time I have selected a different provider (CartoDB.Voyager)
- In the addCircleMarkers function the lng and lat arguments are now drawn from variables in the venue_df_new dataset - note the ~ symbol next to the variable names. Likewise, the label argument is now populated by the venue_name variable.
- The popup argument is now populated by the content object we created above.
- The circles on the map are now ‘green’, rather than ‘red’, as this matches the CartoDB.Voyager colour scheme slightly better.
- Finally, the addLogo() function from the leafem package has been used to add a 25x25 Songkick logo into the bottom left corner of the map.
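Because the lng and lat arguments now come straight from the dataframe, a marker can only be drawn for a venue that has usable coordinates. A quick optional check, sketched here in base R on a made-up stand-in for venue_df_new (the venue names and coordinates are invented), shows how you might find any venues with missing or out-of-range values before mapping:

```r
# Toy stand-in for venue_df_new; names and coordinates are made up
toy_venues <- data.frame(
  venue_name = c("Venue A", "Venue B", "Venue C"),
  venue_lat  = c(52.4776, NA, 52.4831),
  venue_lon  = c(-1.8987, -1.8860, -1.8860)
)

# TRUE where both coordinates are present and within valid ranges
has_coords <- !is.na(toy_venues$venue_lat) & !is.na(toy_venues$venue_lon) &
  abs(toy_venues$venue_lat) <= 90 & abs(toy_venues$venue_lon) <= 180

# Names of venues that could not be plotted
toy_venues$venue_name[!has_coords]

# The subset that can be mapped
mappable <- toy_venues[has_coords, ]
nrow(mappable)
```

The code for the map itself follows.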
img = "https://assets.sk-static.com/assets/images/nw/static-pages/styleguide/sk-white-badge.88f228a4a2389c54e1ee556ad6e3f1f7.jpg"
m <- leaflet(venue_df_new) %>%
  setView(lng = city_lon, lat = city_lat, zoom = 14) %>%
  addProviderTiles(providers$CartoDB.Voyager) %>%
  addCircleMarkers(~venue_lon, ~venue_lat, popup = ~content, 
                   label = ~venue_name, 
                   color = 'green',
                   stroke = FALSE, fillOpacity = 0.8,
                   radius = 6) %>%
    leafem::addLogo(img, src = "remote", url = "https://www.songkick.com/developer",
                  position = "bottomleft",
                  offset.x = 7,
                  offset.y = 40,
                  width = 25,
                  height = 25)
m
Observations and next steps…
The process of learning how to call data from the Songkick API and then prepare it for use in a map has been enjoyable and satisfying. I started with a basic question (how can I plot a map of music venues?) and ended up with a solution. I also picked up some new skills along the way. What can I observe from this process, and what would I like to do next?
- This process will certainly help me when I come to use the APIs of other music services. I’m currently looking at running some analysis of my Spotify playlists, and of my record collection in Discogs. Both services have APIs that behave in similar ways to Songkick’s, so I would like to see if I could adapt the first part of the process above to explore data from those services. I enjoy the process of working with R, but I am aware also that my code can sometimes take a ‘scenic route’ - making the code tighter and more efficient will also be something I will work on. 
- In terms of the data retrieved from Songkick, I feel this has been a really useful first attempt to look at the landscape of live music venues in the city. From a quick exploration of the map it seems that all of the city’s large and/or famous venues are present, and being able to quickly retrieve information about latitude and longitude, the venue capacity, etc., meant that the mapping process was easier than it may otherwise have been. As you can see from the pop-ups when exploring the map, however, some information is missing from venues in the Songkick database; this is certainly something we will look at in the research project when talking with venue owners. Another thing to look at will be why the coverage of venues outside of the city centre is so sparse. There are certainly numerous pubs, clubs and other venues in Birmingham’s various suburbs that regularly host live music, but they are not present in the data. There may be a number of reasons for this, but it is perhaps because these are venues that are not plugged in to the industrial systems (ticket retailers, booking agents, etc.) that make them visible to globally-operating services such as Songkick. Another (and perhaps bigger) question to tackle relates to how and in what ways we can use 3rd party, commercial data in academic research projects. There are some restrictions imposed by Songkick, for instance, but there are also broader questions of data ownership, privacy, use, sharing, etc. 
- The leaflet mapping package seems very powerful and easy to use. My next steps with that element of the process will be to see what additional layers could be added to the map, and what additional data could be added to the pop-up layers (and how and if that can be done, given the restrictions mentioned above). Here I will be looking at transport links, data from local councils around live music, and so on. 
- Finally - and assuming I can work my way through the tasks above - I would like to build a Shiny application that would allow others to explore their own cities. This is something I will chip away at. 
I hope you’ve found this post useful. Please do feel free to share or use the code provided, and please also point out ways you think it could be improved (I’m sure there are several!) If you would like to talk to me about this work, or to discuss potential collaborations / other work, drop me a line or say hello on Twitter.
Happy music mapping!
