I had been planning to write this post for a long time. Not only have I been asked many times how to do this in my daily consulting work, but especially during and after my hands-on training “Stretching the Boundaries with Advanced Mapping” at our Tableau Conference On Tour 2017 in Berlin earlier this year. The question is pretty simple: How can I draw paths in Tableau? Oftentimes these are some kind of movement data, e.g. refugees or flight connections. The way to do this in Tableau is actually very easy – and some of the recently introduced features made it even easier – but it’s imperative to understand how Tableau draws lines and how the data therefore needs to be structured.
You can either read through this or, if you’re more the visual type, jump right to the end of this post where I prepared a little video that guides you through the steps I outlined in the following 750 words.
Step 1: Get the data
For this tutorial I decided to use a data set I have also used in another context: the OpenFlights.org data. It’s available for free, but since it’s really great, very comprehensive, and a lot of effort must have gone into preparing it please consider donating a few bucks to the guys developing and maintaining it! They offer the underlying data to be downloaded directly from the website, for now we’re only interested in the airport data and obviously the routes data. Two minor annoyances here:
- The files are comma separated but use
datas file extension. Normally not a big deal (and Tableau can read them) but to keep things clean I personally prefer to rename them to
- The files are missing headers. Again, technically not a problem for Tableau, but if you (like me) prefer meaningful field names to
F2, etc, feel free to prepend (yes, that’s a word!) the following header lines to the two files. Another option would be to rename the fields in Tableau, but again that sounds like tedious monkey work…
Headers for file
"Airport ID","Name","City","Country","IATA","ICAO","Latitude","Longitude","Altitude","Timezone","DST","Tz database time zone","Type","Source"
Headers for file
"Airline","Airline ID","Source airport","Source airport ID","Destination airport","Destination airport ID","Codeshare","Stops","Equipment"
Step 2: Build the model
This is the critical bit. Or as I like to put it: This is where the magic happens. We’re going to do a union. Actually a self-union. And a join. And we’ll use Join Calculations! Get ready for some data prep right inside Tableau.
- Connect to the file
- Union the file
routes.csvto itself by dragging it out again and making sure you hit the orange “Drag table to union” box underneath the
Note how this automatically adds a new generated field
[Table Name]– we will need this in the next step.
- Create a new field
[Path ID]with the following formula:
IF RIGHT([Table Name], 1) == "1" THEN 2 ELSE 1 END
This will look for the records that came in via the second instance of file
routes.csv, as those show the string
routes.csv1in the field
[Table Name]. If that’s
TRUEwe want the
[Path ID]to be
2(the end of our lines) otherwise
1(vice-versa the beginning of our lines).
- Join with file
airports.csv. The join will not use a regular join condition, but the Join Calculation feature we introduced in Tableau Desktop 10.2! It doesn’t really matter at this stage, but just to be safe, make sure it’s a left join.
In our case the join condition will follow the same logic as the new field
[Path ID]we just created. Only this time we need to make sure to join the respective airport information from the file
[Source airport ID]or the
[Destination airport ID]of the self-union
IF RIGHT([Table Name], 1) == "1" THEN [Destination airport ID] ELSE [Source airport ID] END
The Join Calculation will have to match the field
[Airport ID]from table
airports.csv. In order to exclude routes that don’t have matching airports as well as airports that don’t have routes assigned I suggest opting for an Inner Join here. This will later allow us to plot lines from the latitude/longitude of the source airport, coming from
[Source airport ID], to the latitude/longitude of the destination airport, coming from
[Destination airport ID], using
[Path ID]to draw from source (
1) to destination (
- Now we can draw our map! From the list of measures from data source
[Latitude]to Rows. This will get you one dot on the map. A good start. In order to break it up to show all the separate airports drag out
[Source airport]as well as
[Destination airport]from the list of dimensions from data source
routes.csv+onto the Detail card. That’s not half bad! To actually draw the lines change the mark type using the dropdown list on the Marks card to “Line”. Whoa, that’s rough! Let’s set the drawing order correctly by first making te measure
[Path ID]a dimension (by just dragging it up to the dimensions area) and then assigning it to the Path card. Hence the title “Connecting the dots” – in case you were wondering…
- That’s the result we’re looking for, but it’s still a bit of a mess. Therefore let’s set the Size of the lines to the minimum and change their opacity to 10%. You can do both changes by clicking directly on the Size and Color cards, respectively.
Step 3: Celebrate
You just built a very cool dataviz showing all the flight connections between airports worldwide. Now get creative. A fun idea is to check the route network of your favorite carrier: Just show a filter control for the dimension
[Airline]. Or filter for all routes that are served by your favorite aircraft type. This one is a bit tricky, but I’m sure you can do it. Hint: Use the dimension
[Equipment] and some clever string functions or wildcard filters.
Let me know if you enjoyed this and also what kind of data you used this method on! We’re all interested in seeing your vizzes, so please make sure to leave a link to Tableau Public in the comments. Oh, and here’s the video I promised in the beginning that shows the whole process: