Connecting the Dots – Visualizing Paths in Tableau

I had been planning to write this post for a long time. Not only have I been asked many times how to do this in my daily consulting work, but especially during and after my hands-on training “Stretching the Boundaries with Advanced Mapping” at our Tableau Conference On Tour 2017 in Berlin earlier this year. The question is pretty simple: How can I draw paths in Tableau? Oftentimes these are some kind of movement data, e.g. refugees or flight connections. The way to do this in Tableau is actually very easy – and some of the recently introduced features made it even easier – but it’s imperative to understand how Tableau draws lines and how the data therefore needs to be structured.

The OpenFlights.org route network visualized in Tableau

The OpenFlights.org route network visualized in Tableau

You can either read through this or, if you’re more the visual type, jump right to the end of this post where I prepared a little video that guides you through the steps I outlined in the following 750 words.

Step 1: Get the data

For this tutorial I decided to use a data set I have also used in another context: the OpenFlights.org data. It’s available for free, but since it’s really great, very comprehensive, and a lot of effort must have gone into preparing it please consider donating a few bucks to the guys developing and maintaining it! They offer the underlying data to be downloaded directly from the website, for now we’re only interested in the airport data and obviously the routes data. Two minor annoyances here:

  1. The files are comma separated but use dat as file extension. Normally not a big deal (and Tableau can read them) but to keep things clean I personally prefer to rename them to csv.
  2. The files are missing headers. Again, technically not a problem for Tableau, but if you (like me) prefer meaningful field names to F1, F2, etc, feel free to prepend (yes, that’s a word!) the following header lines to the two files. Another option would be to rename the fields in Tableau, but again that sounds like tedious monkey work…

Headers for file airports.csv

"Airport ID","Name","City","Country","IATA","ICAO","Latitude","Longitude","Altitude","Timezone","DST","Tz database time zone","Type","Source"

Headers for file routes.csv

"Airline","Airline ID","Source airport","Source airport ID","Destination airport","Destination airport ID","Codeshare","Stops","Equipment"

Step 2: Build the model

This is the critical bit. Or as I like to put it: This is where the magic happens. We’re going to do a union. Actually a self-union. And a join. And we’ll use Join Calculations! Get ready for some data prep right inside Tableau.

  1. Connect to the file routes.csv.
  2. Union the file routes.csv to itself by dragging it out again and making sure you hit the orange “Drag table to union” box underneath the routes.csv pill:
    Drag table to union

    Drag table to union


    Note how this automatically adds a new generated field [Table Name] – we will need this in the next step.
  3. Create a new field [Path ID] with the following formula:
    IF RIGHT([Table Name], 1) == "1" THEN 2 ELSE 1 END
    

    This will look for the records that came in via the second instance of file routes.csv, as those show the string routes.csv1 in the field [Table Name]. If that’s TRUE we want the [Path ID] to be 2 (the end of our lines) otherwise 1 (vice-versa the beginning of our lines).

  4. Join with file airports.csv. The join will not use a regular join condition, but the Join Calculation feature we introduced in Tableau Desktop 10.2! It doesn’t really matter at this stage, but just to be safe, make sure it’s a left join.
    Creating a join based on a Calculation

    Creating a join based on a Calculation


    In our case the join condition will follow the same logic as the new field [Path ID] we just created. Only this time we need to make sure to join the respective airport information from the file airports.csv to the [Source airport ID] or the [Destination airport ID] of the self-union routes.csv+:

    IF RIGHT([Table Name], 1) == "1" THEN [Destination airport ID] ELSE [Source airport ID] END
    

    The Join Calculation will have to match the field [Airport ID] from table airports.csv. In order to exclude routes that don’t have matching airports as well as airports that don’t have routes assigned I suggest opting for an Inner Join here. This will later allow us to plot lines from the latitude/longitude of the source airport, coming from [Source airport ID], to the latitude/longitude of the destination airport, coming from [Destination airport ID], using [Path ID] to draw from source ([Path ID] = 1) to destination ([Path ID] = 2).

  5. Now we can draw our map! From the list of measures from data source airports.csv drag [Longitude] to Columns, [Latitude] to Rows. This will get you one dot on the map. A good start. In order to break it up to show all the separate airports drag out [Source airport] as well as [Destination airport] from the list of dimensions from data source routes.csv+ onto the Detail card. That’s not half bad! To actually draw the lines change the mark type using the dropdown list on the Marks card to “Line”. Whoa, that’s rough! Let’s set the drawing order correctly by first making te measure [Path ID] a dimension (by just dragging it up to the dimensions area) and then assigning it to the Path card. Hence the title “Connecting the dots” – in case you were wondering…
  6. That’s the result we’re looking for, but it’s still a bit of a mess. Therefore let’s set the Size of the lines to the minimum and change their opacity to 10%. You can do both changes by clicking directly on the Size and Color cards, respectively.

Step 3: Celebrate

You just built a very cool dataviz showing all the flight connections between airports worldwide. Now get creative. A fun idea is to check the route network of your favorite carrier: Just show a filter control for the dimension [Airline]. Or filter for all routes that are served by your favorite aircraft type. This one is a bit tricky, but I’m sure you can do it. Hint: Use the dimension [Equipment] and some clever string functions or wildcard filters.

Let me know if you enjoyed this and also what kind of data you used this method on! We’re all interested in seeing your vizzes, so please make sure to leave a link to Tableau Public in the comments. Oh, and here’s the video I promised in the beginning that shows the whole process:

9 Comments

        1. Hi Rabih!

          Thanks for waiting for such a long time for my reply -- I finally managed to look into your request. The basic process for your data is the same as the one I outlined in my article, but your data requires a few additional steps.

          This is due to the fact that we need the centroid coordinates of all countries, as unfortunately we can’t use Tableau’s built-in geocoding functionality here. So I went to the internet to find a table of all (more or less…) country coordinates (I used this one). Sadly, this table uses 2-digit ISO country codes, whereas your data uses 3-digit codes, so I needed an additional lookup table (I used this one).

          I then went ahead and built a data model in Tableau Desktop (doing this in Tableau Prep Builder would be even easier!) by using the self-union of the “Trades” tab of the Excel file downloaded from your website (I used the 2018 version) and joining first the country code table and then the country coordinate table. They don’t match up 100%, so for simplicity I used inner joins here -- but you should obviously check for data quality so you don’t lose information at that step… Next, following my explanation, drag out [Longitude] and [Latitude] on Rows and Columns, [Exporter] and [Importer] on Detail, change to Line mark type, make [Path ID] a dimension and drag it to Path, drag SUM([Value]) (or [Weight]…) to Size, and take care of the cosmetics. With your data I recommend also to filter to a minimum value for [Value] so your map isn’t too cluttered.

          I hope this brief explanation was helpful. I also uploaded the 2018 Excel file from your original data source, as well as the two additional tables I used, and an example workbook here -- feel free to download them and start your journey from there (~47 MB).

          Regards
          // Konstantin

          Reply

  1. Hello Konstantin! Thank you for the wonderful tutorial, it really helped me build truck routes based off of what you showed here. I wondering how can we highlight the source and destination airports along with the routes and also the number of flights within a certain duration when we hover over the Source or destination airport, considering the data is available.

    Thank you once again for the help!

    Looking forward to your response.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.