How to Analyse and Make Sense of Humongous Datasets

Wrap-up: 2015 AAG Annual Meeting in Chicago

I’m currently sitting at Chicago’s O’Hare airport, waiting for my flight back home to Germany. To keep from forgetting too much too soon, and to stay awake now so I can sleep well on the plane, I will try to craft a wrap-up of my AAG 2015. I’ll start with some details about the sessions I attended and finish with a more general recap.

Upcoming Event: 2015 Annual Meeting of the Association of American Geographers (AAG)

While there’s still some time until the 2015 AAG Annual Meeting kicks off in Chicago next spring, the deadline for submitting papers is almost here: November 20th, 2014!

As for me, I will present an algorithm I developed as part of my PhD thesis and in the course of my related research on people’s movements in urban areas:

Konstantin Greger, University of Tsukuba
A Spatio-Temporal Betweenness Centrality Measure for the Micro-Scale Estimation of Pedestrian Traffic

The spatio-temporal mobile population estimation approach I introduce here can be used to calculate an index for the pedestrian traffic volume on street segments divided into deliberately chosen time steps. This is especially useful in the spatial context of highly urbanized areas, as it provides the populations in public space as a complementary element to building populations.

This was achieved by employing a graph theory methodology, namely that of betweenness centrality, and extending it with a temporal dimension. This new model was then applied using a number of datasets that provide information about building populations and train station passenger transfers, disaggregated both spatially and by time.

The introduction of the temporal dimension to the estimation of populations in public space allows for a micro-scale analysis of the actual population figures according to the underlying human activities. I believe that this is the most interesting characteristic of the proposed estimation methodology, since for the first time it allows for a reliable estimation of mobile populations even for large study areas with justifiable requirements in terms of both necessary input data and computational expense.

The output of the spatio-temporal model can be used to visualize the number of pedestrians on the streets of a chosen study area. While the data do not represent the absolute numbers of pedestrians, they do reflect the traffic volume and allow for a comparison of crowdedness, which can be used for further quantitative analyses, such as population density calculations for certain points in time.
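
To make the idea a little more tangible, here is a minimal sketch in Python. It is not the implementation from the thesis, only an illustration under simplifying assumptions: the street network is a small undirected graph, every origin-destination pair (think building to station) is routed along a single shortest path without splitting ties, and the number of people travelling in each time step is added to every street segment on that path. All names (shortest_path, temporal_edge_load, the nodes A, B, C, S) and all numbers are made up for this example.

```python
import heapq
from collections import defaultdict


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a graph given as node -> {neighbour: length}.

    Returns the list of nodes on one shortest path, or None if the
    target cannot be reached from the source.
    """
    dist = {source: 0.0}
    prev = {}
    visited = set()
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbour, length in graph.get(node, {}).items():
            new_dist = d + length
            if new_dist < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_dist
                prev[neighbour] = node
                heapq.heappush(queue, (new_dist, neighbour))
    if target not in dist:
        return None
    # Walk back from the target to the source via the predecessor links.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def temporal_edge_load(graph, flows):
    """Population-weighted edge usage per time step.

    flows maps a time step to {(origin, destination): people}.
    Returns time step -> {edge: load}, where an edge is a sorted
    node pair and the load is a relative index, not a head count.
    """
    loads = {}
    for time_step, od_pairs in flows.items():
        load = defaultdict(float)
        for (origin, destination), people in od_pairs.items():
            path = shortest_path(graph, origin, destination)
            if path is None:
                continue  # unreachable pairs contribute nothing
            for u, v in zip(path, path[1:]):
                load[tuple(sorted((u, v)))] += people
        loads[time_step] = dict(load)
    return loads


# Hypothetical toy network: buildings A and B, junction C, station S.
streets = {
    "A": {"B": 1, "C": 3},
    "B": {"A": 1, "C": 1},
    "C": {"A": 3, "B": 1, "S": 1},
    "S": {"C": 1},
}
# Hypothetical flows: morning trips to the station, evening trips back.
flows = {
    "08:00": {("A", "S"): 100, ("B", "S"): 50},
    "20:00": {("S", "A"): 30},
}
loads = temporal_edge_load(streets, flows)
print(loads["08:00"])
```

In this toy example the segment between B and C carries more of the morning flow than the segment between A and B, while the evening step shows a much lower load on all three segments. As in the abstract, the values are only an index for comparing crowdedness across segments and time steps.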

This year I made an effort to avoid being placed into some random session, as has happened to me in both 2012 and 2014 – in 2013 I went all the way and organized my very own session. Therefore I browsed the (admittedly a wee bit confusing) “abstract and session submission console” on the AAG conference website. There I came across an effort by Prof. Diansheng Guo at the University of South Carolina, who proposed a session (or a series thereof?) labeled “Spatial Data Mining and Big Data Analytics”. I was more than happy to receive almost instantaneous feedback from Prof. Guo, and positive feedback at that!

The details of the session and my presentation are:

Paper Session: Spatial Data Mining and Big Data Analytics (2)
Tuesday, 4/21/2015 10:00 AM – 11:40 AM
304 Classroom, University of Chicago Gleacher Center, 3rd Floor

Here’s the general conference information:

2015 AAG Annual Meeting
April 21 – 25, 2015
Hyatt Regency Chicago

I’m already looking forward to my fourth AAG, and I would be happy to see you there!

The Times They Are a-Changin’

Back in November I had big plans to use my supposedly growing spare time to write here on my website, but life has told me otherwise. The past five months have been anything but relaxing, even though I postponed the finishing-up of my PhD thesis for that long. The time was filled by extending the research of my thesis by two completely new topics, by working on two publications and by actively participating in three international conferences.

Here’s a quick run-down, since I didn’t even announce the conferences as I normally do:


  • I successfully submitted a paper to Transactions in GIS, one of the major GIS journals out there:

    Greger, K. (forthcoming). Spatio-Temporal Building Population Estimation for Highly Urbanized Areas Using GIS. Transactions in GIS. Link

    I don’t know yet when it will be printed, but it’s available as an online early view. I will write a lot more about this in the near future.

  • A concept paper about a research project we recently started at my lab was published in a Japanese publication – the article itself is in English, though:

    Greger, K.; Murayama, Y. (2014). Collection Methods for Spatio-Temporal Personal Movement Data. 平成25年度多目的統計データバンク年報 (Annual Report on the Multi Use Social and Economic Data Bank) Vol.91, 63-83. PDF

  • Also, while it didn’t originate during said timeframe, a book chapter which I co-authored has been published in the meantime:

    Kubo, T.; Yamamoto, T.; Mashita, M.; Hashimoto, M.; Greger, K.; Waldichuk, T.; Matsui, K. (2013). The Relationship Between Community Support and Resident Behavior After the Tohoku Pacific Earthquake: The Case of Hitachi City in Ibaraki Prefecture. In: Neef, A.; Shaw, R. (Eds.) Risks and Conflicts: Local Responses to Natural Disasters. In: Community, Environment and Disaster Risk Management Vol. 14, 11-42. Emerald Group Publishing Limited. Link


  • In November I presented my bicycle commuting research project at the University of Tokyo CSIS Days 2013:

    Greger, K.; Murayama, Y. (2013). Spatio-Temporal Analysis of Bicycle Commuting Behavior in the Greater Tokyo Area Using a Micro-Scale Persontrip Database. 2013年全国共同利用研究発表大会 (Proceedings of the Session of Inter-University Research Activities in Japan “CSIS Days 2013”). Abstract, Poster

  • At the same conference another paper which I co-authored was presented:

    Murayama, Y.; Lwin, K.; Greger, K.; Estoque, R.; Kubo, T. (2013). 位置情報付きのビックデータ(パーソントリップ調査)をWeb-GISでハンドリングする (Handling Big Data with Locational Information (from a Persontrip Survey) using Web-GIS.) Presented at the Session of Inter-University Research Activities in Japan “CSIS Days 2013” on Nov 22. Kashiwa, Japan. (Japanese) Abstract

  • I mentioned in a brief announcement that I would also join the 2014 Annual Meeting of the AAG:

    Greger, K.; Murayama, Y. (2014). Spatio-Temporal Analysis of Bicycle Commuting Behavior in the Greater Tokyo Area Using a Micro-Scale Persontrip Database. Presented at the 2014 Association of American Geographers’ Annual Meeting on Apr 9. Tampa, FL. Abstract

  • One week ago I presented the bicycle project at the 2014 Meeting of the Japan Geoscience Union (JpGU):

    Greger, K.; Murayama, Y. (2014). Spatio-Temporal Analysis of Bicycle Commuting Behavior in the Greater Tokyo Area Using a Micro-Scale Persontrip Database. Presented at the 2014 Japan Geoscience Union (JpGU, 日本地球惑星科学連合) Meeting on April 28. Yokohama, Japan. Abstract

  • Lastly, another paper which I co-authored was presented at the JpGU Meeting:

    Murayama, Y.; Lwin, K.; Greger, K.; Estoque, R.; Kubo, T. (2014). 非集計パーソントリップデータをWeb-GISでハンドリングする (Handling Non-Aggregated Person Trip Data with Web-GIS.) Presented at the 2014 Japan Geoscience Union (JpGU, 日本地球惑星科学連合) Meeting on April 28. Yokohama, Japan. (Japanese) Abstract

That’s all nice and fine, but the most important thing was the work on my PhD thesis. I’m very proud to announce that the work is done! In addition, I successfully defended the dissertation in late April. Now all that’s left for me to do is to present the thesis contents one final time to a public audience (this will happen on May 9, 2014 – but I don’t know the details yet) and make the thesis itself ready for press.

I’m aware that this sounds like a copy & paste from said article in November, but I’m very positive that from now on my duties should leave enough time to finally write more content here on the website. There’s so much I want to write about! A non-comprehensive list:

  • A general introduction to my PhD research and thesis contents.
  • An introduction to the novel methodologies I developed in the course of my dissertation. One of them is the topic of the aforementioned article in Transactions in GIS, and I’m planning on publishing a few others in journals as well.
  • Some content about terrorism in Japan. This is also part of my PhD research, but I have a publication about this topic in mind as well.
  • More details about the bicycle project and the progress it has made over the past months.
  • An introduction to the hybrid movement data collection process I presented in the second publication mentioned above.
  • An introduction to a new research project we have recently started at my lab (the academic year in Japan starts in April). This involves a lot of data analysis, so there should be some interesting applied content here.
  • A number of other applied topics that came up during either my PhD research or one of the other research projects I’m involved in…

So please stay tuned and expect the contents of this website to expand seriously in the near future!

Upcoming Event: 2014 Annual Meeting of the Association of American Geographers (AAG)

Readers of my blog know that I have been attending the Annual Meeting of the Association of American Geographers (AAG) for the past two years. While these meetings took place in some of the largest and most international cities in the USA (New York in 2012 and Los Angeles in 2013), the AAG decided to hold the 2014 meeting in Tampa, FL. There’s nothing wrong with this – I love Florida and it should be nice and warm there in early April – but I can’t help but be a little afraid that the city will be more or less overrun by geographers over the course of the conference week. Also, there seem to be mostly two types of accommodation in Tampa: either luxury hotels that break my budget (even at the “discounted conference rate” of USD 199 per night in select hotels) or shady motels far away from the conference venue…

