(e.g., a restaurant), allowing for greater analysis of venue type; (iii) checkins can be augmented with short messages, providing partial insight into the thoughts and motivations of users of these services.
To begin our study, we first require a collection of checkins. Since personal checkin information on location sharing services like Foursquare, Gowalla, and Facebook Places is typically restricted to a user’s immediate social circle (and hence unavailable for sampling), we take an approach in which we sample location sharing status updates from the public Twitter feed. Twitter status messages support the inclusion of geo-tags (latitude/longitude) as well as third-party location sharing services like Foursquare and Gowalla (where users of these services opt in to share their checkins on Twitter). We monitor Twitter’s gardenhose streaming API (∼1% of the entire Twitter public timeline) and retrieve users who post geo-tagged status updates. For each sampled user, we crawl up to the 2,000 most recent geo-labeled tweets. The location crawler ran from late September 2010 to late January 2011, resulting in a total collection of 225,098 users and 22,506,721 unique checkins.
The 22 million checkins were posted from more than 1,200 applications; the distribution of sources is displayed in Table 1. More than 53% of the checkins are from Foursquare, and most of the other checkins are from Twitter’s applications on mobile platforms like Blackberry, Android, and iPhone. A few hundred thousand checkins are from other location sharing services like Gowalla, Echofon, and Gravity.
Format of the Data: Each checkin is stored as a tuple identified by checkin(userID, tweetID). For the example checkin, the user has 2,771 total status updates, 255 followers, and is following 926 users.
Filtering Noise: Many location sharing services provide some mechanism to verify that a user is actually at or near the venue where they are checking in (e.g., by cross-checking with a user’s cellphone GPS) (Foursquare 2010); however, there can still be incidents of false checkins. Hence, we additionally filter out all checkins from users whose consecutive checkins imply a rate of speed faster than 1,000 miles per hour (that is, faster than an airplane). In total, we filtered 294 users (0.1%) with sudden moves, yielding a final collection of 224,804 users and 22,388,315 checkins. More than 72% of users have fewer than 100 checkins, 7.8% have more than 300 checkins, and 3.6% have more than 500.
Locating Each User’s “Home”: Some of the analysis in the following sections requires that we first associate each user with a natural “home”, so that, for example, we can compare the properties of all users “from” New York City versus users “from” Los Angeles. Since users of location sharing services are not required to register a home location, we must algorithmically determine the home location.
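To make the sudden-move filter from the noise-filtering step concrete, the following is a minimal sketch rather than the crawler's actual code. It assumes each checkin is reduced to a hypothetical `(timestamp_seconds, latitude, longitude)` triple, measures distance with the haversine great-circle formula (the text does not say which distance measure was used), and skips pairs of checkins with identical timestamps because their speed is undefined. The function names and the Earth-radius constant are illustrative choices.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an illustrative constant
MAX_SPEED_MPH = 1000.0       # threshold stated in the text


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def has_sudden_move(checkins, max_speed_mph=MAX_SPEED_MPH):
    """Return True if any two consecutive checkins imply a speed above the limit.

    `checkins` is a list of (timestamp_seconds, lat, lon) triples; this layout
    is an assumption made for the sketch, not the stored tuple format.
    """
    ordered = sorted(checkins)  # sort chronologically by timestamp
    for (t1, lat1, lon1), (t2, lat2, lon2) in zip(ordered, ordered[1:]):
        hours = (t2 - t1) / 3600.0
        if hours <= 0:
            continue  # assumption: identical timestamps give no usable speed
        if haversine_miles(lat1, lon1, lat2, lon2) / hours > max_speed_mph:
            return True
    return False


def filter_users(checkins_by_user, max_speed_mph=MAX_SPEED_MPH):
    """Drop every user (and all of their checkins) with a sudden move."""
    return {
        user: checkins
        for user, checkins in checkins_by_user.items()
        if not has_sudden_move(checkins, max_speed_mph)
    }


if __name__ == "__main__":
    nyc = (40.7128, -74.0060)
    la = (34.0522, -118.2437)
    sample = {
        "plausible_user": [(0, *nyc), (6 * 3600, *la)],  # six hours apart
        "suspect_user": [(0, *nyc), (3600, *la)],        # one hour apart
    }
    print(sorted(filter_users(sample)))
```

Removing the whole user, rather than only the offending pair of checkins, follows the text's description of filtering out all checkins from such users.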