Metadata from photo-sharing websites such as Flickr can be used to obtain rich bag-of-words descriptions of geographic locations, which have proven valuable for, among other things, modelling and predicting ecological features. One important insight from previous work is that the descriptions obtained from Flickr tend to be complementary to the structured information that is available from traditional scientific resources. To better integrate these two diverse sources of information, in this paper we consider a method for learning vector space embeddings of geographic locations. We show experimentally that this method improves on existing approaches, especially in cases where structured information is available.
Spatiotemporal modelling is an important task for ecology. Social media tags have been found to have great potential to assist in predicting aspects of the natural environment, particularly through the use of machine learning methods. Here we propose a novel spatiotemporal embeddings model, called SPATE, which is able to integrate textual information from the photo-sharing platform Flickr and structured scientific information from more traditional environmental data sources. The proposed model can be used for modelling and predicting a wide variety of ecological features such as species distribution, as well as related phenomena such as climate features. We first propose a new method based on spatiotemporal kernel density estimation to handle the sparsity of Flickr tag distributions over space and time. Then, we efficiently integrate the spatially and temporally smoothed Flickr tags with the structured scientific data into low-dimensional vector space representations. We experimentally show that our model is able to substantially outperform baselines that rely only on Flickr or only on traditional sources.
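To make the smoothing step concrete, the following is a minimal sketch of spatiotemporal kernel density estimation for a single tag. It is not taken from SPATE itself: the function name `spatiotemporal_kde`, the product of Gaussian kernels, the planar coordinates, the linear (non-cyclic) treatment of time, and the bandwidths `h_space` and `h_time` are all assumptions made for illustration.

```python
import numpy as np


def spatiotemporal_kde(query, events, h_space, h_time):
    """Estimate the smoothed density of one tag at a query point.

    Hypothetical helper, not the SPATE implementation.
    query  : (x, y, t) in planar coordinates and a linear time unit
    events : rows of (x, y, t), one per photo carrying the tag
    h_space, h_time : assumed Gaussian bandwidths for space and time
    """
    events = np.asarray(events, dtype=float)
    if events.size == 0:
        # No photos carry this tag, so the estimate is zero everywhere.
        return 0.0
    q = np.asarray(query, dtype=float)

    # Squared spatial and temporal offsets from the query point.
    d2_space = np.sum((events[:, :2] - q[:2]) ** 2, axis=1)
    d2_time = (events[:, 2] - q[2]) ** 2

    # Product kernel: a 2-D Gaussian in space times a 1-D Gaussian in time.
    k_space = np.exp(-0.5 * d2_space / h_space**2) / (2 * np.pi * h_space**2)
    k_time = np.exp(-0.5 * d2_time / h_time**2) / (np.sqrt(2 * np.pi) * h_time)

    # Average the kernel contributions of all tagged photos.
    return float(np.mean(k_space * k_time))
```

Under these assumptions, a location and time with no tagged photos of its own still receives a non-zero weight from nearby photos, which is how smoothing of this kind counteracts sparsity.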
We propose a method which uses Flickr tags to predict a wide variety of environmental features, such as climate data, land cover categories, species occurrence, and human assessments of scenicness. The role of Flickr tags in our method is two-fold. First, we show that Flickr tags capture information which is highly complementary to what is found in traditional structured environmental datasets. By combining Flickr tags with traditional datasets, we can thus make more accurate predictions than is possible using either Flickr tags or traditional datasets alone. Second, we propose a collective prediction model which crucially relies on Flickr tags to define a neighbourhood structure. The use of a collective prediction formulation is motivated by the fact that most environmental features are strongly spatially autocorrelated. While this suggests that geographic distance should play a key role in determining neighbourhoods, we show that considerable gains can be made by additionally taking Flickr tags and traditional data into consideration.
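As an illustration of how a neighbourhood structure might draw on both geographic distance and Flickr tags, the sketch below builds a row-normalised neighbour matrix and applies one smoothing step to per-location predictions. This is a hypothetical construction, not the collective prediction model described above: the names `neighbour_weights` and `collective_smooth`, the exponential distance decay, the cosine similarity over non-negative tag vectors, and the parameters `k`, `alpha`, `scale`, and `lam` are all assumptions.

```python
import numpy as np


def neighbour_weights(coords, tag_vectors, k=2, alpha=0.5, scale=1.0):
    """Row-normalised weights over each location's k best neighbours.

    Hypothetical helper. Assumes non-negative bag-of-words tag vectors
    and 0 < alpha <= 1, so that every combined score is positive.
    """
    coords = np.asarray(coords, dtype=float)
    tags = np.asarray(tag_vectors, dtype=float)
    n = len(coords)
    k = min(k, n - 1)  # a location cannot be its own neighbour

    # Geographic affinity decays exponentially with Euclidean distance.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    geo = np.exp(-dist / scale)

    # Cosine similarity between tag vectors (zero vectors stay zero).
    norms = np.linalg.norm(tags, axis=1, keepdims=True)
    unit = tags / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T

    # Blend both signals and exclude self-matches.
    score = alpha * geo + (1 - alpha) * sim
    np.fill_diagonal(score, -np.inf)

    # Keep the k highest-scoring neighbours per row, then normalise.
    idx = np.argsort(-score, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    weights = np.zeros((n, n))
    weights[rows, idx] = score[rows, idx]
    weights /= weights.sum(axis=1, keepdims=True)
    return weights


def collective_smooth(local_pred, weights, lam=0.5):
    """Blend each local prediction with the weighted mean of its neighbours."""
    local_pred = np.asarray(local_pred, dtype=float)
    return (1 - lam) * local_pred + lam * weights @ local_pred
```

With `alpha=1.0` the neighbourhood depends on geographic distance alone; lowering `alpha` lets locations with similar tags influence each other more strongly, which is the kind of trade-off the text above refers to.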