Zipf-like distributions characterize a wide set of phenomena in physics, biology, economics, and social sciences. In human activities, Zipf's law describes, for example, the frequency of appearance of words in a text or the purchase types in shopping patterns. In the latter, the uneven distribution of transaction types is bound up with the temporal sequence of each individual's purchases. In this work, we define a framework that applies a text compression technique to sequences of credit card purchases to detect ubiquitous patterns of collective behavior. Clustering the consumers by the similarity of their purchase sequences, we detect five consumer groups. Remarkably, a post hoc check shows that individuals in each group are also similar in age, total expenditure, gender, and the diversity of their social and mobility networks extracted from their mobile phone records. By properly deconstructing transaction data with Zipf-like distributions, this method uncovers sets of significant sequences that reveal insights into collective human behavior.
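As a rough illustration of what a Zipf-like distribution of transaction types means, the sketch below ranks purchase categories by frequency and estimates the exponent of the rank-frequency relation with a least-squares fit in log-log space. It is not the compression-based framework of the paper; the function names and the toy `purchases` list are hypothetical and chosen only for this example.

```python
from collections import Counter
import math


def rank_frequency(sequence):
    """Return (rank, count) pairs, most frequent item first (rank starts at 1)."""
    counts = sorted(Counter(sequence).values(), reverse=True)
    return list(enumerate(counts, start=1))


def zipf_exponent(sequence):
    """Estimate s in count ~ rank**(-s) by ordinary least squares on logs.

    Requires at least two distinct items in the sequence.
    """
    pairs = rank_frequency(sequence)
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # slope is negative for a decaying law, so flip the sign


# Hypothetical toy data: counts are 60 / rank, i.e. an exact Zipf law with s = 1.
purchases = (
    ["grocery"] * 60
    + ["restaurant"] * 30
    + ["fuel"] * 20
    + ["taxi"] * 15
    + ["pharmacy"] * 12
)

print(rank_frequency(purchases))
print(round(zipf_exponent(purchases), 3))
```

On real transaction data the fit would only be approximate, and a least-squares fit on logs is a simple stand-in for more careful estimators such as maximum likelihood.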
In the last decade, the digital age has sharply redefined the way we study human behavior. With the advancement of data storage and sensing technologies, electronic records now encompass a diverse spectrum of human activity, ranging from location data 1, 2, phone 3, 4 and email communication 5 to Twitter activity 6 and open-source contributions on Wikipedia and OpenStreetMap 7, 8. In particular, the study of the shopping and mobility patterns of individual consumers has the potential to give deeper insight into the lifestyles and infrastructure of a region. Credit card records (CCRs) provide detailed insight into purchase behavior and have been found to show inherent regularity in consumer shopping patterns 9; call detail records (CDRs) present new opportunities to understand human mobility 10, analyze wealth 11, and model social network dynamics 12. Regarding the analysis of CDR data, there exists a wide body of work characterizing human mobility patterns. As a notable example, ref. 10 describes the temporal and spatial regularity of human trajectories, showing that each individual can be described by a time-independent travel distance and a high probability of returning to a small number of locations. Further, the authors are able to model individual travel patterns using a single spatial probability distribution. There has also been work at the intersection of similar datasets, such as the inference of friendships from mobile phone data 13, or the analysis of such data in relation to metrics of spending behavior such as diversity, engagement, and loyalty 14. Recent work 15 uses the Jaccard distance as a similarity measure on motifs among spending categories, then applies community detection algorithms to find clusters of users. These studies propose models for either mobility or spending behavior, but not both in conjunction. The only known paper that incorporates both aspects 16 frames its analysis only on the aggregate scale of city regions.
However, the coupled collaborative filtering methods (also known as collective matrix factorization) used in ref. 16 have been successfully applied in a variety of urban computing applications for data fusion and prediction 17, 18, 19, from location-based activity recommendations 20, 21 to travel speed estimation on road segments 22. Recent work includes methods that use Laplacian regularization 23 to leverage social network information, and methods that use geometric deep learning for matrix completion to model nonlinearities 24. In this chapter, we jointly model the lifestyles of individuals, a more challenging problem with higher variability than the aggregated behavior of city regions. Using collective matrix factorization, we propose a unified dual view of lifestyles. Understanding these lifestyles will not only inform commercial opportunities, but also help policymakers and nonprofit organizations understand the characteristics and needs of the entire region, as well as of the individuals within that region. The applications of this range from targeted advertisements a...