User clients for accessing the Internet are increasingly shifting from desktop computers to cellular devices. To stay competitive in this rapidly changing market, operators, Internet service providers, and application developers need to be able to recognize cellular device models and understand the traffic dynamics of cellular data networks. In this paper, we propose a novel Jaccard measurement-based method to recognize cellular device models from network traffic data. The method is implemented as a scalable parallel MapReduce program and achieves a high accuracy of 91.5% in an evaluation with 2.9 billion traffic records collected from a real network. Based on the recognition results, we conduct a comprehensive study of three characteristics of network traffic from the device model perspective: the network access time, the traffic volume, and the diurnal patterns. The analysis results show that the distribution of network access time can be modeled by a two-component Gaussian mixture model, and that the distribution of traffic volumes is highly skewed and follows a power law. In addition, seven distinct diurnal patterns of cellular device usage are identified by applying an unsupervised clustering algorithm to the collected massive traffic data.
RECOGNIZING AND CHARACTERIZING DYNAMICS OF CELLULAR DEVICES
We propose a three-step method to identify an appropriate keyword from unformatted textual HTTP headers to represent a cellular device model. The first step extracts all keywords that could possibly describe a device model. The second step selects a small set of candidate keywords by evaluating the conditional probability of each keyword given a device model, which reduces the computational workload. In the last step, the Jaccard coefficient [6] of each candidate keyword is calculated, and the keyword with the largest value is selected to represent the device model.
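The three steps can be illustrated with a minimal single-machine Python sketch. It is not the paper's implementation, and several details are assumptions made only for illustration: each record is taken to be a hypothetical pair of a known device-model label and a header string, keywords are obtained with a simple tokenizing regular expression, candidates are the `top_n` keywords ranked by conditional probability, and the Jaccard coefficient is computed between the set of records containing a keyword and the set of records belonging to the model. The names `extract_keywords`, `pick_keyword`, `top_n`, and `SAMPLE_RECORDS` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed tokenizer: runs of letters, digits, '_' and '-' are treated as keywords.
TOKEN_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_\-]*")


def extract_keywords(header):
    """Step 1: collect every token that could describe a device model."""
    return set(TOKEN_RE.findall(header))


def pick_keyword(records, model, top_n=10):
    """Return the candidate keyword with the largest Jaccard coefficient.

    records: list of (device_model, header_text) pairs (hypothetical format).
    Returns None if no record carries the given model label.
    """
    model_ids = set()                # indices of records labeled with `model`
    keyword_ids = defaultdict(set)   # keyword -> indices of records containing it
    for idx, (label, header) in enumerate(records):
        if label == model:
            model_ids.add(idx)
        for kw in extract_keywords(header):
            keyword_ids[kw].add(idx)
    if not model_ids:
        return None

    # Step 2: keep the top_n keywords by P(keyword | model) to limit the work.
    def cond_prob(kw):
        return len(keyword_ids[kw] & model_ids) / len(model_ids)

    candidates = sorted(keyword_ids, key=lambda kw: (-cond_prob(kw), kw))[:top_n]

    # Step 3: Jaccard coefficient |A ∩ B| / |A ∪ B| for each candidate.
    def jaccard(kw):
        ids = keyword_ids[kw]
        return len(ids & model_ids) / len(ids | model_ids)

    # Ties are broken alphabetically so the result is deterministic.
    return min(candidates, key=lambda kw: (-jaccard(kw), kw))


# Made-up sample data, for illustration only.
SAMPLE_RECORDS = [
    ("GT-I9100", "Mozilla/5.0 (Linux; Android 2.3; GT-I9100 Build/GINGERBREAD)"),
    ("GT-I9100", "Dalvik/1.4.0 (Linux; U; Android 2.3; GT-I9100)"),
    ("HTC Desire", "Mozilla/5.0 (Linux; Android 2.3; HTC_Desire Build/GRI40)"),
    ("iPhone 4", "Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X)"),
]

if __name__ == "__main__":
    print(pick_keyword(SAMPLE_RECORDS, "GT-I9100"))
```

In this toy data, tokens such as `Linux` and `Android` appear in every record of the target model, so they pass the conditional-probability filter, but they also appear in records of another model. The Jaccard coefficient penalizes that overlap, which is why a model-specific token is preferred in the last step.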
To meet the performance and scalability requirements of handling billions of HTTP records, the algorithm is implemented as a parallel program based on the MapReduce programming model. An evaluation in a real network shows that the proposed method achieves a high recognition accuracy of 91.5%.
With the capability of recognizing cellular device models, we study the traffic of the cellular data network from a new perspective, the device model perspective. Our goal is to understand whether and how the traffic differs across cellular device models. In particular, we investigate three basic characteristics of different device models: the distribution of network access time, the distribution of traffic volume, and the diurnal usage patterns.
The key contributions of this paper are twofold: