CDR (Call Detail Record) data are one type of mobile phone data collected by operators each time a user initiates/receives a phone call or sends/receives an sms. CDR data are a rich geo-referenced source of user behaviour information. In this work, we perform an analysis of CDR data for the city of Milan that originate from Telecom Italia Big Data Challenge. A set of graphs is generated from aggregated CDR data, where each node represents a centroid of an RBS (Radio Base Station) polygon, and each edge represents aggregated telecom traffic between two RBSs. To explore the community structure, we apply a modularity-based algorithm. Community structure between days is highly dynamic, with variations in number, size and spatial distribution. One general rule observed is that communities formed over the urban core of the city are small in size and prone to dynamic change in spatial distribution, while communities formed in the suburban areas are larger in size and more consistent with respect to their spatial distribution. To evaluate the dynamics of change in community structure between days, we introduced different graph based and spatial community properties which contain latent footprint of human dynamics. We created land use profiles for each RBS polygon based on the Copernicus Land Monitoring Service Urban Atlas data set to quantify the correlation and predictivennes of human dynamics properties based on land use. The results reveal a strong correlation between some properties and land use which motivated us to further explore this topic. The proposed methodology has been implemented in the programming language Scala inside the Apache Spark engine to support the most computationally intensive tasks and in Python using the rich portfolio of data analytics and machine learning libraries for the less demanding tasks.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited