# Pedestrian Flows Characterization and Estimation with Computer Vision Techniques


## Abstract

… person/m² with a standard deviation of about 0.014 person/m². On the other hand, two main speed clusters were identified during the morning and evening hours. The largest group of pedestrians, with an average speed of about 0.77 m/s, was observed along the exit direction of the subway entrances during both morning and evening hours. The second-largest group was observed walking in the opposite direction with an average speed of about 0.65 m/s. The analyses provide initial insights for the future development of a decision-support system to help manage and control pedestrian dynamics.

## 1. Introduction

## 2. The Computer-Vision Model

#### 2.1. The Detection Process

#### 2.2. Tracking Model

## 3. Experimental Setup

## 4. Metrics

The mean average precision at IoU = 0.5 (mAP$_{50}$) is calculated as the mean of the average precision over all the classes of objects detected within a single image, based on the following expression:
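A commonly used form of this expression is given below as an assumption, not as the authors' exact notation. The symbols $N$ (number of detected object classes) and $\mathrm{AP}_{i}$ (average precision of class $i$ at an IoU threshold of 0.5) are introduced here for illustration:

$$
\mathrm{mAP}_{50} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{AP}_{i}\Big|_{\mathrm{IoU}=0.5}
$$

When only one class is evaluated, as with the class “pedestrian”, this form reduces to the average precision of that class.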

## 5. Image Processing

#### 5.1. Speed and Direction

… m² cell area was overlaid on the observation area to obtain a regular tessellation for visualizing speed and density values in different parts of the square. This approach differs from a similar study [10], in which the Voronoi density was computed on the original Voronoi cells, without reference to a regular grid such as the one used in this study.
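The per-cell speed values described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each trajectory is already available as time-stamped positions in metres, and the function names and the `cell_size_m` parameter are hypothetical. Angles are measured counter-clockwise from the +x axis, so that 90° points to 12:00, as in the figure captions.

```python
import math
from collections import defaultdict


def speed_and_direction(track):
    """Return (x, y, speed_m_s, angle_deg) for each step of one track.

    track: list of (t_seconds, x_m, y_m) tuples sorted by time.
    angle_deg lies in [0, 360) and is measured counter-clockwise
    from the +x axis, so 90 degrees points towards +y.
    """
    steps = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order timestamps
        dx, dy = x1 - x0, y1 - y0
        speed = math.hypot(dx, dy) / dt
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        steps.append((x1, y1, speed, angle))
    return steps


def mean_speed_per_cell(steps, cell_size_m):
    """Average the step speeds inside each square cell of a regular grid.

    cell_size_m is a hypothetical parameter (side of a grid cell in metres).
    Returns {(column_index, row_index): mean_speed_m_s}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, speed, _ in steps:
        key = (math.floor(x / cell_size_m), math.floor(y / cell_size_m))
        sums[key][0] += speed
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Usage: one pedestrian walking "up" the image at 0.8 m/s.
track = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.8), (2.0, 0.0, 1.6)]
steps = speed_and_direction(track)
grid = mean_speed_per_cell(steps, cell_size_m=1.0)
```

Keeping the grid as a dictionary keyed by cell indices avoids allocating cells that no pedestrian ever crosses, which suits a sparsely occupied square.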

#### 5.2. Density Analysis

… m² was overlaid on the observed area and intersected with the Voronoi cells. In a homogeneous situation, densities estimated with the Voronoi method do not vary considerably; the method is instead designed to highlight possible inhomogeneities in the density distribution. Such inhomogeneities were apparent during the whole observation period. Therefore, the Voronoi density was estimated as the ratio between the number of people within each square cell and the total area of the Voronoi cells intersecting that square cell. At timestamp $t$, the Voronoi density for each cell ${A}_{k}$ of the regular square grid is defined as:
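One reading of the verbal definition above is $\rho_{V}({A}_{k}, t) = N_{k}(t) / \sum_{i:\, V_{i}(t) \cap {A}_{k} \neq \emptyset} |V_{i}(t)|$, where $N_{k}(t)$ is the number of people inside ${A}_{k}$ and $|V_{i}(t)|$ is the area of the Voronoi cell of person $i$. These symbols, and the choice to sum the full area of every intersecting Voronoi cell, are assumptions made for illustration. The sketch below follows that reading and assumes the Voronoi cells are already available as convex polygons; all function names are hypothetical.

```python
def polygon_area(poly):
    """Shoelace area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def clip_to_square(poly, xmin, ymin, size):
    """Sutherland-Hodgman clipping of a polygon against an axis-aligned square."""
    xmax, ymax = xmin + size, ymin + size

    def clip(points, inside, intersect):
        out = []
        for i in range(len(points)):
            cur, prev = points[i], points[i - 1]
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))
        return out

    def x_cross(x):
        # Only called when the edge p-q crosses the vertical line, so q[0] != p[0].
        return lambda p, q: (x, p[1] + (q[1] - p[1]) * (x - p[0]) / (q[0] - p[0]))

    def y_cross(y):
        # Only called when the edge p-q crosses the horizontal line, so q[1] != p[1].
        return lambda p, q: (p[0] + (q[0] - p[0]) * (y - p[1]) / (q[1] - p[1]), y)

    edges = [
        (lambda p: p[0] >= xmin, x_cross(xmin)),
        (lambda p: p[0] <= xmax, x_cross(xmax)),
        (lambda p: p[1] >= ymin, y_cross(ymin)),
        (lambda p: p[1] <= ymax, y_cross(ymax)),
    ]
    for inside, intersect in edges:
        if not poly:
            break
        poly = clip(poly, inside, intersect)
    return poly


def voronoi_density(people, xmin, ymin, size):
    """Density for one square grid cell at one timestamp.

    people: list of ((x, y), voronoi_polygon) pairs, one per pedestrian.
    Returns people inside the square divided by the summed area of the
    Voronoi cells that overlap the square (0.0 if none overlap).
    """
    xmax, ymax = xmin + size, ymin + size
    n_inside = sum(1 for (x, y), _ in people
                   if xmin <= x < xmax and ymin <= y < ymax)
    total_area = sum(
        polygon_area(cell) for _, cell in people
        if polygon_area(clip_to_square(cell, xmin, ymin, size)) > 0.0
    )
    return n_inside / total_area if total_area > 0.0 else 0.0


# Usage: two pedestrians whose Voronoi cells are 2 m x 2 m rectangles.
people = [
    ((1.0, 1.0), [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]),
    ((3.0, 1.0), [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)]),
]
density_small = voronoi_density(people, xmin=0.5, ymin=0.5, size=1.0)
density_wide = voronoi_density(people, xmin=1.0, ymin=0.0, size=2.0)
```

A Voronoi cell that only touches the square along an edge clips to a zero-area polygon and is therefore not counted as intersecting.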

… person/m². This result is consistent with the time sequence of the Voronoi density shown in Figure 13. On the other hand, when considering the entire area of the square, the mean value of the Voronoi density was about 0.035 person/m², with a standard deviation of about 0.014 person/m² (Figure 8b). This value is consistent with the one found in the previous analysis carried out at the same location during a different time period [10]. Compared to typical results on vehicles, the pedestrian densities estimated in this work showed very low variability. In the heat map of Figure 14, the density ranged from 0.02 person/m² to 0.16 person/m². Therefore, unlike the case of vehicles, the narrow spread of density values was not suitable for building speed–density plots to estimate the relationship between walking speed and pedestrian density [36]. However, in another recent work [10], we estimated the speed–density relationship for the same pedestrian environment by taking advantage of a microscopic simulator.

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Zou, Z.; Shi, Z.; Guo, Y.; Ye, J. Object Detection in 20 Years: A Survey. arXiv **2019**, arXiv:1905.05055.
2. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. arXiv **2018**, arXiv:1703.06870.
3. Wu, Q.; Shen, C.; Wang, P.; Dick, A.; van den Hengel, A. Image Captioning and Visual Question Answering Based on Attributes and External Knowledge. IEEE Trans. Pattern Anal. Mach. Intell. **2018**, 40, 1367–1381.
4. Kang, K.; Li, H.; Yan, J.; Zeng, X.; Yang, B.; Xiao, T.; Zhang, C.; Wang, Z.; Wang, R.; Wang, X.; et al. T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos. IEEE Trans. Circuits Syst. Video Technol. **2018**, 28, 2896–2907.
5. Butenuth, M.; Burkert, F.; Schmidt, F.; Hinz, S.; Hartmann, D.; Kneidl, A.; Borrmann, A.; Sirmacek, B. Integrating pedestrian simulation, tracking and event detection for crowd analysis. In Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, 6–13 November 2011; pp. 150–157.
6. Liberto, C.; Nigro, M.; Carrese, S.; Mannini, L.; Valenti, G.; Zarelli, C. Simulation framework for pedestrian dynamics: Modelling and calibration. IET Intell. Transp. Syst. **2020**, 14, 1048–1057.
7. Sundararaman, R.; De Almeida Braga, C.; Marchand, E.; Pettré, J. Tracking Pedestrian Heads in Dense Crowd. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 3864–3874.
8. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature **2015**, 521, 436–444.
9. Steffen, B.; Seyfried, A. Methods for measuring pedestrian density, flow, speed and direction with minimal scatter. Phys. A Stat. Mech. Its Appl. **2010**, 389, 1902–1910.
10. Dumitru, A.; Karagulian, F.; Liberto, C.; Nigro, M.; Valenti, G. Pedestrian analysis for crowd monitoring: The Milan case study (Italy). In Proceedings of the MT-ITS 2023 8th International Conference on Models and Technologies for Intelligent Transportation Systems, Nice, France, 14–16 June 2023.
11. Lu, Y.-J.; Tang, Y.-Y.; Pirard, P.; Hsu, Y.-H.; Cheng, H.-D. Measurement of Pedestrian Flow Data Using Image Analysis Techniques. Transp. Res. Rec. **1990**, 1281, 87–96.
12. Jiao, D.; Fei, T. Pedestrian walking speed monitoring at street scale by an in-flight drone. PeerJ Comput. Sci. **2023**, 9, e1226.
13. Tokuda, E.K.; Lockerman, Y.; Ferreira, G.B.A.; Sorrelgreen, E.; Boyle, D.; Cesar-Jr., R.M.; Silva, C.T. A new approach for pedestrian density estimation using moving sensors and computer vision. ACM Trans. Spat. Algorithms Syst. **2020**, 6, 1–20.
14. Ismail, K.; Sayed, T.; Saunier, N. Automated Collection of Pedestrian Data Using Computer Vision Techniques. 2009. Available online: http://n.saunier.free.fr/saunier/stock/ismail09automated-tac.pdf (accessed on 8 June 2023).
15. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. arXiv **2016**, arXiv:1506.02640.
16. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv **2018**, arXiv:1804.02767.
17. Nepal, U.; Eslamiat, H. Comparing YOLOv3, YOLOv4 and YOLOv5 for Autonomous Landing Spot Detection in Faulty UAVs. Sensors **2022**, 22, 464.
18. Lin, T.-Y.; Maire, M.; Belongie, S.; Bourdev, L.; Girshick, R.; Hays, J.; Perona, P.; Ramanan, D.; Zitnick, C.L.; Dollár, P. Microsoft COCO: Common Objects in Context. arXiv **2015**, arXiv:1405.0312.
19. Kerner, B.S.; Rehborn, H.; Aleksic, M.; Haug, A. Recognition and tracking of spatial–temporal congested traffic patterns on freeways. Transp. Res. Part C Emerg. Technol. **2004**, 12, 369–400.
20. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv **2015**, arXiv:1512.03385.
21. Jastrzębski, S.; Arpit, D.; Ballas, N.; Verma, V.; Che, T.; Bengio, Y. Residual Connections Encourage Iterative Inference. arXiv **2018**, arXiv:1710.04773.
22. Szandała, T. Review and comparison of commonly used activation functions for deep neural networks. In Bio-Inspired Neurocomputing; Bhoi, A.K., Mallick, P.K., Liu, C.-M., Balas, V.E., Eds.; Studies in Computational Intelligence; Springer: Singapore, 2021; Volume 903, pp. 203–224. ISBN 9789811554940.
23. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv **2020**, arXiv:2004.10934.
24. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. Scaled-YOLOv4: Scaling Cross Stage Partial Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 13029–13038.
25. Balduzzi, D.; Frean, M.; Leary, L.; Lewis, J.P.; Ma, K.W.-D.; McWilliams, B. The Shattered Gradients Problem: If resnets are the answer, then what is the question? arXiv **2018**, arXiv:1702.08591.
26. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Springer International Publishing: Cham, Switzerland, 2014; pp. 818–833.
27. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. arXiv **2018**, arXiv:1708.02002.
28. Rosebrock, A. Intersection over Union (IoU) for Object Detection. 2016. Available online: https://pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/ (accessed on 8 June 2023).
29. Bewley, A.; Ge, Z.; Ott, L.; Ramos, F.; Upcroft, B. Simple Online and Realtime Tracking. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3464–3468.
30. Kálmán, R. A new approach to linear filtering and prediction problems. J. Basic Eng. **1960**, 82, 35–45.
31. Dahua Products. Available online: www.dahuasecurity.com/products/All-Products/Network-Cameras/Consumer-Series/2MP/IPC-HFW1235S-W-S2 (accessed on 12 June 2023).
32. Bernardin, K.; Stiefelhagen, R. Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. J. Image Video Process. **2008**, 2008, 1–10.
33. Milan, A.; Leal-Taixe, L.; Reid, I.; Roth, S.; Schindler, K. MOT16: A Benchmark for Multi-Object Tracking. arXiv **2016**, arXiv:1603.00831.
34. VisAI Labs. Evaluating Multiple Object Tracking Accuracy and Performance Metrics in a Real-Time Setting. Available online: https://visailabs.com/evaluating-multiple-object-tracking-accuracy-and-performance-metrics-in-a-real-time-setting/ (accessed on 22 February 2023).
35. Silgu, M.A.; Çelikoğlu, H.B. K-Means Clustering Method to Classify Freeway Traffic Flow Patterns. Pamukkale J. Eng. Sci. **2014**, 20, 232–239.
36. Yang, X.; Zou, Y.; Chen, L. Operation analysis of freeway mixed traffic flow based on catch-up coordination platoon. Accid. Anal. Prev. **2022**, 175, 106780.

**Figure 1.** Schematic representation of the YOLOv3 model implemented with the neural network Darknet-53.

**Figure 2.** (**a**) Camera system for video collection located in the middle of the square Piazza Duca d’Aosta at Centrale Station in Milan, Italy. (**b**,**c**) Sample frames extracted from a video sequence. Bounding boxes indicate successfully detected people in the image within a time interval of 10 ms. Each bounding box is associated with a unique identifier.

**Figure 3.** Example of a sequence of bounding boxes tracking a detected person with their ID maintained along the trajectory. For illustration purposes and clarity of the tracked path, only 2 frames per second are shown. Two entrance points of the subway are indicated. Axis labels refer to the pixel scale of the image.

**Figure 4.** Example of a tracked ID along its trajectory. For illustration purposes and clarity of the tracked path, only 2 frames per second are shown. Axis labels refer to metric coordinates (EPSG:32624).

**Figure 5.** Distribution values obtained for 3 metrics used to evaluate the YOLOv3 model performance over the observations: confidence of the detection for the class “pedestrian” ($Conf\left(person\right)$), intersection over union (IoU), and mean average precision at IoU = 0.5 (mAP$_{50}$) for the class “pedestrian.” Results refer to an observation period of 14 days during the month of April 2022. Blue dotted lines represent the mean value of the distributions.

**Figure 6.** Distribution values obtained for two metrics used to evaluate tracking precision in the YOLOv3 model: Multiple Object Tracking Accuracy (MOTA) and Multiple Object Tracking Precision (MOTP). Results refer to an observation period of 14 days during the month of April 2022. Blue dotted lines represent the mean value of the distributions.

**Figure 7.** (**a**) Daily and (**b**) hourly profile of the mean (absolute) number of pedestrians in the observed target area.

**Figure 8.** (**a**) Distribution of mean speed and (**b**) Voronoi density during the two-week observation period in April 2022. The ordinate axis label refers to the density distribution, expressed as arbitrary units. Colored dotted lines represent the modeled data distribution.

**Figure 9.** (**a**) Distribution of the angles followed by pedestrians during morning (from 07:00 to 10:00) and (**c**) evening hours (from 17:00 to 20:00) for the entire observation period. The 90° angle corresponds to 12:00. (**b**,**d**) Distribution of speeds during the same time windows.

**Figure 10.** Clustered speed and directions in (**a**) the morning (from 07:00 to 10:00) and (**b**) the evening (from 17:00 to 20:00). The length of the arrows indicates the number of pedestrians for which speed and direction were weighted.

**Figure 11.** Heat map of speed across Piazza Duca d’Aosta during the morning (from 07:00 to 10:00) along the (**a**) entrance and (**b**) exit direction with respect to the station. Arrows indicate the ending point of a trajectory of a group of pedestrians together with its direction. Results are from two weeks of observations during the month of April 2022. Directions with high standard deviation were omitted. Red arrows indicate the entrances of the station. Grey arrows represent the access points of the subway.

**Figure 12.** Heat map of speed across Piazza Duca d’Aosta during the evening (from 17:00 to 20:00) along the (**a**) entrance and (**b**) exit direction with respect to the station. Arrows indicate the ending point of a trajectory of a group of pedestrians together with its direction. Directions with high standard deviation were omitted. Red arrows indicate the entrances of the station. Grey arrows represent the access points of the subway.

**Figure 13.** Time sequence of the Voronoi density and the standard density computed over a square cell in Piazza Duca d’Aosta with high occupancy during the day of 23 April 2022.

**Figure 14.** Heat map of the Voronoi density across Piazza Duca d’Aosta during the morning (from 07:00 to 10:00) and the evening (from 17:00 to 20:00) hours during the whole observation period and over the most crowded part of the square. Blue arrows indicate the entrances of the station.

**Figure 15.** (**a**) Daily and (**b**) hourly profile of the mean Voronoi pedestrian density inside the observation area. Morning time ranges from 07:00 to 10:00, whereas evening time ranges from 17:00 to 20:00.

**Table 1.** Results obtained from clustering of directions during morning (from 07:00 to 10:00) and evening hours (from 17:00 to 20:00) for the entire observation period. The 90° angle corresponds to 12:00. The numerosity indicates the number of pedestrians classified within each cluster. The average speed for each cluster is also reported in the table.

| Period | Cluster | Angle (Degrees) | Numerosity | Speed (m/s) |
|---|---|---|---|---|
| Morning | 0 | 272 ± 26 | 606,213 | 0.77 ± 0.4 |
| Morning | 1 | 86.6 ± 33 | 295,622 | 0.65 ± 0.4 |
| Evening | 0 | 273.0 ± 27 | 503,053 | 0.78 ± 0.5 |
| Evening | 1 | 86.8 ± 34 | 342,903 | 0.64 ± 0.4 |
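The kind of direction clustering summarized in Table 1 can be illustrated with a small sketch. It assumes a two-cluster k-means on unit vectors, which handles the wrap-around at 0°/360°. The algorithm, the farthest-point initialization, and the function name `cluster_directions` are assumptions made for illustration, not the authors' stated method, and the input angles are made-up values.

```python
import math


def cluster_directions(angles_deg, k=2, max_iter=50):
    """Cluster walking directions (degrees) with k-means on unit vectors.

    Returns (labels, centers_deg), where centers_deg are the mean
    directions of the clusters in [0, 360).
    """
    pts = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
           for a in angles_deg]
    if not pts:
        return [], []

    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # Deterministic farthest-point initialization (an illustrative choice).
    centers = [pts[0]]
    while len(centers) < k:
        centers.append(max(pts, key=lambda p: min(dist2(p, c) for c in centers)))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                      for p in pts]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the mean vector of its members.
        for j in range(k):
            members = [p for p, lab in zip(pts, labels) if lab == j]
            if members:
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))

    centers_deg = [math.degrees(math.atan2(cy, cx)) % 360.0
                   for cx, cy in centers]
    return labels, centers_deg


# Usage with made-up headings around 270 degrees and 90 degrees.
angles = [270.0, 275.0, 265.0, 272.0, 85.0, 90.0, 88.0, 92.0]
labels, centers_deg = cluster_directions(angles)
```

Working on unit vectors rather than raw angles keeps headings such as 359° and 1° close together, which a plain k-means on degrees would not.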


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Karagulian, F.; Liberto, C.; Corazza, M.; Valenti, G.; Dumitru, A.; Nigro, M.
Pedestrian Flows Characterization and Estimation with Computer Vision Techniques. *Urban Sci.* **2023**, *7*, 65.
https://doi.org/10.3390/urbansci7020065
