## 1. Introduction

The 2015 outbreak of Middle East Respiratory Syndrome coronavirus (MERS-CoV) posed a serious public health threat in the Republic of Korea. It generated 186 confirmed cases including 38 fatal cases [

1]. It has drawn a lot of attention not only because it was the largest outbreak of MERS-CoV outside the Middle East regions but also because it has been claimed to be a super-spreading event (SSE) similar to the SARS outbreaks in 2003 [

2]. SSE is an outbreak where a small number of infected cases generated a large number of secondary infections [

3]; for example, 20% of infectious individuals are responsible for 80% of newly generated infections. Super-spreading events have been observed in the transmission dynamics of many infectious diseases including SARS transmission in Hong Kong and Singapore [

4]. Stein identified the complex interplay among hosts, pathogens and environments for super-spreading events in many infectious diseases [

5]. Furthermore, Shen et al. defined SSE as a patient transmitting a disease to at least eight contacts, based on an analysis of the SARS spread data [

6].

Concerning the 2015 MERS-CoV outbreak in Korea, previous studies have examined whether it was an SSE. Kucharski and Althaus analyzed a large transmission cluster of MERS-CoV in Korea to estimate the level of transmission heterogeneity and identified a substantial potential for super-spreading events [

7]. Also, Chun identified that MERS-CoV transmission in Korea was an SSE using a negative binomial model [

8]. Kim et al. characterized the transmission dynamics based on a mathematical model and investigated the time-dependent reproduction number [

9]. Gerardo et al. performed the comparative analysis between SARS and MERS; they claimed that MERS-CoV in Korea has shown the most significant heterogeneity in generating secondary cases [

2]. Nishiura et al. suggested that multiple visits to health facilities were the main cause of the disproportionate number of secondary cases and this might lead the outbreak to be an SSE [

10].

The high level of heterogeneity in generating secondary cases, which is one of the key evidence of an SSE, can be observed in the time series of cases in 2015 MERS-CoV outbreak in Korea. In

Figure 1, the time series of the 2015 MERS-CoV cases is displayed and classified using five super-spreaders: the five colors represent the different cases according to super-spreader (yellow color is unknown due to multiple contacts). It is clearly seen that there was significant heterogeneity in the 2015 MERS-CoV transmission: five super-spreaders (3%) infected 79% (147/186) of the total MERS-CoV cases. Transmission trees (infection tracing only) for the total 186 MERS-CoV cases are displayed in

Figure 2. The node size is proportional to the number of secondary cases (five super-spreaders are identified as in

Figure 1).

Having identified the MERS-CoV outbreak in Korea as an SSE, it is necessary to develop a mathematical model that can capture the heterogeneity of infectivity among patients. There have been various approaches to model the SSE in infectious disease transmission dynamics. Firstly, the standard compartmental models have been used. Mkhatshwa and Mummert developed a SIRP model by adding a new superspreader class to a standard SIR model [

11]. They assumed that superspreaders have a longer infectious period than normal infected individuals. While their model was simple and clear, a SIR-type model is inherently limited in capturing the individual heterogeneity of attributes, including contact numbers and infectivity. Another way of modeling SSE is a branching process model. In each discrete time step, an infected individual produces a random number of secondary infections that follows a certain probability distribution. There are a small number of cases whose value is far from the mean and these are used to model super-spreaders. Lloyd-Smith et al. employed a negative binomial distribution and defined an SSE where any single patient infects more than the

nth percentile of the distribution [

12]. James, Pitchford and Plank suggested three branching process models, each of whose random variable follows a different probability distribution [

13]. Garske and Rhodes suggested a branching process model implemented in continuous time instead of discrete generation [

14].

In addition to these methods, agent-based modeling (ABM) can be a useful tool in modeling SSEs. Due to the dramatic increase in computing power and its strength in modeling individual heterogeneity, it has been employed in modeling various aspects of infectious disease transmission dynamics especially in modeling SSEs with spatio-temporal factors. In ABM, SSEs are usually modeled by incorporating a small number of agents (super-spreaders) who can generate a larger number of secondary infections than normal agents. There are a few ways to implement these super-spreaders. The first way is to make them have an inherently higher level of infectivity [

14,

15]: this corresponds to the higher transmission rate in a mathematical model. Another way is to make the infectious period of super-spreaders longer while the infectivity is same with normal agents [

11,

14]. The last way is to make the number of contacts of super-spreader larger than the number of normal agents while the infectivity remains same [

16,

17]. Each of these ways pays attention to different aspects of the super-spreaders and can be useful for capturing and analyzing SSEs.

In particular, a group of studies has employed ABM in modeling SSEs from the perspective of the contact network. Fujie and Odagaki developed agent-based models on an irregular lattice network and suggested two kinds of models in their work; a strong infectiousness model and a hub model [

15]. Comparing the results of two models with SARS transmission data in Hong Kong, they concluded that the hub model provides a closer match with the real-world data. Small, Tse and Walker suggested a small-world network model where individuals have local links that connect to immediate neighbors and non-local links that connect to far-away nodes [

16]. They represented the former as contact with the family and the latter as contact made in public spaces. Bifolchi, Deardon and Feng simulated the epidemic spreads through various contact networks to determine which model best predicts the true probabilities of infection [

17]. Duan et al. suggested an elaborate agent-based model that incorporates a variety of features of agent behaviors that might generate SSE including heterogeneous levels of infectivity and contacts [

18].

The present research aims to develop stochastic agent-based models of the 2015 MERS-CoV transmission dynamics in South Korea. Embracing the power of ABM, heterogeneities among agents are implemented in the models. In particular, regular-spreaders and super-spreaders are set to have different transmission rate functions based on different transmission probabilities and a different number of contacts. Based on these heterogeneities, four models are suggested: no super-spreader model (No-SS), the super-spreaders model with higher infectivity (SSM1), the super-spreaders model with higher contacts (SSM2), and super-spreaders model with both higher infectivity and higher contacts (SSM3). Monte Carlo simulations are carried out under various epidemic scenarios and the results are compared with the real epidemic data of 2015 MERS-CoV in Korea. The distributions of the basic reproduction number and epidemic outputs are computed as well as the distribution of secondary cases under different models of super-spreaders. Furthermore, the effectiveness of isolation intervention strategies and their impact on the MERS-CoV transmission dynamics are investigated.

## 3. Simulation Results

All susceptible agents (either regular or super-spreaders) are located on an

$L\times L$ space, and their positions are randomly assigned at the beginning and fixed during the entire simulation (see

Figure 3). One infected agent (

the index case) is randomly located on the bottom of the simulated space (all other agents are S status). A Monte Carlo simulation is carried out with 1000 trials using the same set of parameter values. The averaged output of 1000 runs (or distributions) is presented from our numerical simulations. Baseline parameter values are given in

Table 1.

#### 3.1. The Impact of Super-Spreaders on Epidemic Outputs

The impact of different super-spreading models is investigated in terms of various epidemic outputs, including incidence, peak size, peak time and epidemic duration. First,

Figure 4 compares incidence profiles using two values of

$\lambda $ (a super-spreader proportion): left with

$\lambda =4\%$ and right with

$\lambda =10\%$. The average incidence of 1000 simulations is displayed as a solid smooth curve along with one particular incidence (a dashed stochastic curve).

Clearly, the effect of super-spreaders makes the disease transmission faster and more severe (see SSM3 in both panels) since super-spreaders in SSM3 have a stronger transmission and longer effective radius than the other models (in order of SSM3 > SSM2 > SSM1 > No-SS). This becomes more profound under a higher proportion of super-spreaders in the right panel ($\lambda =10\%$).

Next, the detailed epidemic distributions are illustrated in

Figure 5, which shows the peak size, peak time, and epidemic duration using

$\lambda =10\%$ with the mean represented by the dashed vertical line. The peak time is earlier and the peak size is larger in SSM3 (again in order of SSM3 > SSM2 > SSM1 > No-SS). This confirms that the impact of super-spreaders is significant on the MERS-CoV transmission dynamics. Also, the detailed epidemic outputs using two values of

$\lambda =4\%$ and

$\lambda =10\%$ are summarized in

Table 2.

Table 2 illustrates the mean and standard deviation under four models for three epidemic outputs (peak size, peak time, and epidemic duration).

Higher peak size, faster peak time, and earlier epidemic durations are the common features of SSE, which is consistent with the results under SSM3 as shown in the above epidemic outputs. This is due to the larger number of contacts and higher transmission rates resulting in the larger epidemic peak in a shorter period of time.

#### 3.2. The Impact of Super-Spreaders on the Basic Reproduction Number

The basic reproduction number is one of the most important quantities in mathematical epidemiology. It defines the average number of secondary infections in a completely susceptible population. One can obtain an analytic expression of the basic reproduction number (

${\mathcal{R}}_{0}$) for some compartment models [

19]. However, in general, there is no analytic expressions of

${\mathcal{R}}_{0}$ for agent-based models, hence, we employ the method for determining

${\mathcal{R}}_{0}$ described in [

20]. We compute the basic reproduction number from a randomly chosen infected individual in a completely susceptible population and obtain the distributions of 1000 simulations of secondary cases.

Figure 6 shows the distributions of

${\mathcal{R}}_{0}$ in the absence of super-spreaders (No-SS) and the mean is the dashed vertical line. Sensitivity analyses of

${\mathcal{R}}_{0}$ are conducted by varying baseline transmission rates and an effective radius. In our simulation, the effective radius has the most significant impact and higher transmission rates lead to larger

${\mathcal{R}}_{0}$.

Next,

Figure 7 illustrates the distributions of

${\mathcal{R}}_{0}$ under four super-spreader models (using four different transmission rate functions). It shows the impact of super-spreaders on

${\mathcal{R}}_{0}$ using two values of

$\lambda $ (super-spreader proportion): on the top with

$\lambda =4\%$ and the bottom with

$\lambda =10\%$. Since the basic reproduction number is computed only the first generation by the index case, the differences for

${\mathcal{R}}_{0}$ distributions of four models are smaller when

$\lambda =4\%$. However, the impact of super-spreaders in SSM3 becomes more significant when

$\lambda =10\%$, with the maximum mean around 5 of SSM3 (the means are in order of SSM3 > SSM2 > SSM1 > No-SS in the bottom panels). Please note that there exists a larger right tail in SSM3 (in order of SSM3 > SSM2 > SSM1 > No-SS). Obviously, this implies that there is a higher probability for a larger number of secondary cases to appear under SSM3.

#### 3.3. Further Sensitivity Analysis

The impact of three epidemic parameters is determined through incidence curves. First, the effect of the baseline transmission rate (

${\beta}_{0}$) on incidence has been investigated in

Figure 8. In general, disease transmission becomes faster and stronger as

${\beta}_{0}$ increases; the peak time is earlier and the peak height is larger in all four models. Next, the impact of an effective contact radius (

${r}_{0}$) has been explored in

Figure 9. There is a critical value of

${r}_{0}$ that determines the outbreak as pointed out in [

15]. There is no outbreak when

${r}_{0}\le 2$ and outbreaks occur when

${r}_{0}>2$. When

${r}_{0}$ reaches around 2.5, outbreaks take place in SSM2 and SSM3 and they get stronger in size and larger in speed as

${r}_{0}$ gets larger. It is straightforward that if patients can infect other individuals at a further distance, then disease will be transmitted faster. It accelerates the change of proportion of cumulative incidence by the different values of

${r}_{0}$.

Also, the proportion of super-spreaders (

$\lambda $) is varied and the incidence profiles were compared (results are not shown). As the proportion of super-spreaders gets larger, the peak size becomes larger and the peak time earlier in SSM2 and SSM3, while it does not have a remarkable impact in SSM1. This suggests that a longer effective radius makes the disease spread easier. Lastly, we investigated the effect of the exponent term

$\alpha $ on the transmission rate for super-spreaders. We vary the value of

$\alpha $ (the exponent term in

$\beta (r)$) for SSM1, SSM2 and SSM3. The transmission rate function decreases rapidly as

$\alpha $ increases (results are not shown here). This leads to the strength and the speed of disease transmission decreasing as

$\alpha $ increases. More details for these results can be found in [

21].

#### 3.4. The Impact of Super-Spreaders on Distributions of Secondary Cases

As observed in many infectious diseases, we explore the distributions of secondary cases under four models. Again, 1000 simulations for each model have been carried out and the mean distributions are obtained.

Figure 10 illustrates the mean distribution of the number of links under each model. In the absence of super-spreaders, No-SS shows monotone decreases in the number of links (the most homogeneous in four model outputs with the shortest tail from zero to six) (a). As the transmission rates become stronger and the number of contacts increases, the heterogeneity becomes more severe (in order of SSM3 > SSM2 > SSM1 > No-SS). As shown in (d), SSM3 exhibits the highest level of heterogeneity in the secondary cases (the highest frequency at zero and also the largest right tail). This is consistent with the results of the basic reproduction number, as shown in

Figure 7.

We have fitted the distribution with a power function

$(a{x}^{b})$, where

x is the number of secondary cases, and compared the slope of each fitted function to the 2015 MERS-CoV data. First, the left panel in

Figure 11 displays the distributions of secondary cases for the 2015 MERS-CoV (blue bar) and the fitted power law distributions (black circled curve). Next, the distributions of secondary cases are fitted as a power law using the results under four different models with

$\lambda =10\%$. The right panel compares four fitted power law distributions with the MERS-CoV data. Obviously, SSM3 shows the best fit to the distribution of secondary cases for the MERS-CoV data (compare the slopes of the green squared and black circled lines). Parameters (

a and

b in

$a{x}^{b}$) are estimated for the MERS-CoV data and the four models by the least-squares method. For the least-squares fitting procedure, we used the Levenberg—Marquardt method with line-search implemented in MATLAB (The Mathworks, Inc., Natick, MA, USA) in the built-in routine

lsqcurvefit which is part of the optimization toolbox. The resulting parameter estimates are listed in

Table 3.

#### 3.5. The Impact of Isolation Intervention Strategies

Lastly, we investigate the impact of isolation interventions on the MERS-CoV dynamics. Two distinct isolation strategies are implemented: a random and a targeted intervention. First, a random intervention isolates individuals (20% randomly chosen; both regular and super-spreaders) from the S and E classes at

$t=14$ day. Second, a targeted intervention isolates individuals (super-spreaders only starting from

$t=14$ day until the end of epidemic duration) from the E and I classes. Here, a targeted isolation intervention means an individual is isolated and either has effective contact with infected individuals (exposed) or is infected by super-spreaders. The results under the three models (SSM1, SSM2 and SSM3) are displayed in

Figure 12. In all three models, a random selection reduces the peak size proportionally, resulting in a longer epidemic duration (red incidence curves in (a), (b), (c)).

Please note that a targeted isolation reduces the peak size significantly at the beginning which can delay the highest peak and result in more preparation time (green incidence curves in (b), (c)). The implementation of targeted isolation while incapable of eliminating a possible outbreak, still manages to reduce the magnitude of the epidemic peak by distributing both the infection cases and hospitalization over a broader window of time. Distributing the MERS-CoV case burden over long windows in time is highly desirable when resources are limited.

A targeted intervention is more effective in both SSM2 and SSM3, while a random intervention is more effective in SSM1. It is worth noting that the effectiveness of interventions depends on the infected-network structures. These results suggest that the effectiveness of interventions depends on the characteristics of the MERS-CoV transmission dynamics. Hence, it highlights the roles of super-spreaders in the transmission dynamics of infectious disease and planning future interventions.

## 4. Discussions

In this study, we built a stochastic agent-based model on the 2015 MERS-CoV outbreak in Korea from the perspective of SSE. The outbreak was an SSE: almost 80% of 186 confirmed cases were infected by five super-spreaders (147/186). Surprisingly, one super-spreader alone infected 53% of the MERS-CoV cases via other super-spreaders (79/147). To devise effective intervention strategies and countermeasures for future emerging infectious diseases, it is necessary to clarify the generating mechanism of SSEs. In this regard, agent-based modeling is a useful tool for incorporating individual heterogeneity into the epidemic model. The present research developed and analyzed agent-based models with different assumptions concerning the feature of super-spreaders.

Our results indicate that the connectivity of individuals and a higher level of infectivity are the most significant factors for the MERS-CoV transmission dynamics. The effect of super-spreaders was not remarkably significant in SSM1, but it was in SSM2 and SSM3. The higher baseline transmission rate and the longer contact range produced a more rapid and severe disease spread in SSM2 and SSM3 than in SSM1. Due to the features of super-spreaders, the real-world data demonstrated the best fit with SSM3 compared to SSM1 or SSM2. The impact of the effective radius on the basic reproduction number ${\mathcal{R}}_{0}$ was the most significant. Also, higher transmission rates led to the larger ${\mathcal{R}}_{0}$.

Furthermore, the impact of super-spreaders on the distributions of ${\mathcal{R}}_{0}$ under SSM3 became more significant as a proportion of super-spreaders increased (the largest right tails in SSM3), which was consistent with the distributions results of secondary cases. SSM3 showed the highest level of heterogeneity in the distributions of ${\mathcal{R}}_{0}$ and secondary cases while No-SS gave the least heterogeneity. All results above suggest that SSE can be more properly characterized by a small number of infectious individuals who can infect others by means of a larger amount of contact with others combined with higher infectivity. Having only one feature (either higher infectivity or higher connectivity) is not strong enough to generate such a high level of heterogeneity in secondary cases.

The effectiveness of two isolation strategies was examined: a random isolation reduced incidence proportionally while a targeted isolation reduced the peak size and delayed the peak time. A targeted isolation was more effective in both SSM2 and SSM3 while a random isolation was more effective in SSM1. These results suggest that the effectiveness of intervention strategies is dependent on the structure of the infection network. However, super-spreaders can be identified in retrospectively; it is very challenging to predict SSEs or super-spreaders. Thus, a targeted isolation intervention might be idealistic to be implemented. Nevertheless, we can investigate the most critical factors for generating SSEs and these factors can provide how to devise effective intervention strategies as discussed in [

5,

22]. For instance, various factors include co-infection with another pathogen, immune suppression, changes in airflow dynamics, delayed hospital admission, misdiagnosis, and inter-hospital transfers. These factors are worthy of investigation and should be incorporated into targeted intervention strategies. Our results showed that even when early detection was unsuccessful, it is still effective to find possible candidates for super-spreaders and isolate them to reduce potential risks of severe outbreaks.

One of the limitations of this research lies in the shape of the space where the simulations were executed. Although it is well known that infectious disease transmission dynamics depends on the structure of the infection network, we based our model on a random network in order to focus on the effect of super-spreaders on disease transmission. Despite this simplification, our results from a simple random network clearly demonstrated the role of super-spreaders on disease transmission and the possible effective intervention scenarios. Further studies are expected to conduct epidemic models based on other network structures, especially more realistic ones. For example, using the empirical contact network of the 2015 MERS-CoV transmission in South Korea would make the simulated space more realistic. Also, the effectiveness of more various intervention strategies and possible epidemic scenarios can be explored in future research.