
Estimation of Treatment Effects in Repeated Public Goods Experiments

1 School of Economics, Shandong University, Shandong 250100, China
2 School of Economic, Political and Policy Sciences, University of Texas at Dallas, Richardson, TX 75080, USA
* Author to whom correspondence should be addressed.
Submission received: 16 February 2018 / Revised: 9 October 2018 / Accepted: 26 October 2018 / Published: 29 October 2018
(This article belongs to the Special Issue Celebrated Econometricians: Peter Phillips)

Abstract:
This paper provides a new statistical model for repeated voluntary contribution mechanism games. In a repeated public goods experiment, contributions in the first round are cross-sectionally independent simply because subjects are randomly selected. Meanwhile, contributions to a public account over rounds are serially and cross-sectionally correlated. Furthermore, the cross-sectional average of the contributions across subjects usually decreases over rounds. By considering this nonstationary initial condition (the initial contribution has a different distribution from the rest of the contributions), we statistically model the time-varying patterns of the average contribution in repeated public goods experiments and then propose a simple but efficient method to test for treatment effects. The suggested method has good finite sample performance and works well in practice.
JEL Classification:
C91; C92; C33

1. Introduction

Experimental data from the laboratory have unique features. The best-known statistical benefit of laboratory experiments is randomization: by selecting subjects randomly, the difference between the treated and controlled groups can be directly measured. However, data from laboratory experiments have properties that sometimes make the information contained in them difficult to extract. Decisions in lab experiments are often repeated in order to give the subjects time to gain familiarity with the environment and the strategic context. As subjects learn, responses may change considerably between the first and final periods of play. With repetition, subjects' responses are highly persistent and cross-sectionally correlated. The question is how to estimate treatment effects from such complicated data.
This paper aims to provide a simple and novel estimation method for analyzing the treatment effect in repeated public goods games with a voluntary contribution mechanism (VCM). The VCM game is one of the most popular experimental games, and its use has grown rapidly in the social and natural sciences; applications to economics, political science, marketing, finance, psychology, biology, and the behavioral sciences are common. In each round, individuals are given a fixed amount of tokens and asked to invest them in either a private or a group account. The tokens invested in the public account are multiplied by a factor, which is usually greater than unity, and then distributed evenly to all individuals. The overall outcome, or the fraction of tokens contributed to the public account, becomes the prime object of interest for experimentalists.
Let $y_{it}$ be the fraction of tokens contributed to a public account by the $i$-th subject in round $t$. As Ledyard (1995) points out, there are some common findings across experimental studies: (i) the total amount of tokens does not influence the overall outcomes; hence, rather than the total amount of contributed tokens, the fraction of contributed tokens, $y_{it}$, becomes of interest. In other words, $y_{it}$ is a series bounded between zero and unity; (ii) the average initial contribution to a public account is around half of the endowed tokens; (iii) as a game repeats, the cross-sectional average of contributions usually decreases over rounds; (iv) even though the initial contribution, $y_{i1}$, is not cross-sectionally dependent (since subjects are randomly selected), as the game repeats, $y_{it}$ depends on the previous group account; in other words, $y_{it}$ is cross-sectionally correlated; (v) lastly, $y_{it}$ is serially dependent as well.
Another interesting feature is the existence of different subject types. Commonly assumed types are free riders, selfish contributors (or strategists), reciprocators, and altruists. Usually, it is not feasible to identify each type statistically, because subjects change their own types over rounds. Instead, following Ambrus and Greiner (2012), we approximate the heterogeneous behaviors econometrically by three groups: increasing, decreasing, and fluctuating. The increasing and fluctuating groups are obtained by changing parameters in the decreasing model, which we propose in this paper. If experimental outcomes are generated from a single group, the cross-sectional dispersion of the outcome should not diverge over rounds. We will discuss how to test this homogeneity restriction by using the notion of weak $\sigma$-convergence developed by Kong et al. (2018).
The purpose of this paper is to provide a simple but efficient method for analyzing treatment effects by utilizing these stylized facts. To achieve this goal, we first build a simple time series model with a nonstationary initial condition: the distribution of the first round outcomes differs from that of the remaining outcomes. The nonstationary initial condition model generates a transitory time decay function. When experimental outcomes are generated from a decreasing (or increasing) group, the unknown mean of the initial outcome and the decay rate for each experiment can be measured directly by estimating the following trend regression in the logarithm of the cross-sectional averages of the individual experimental outcomes, $y_{it}$.¹ That is,
$$\log\left(\frac{1}{N}\sum_{i=1}^{N} y_{it}\right) = \log\mu + (t-1)\log\rho + v_{N,t}; \quad i = 1, \ldots, N; \ t = 1, \ldots, T, \quad (1)$$
where $i$ and $t$ index subjects and rounds, $\log\mu$ denotes the log of the true mean of the initial outcome, and $\log\rho$ represents the log of the decay rate of the repeated experimental outcomes. After obtaining the estimates of $\mu$ and $\rho$ by running the regression in (1) for each experiment, the overall outcome in the long run, $\pi$, can be estimated by
$$\hat\pi = \hat\mu/(1 - \hat\rho). \quad (2)$$
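For concreteness, here is a minimal sketch of the two-step procedure in (1) and (2). The code is our own illustration (not part of the original experiments) and assumes the cross-sectional averages are strictly positive, so that the logarithm is well defined.

```python
import numpy as np

def fit_log_trend(ybar):
    """OLS fit of log(ybar_t) = log(mu) + (t - 1) * log(rho) + v_t, as in (1).

    ybar: 1-D array of cross-sectional average contributions, one entry per round.
    Returns (mu_hat, rho_hat, pi_hat), where pi_hat = mu_hat / (1 - rho_hat) as in (2).
    """
    t = np.arange(len(ybar), dtype=float)          # the trend t - 1
    X = np.column_stack([np.ones_like(t), t])
    beta, *_ = np.linalg.lstsq(X, np.log(ybar), rcond=None)
    mu_hat, rho_hat = np.exp(beta[0]), np.exp(beta[1])
    return mu_hat, rho_hat, mu_hat / (1.0 - rho_hat)

# Noise-free decaying averages with mu = 0.6 and rho = 0.9 recover (0.6, 0.9, 6.0).
print(fit_log_trend(0.6 * 0.9 ** np.arange(10)))
```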
The remainder of the paper is organized as follows: Section 2 presents empirical examples in detail in order to motivate key issues and identify statistical problems. Section 3 builds a new statistical model for repeated experiments; this section also explains how the trend regression in (1) and the measure of treatment effects in (2) are constructed, and provides the justification for using such measures in practice. Section 4 provides asymptotic properties of the trend regression and discusses how to measure and test the treatment effects. Section 5 presents empirical evidence establishing the effectiveness of the new regression and measures in practice. Section 6 examines the finite sample performance of the suggested methods. Section 7 concludes. The Appendix includes the technical derivations.

2. Canonical Examples

Throughout the paper, we use two sets of experimental data, from Croson (1996) and Keser and Van Winden (2000). The designs of these experiments are very similar to each other; both papers aim to test the difference between a Strangers group and a Partners group in a repeated public goods game. In a standard repeated public goods game, several subjects play the game together as a group, and the game is usually repeated for $T$ rounds. The size of the group (the number of subjects in a group) is denoted $G$. At the beginning of each round, every subject is given an endowment of $e$ tokens to divide between a public account and a private account. Each token contributed to the public account is multiplied by $M$ and then divided among all members of the group. In this environment, the payoff function at each round for subject $i$ can be written as $\pi_i = e - g_i + (M/G)\sum_{j=1}^{G} g_j$, where $g_i$ is the amount subject $i$ contributes to the public account. $M/G$ is called the marginal per capita return (MPCR). If subjects are rematched randomly into groups for each iteration of the game, the design is a Strangers group; if the composition of the group does not change through all the rounds, it is a Partners group.
Since Andreoni (1988) found that the average contribution from the Strangers group was larger than that from the Partners group, many papers have investigated the difference between Strangers and Partners groups, but the results are inconclusive. Andreoni and Croson (2008) summarize the results from many replications and studies that have explored this question: 'Nonetheless, this summary of results does little to clear up the picture. In all, four studies find more cooperation among Strangers, five find more by Partners, and four fail to find any difference at all.' The answer to whether the Partners group or the Strangers group contributes more is decidedly mixed, and we re-examine the treatment effect of rematching in this paper. Among all the replications of the Partners vs. Strangers game, the experiments of Croson (1996) and Keser and Van Winden (2000) are identical except for the number of rounds. Interestingly, both papers found that the average contribution from the Partners group was higher than that from the Strangers group. Table 1 describes the data and various features of the experiments, and Figure 1 shows the average contribution in each round of Croson (1996). Note that it would be straightforward to compare the two samples in Figure 1 if all observations were independently distributed. In many experimental studies, however, this assumption is violated: usually, $y_{it}$ is cross-sectionally and serially correlated. Under this circumstance, a conventional z-score test with $y_{it}$ for each $t$, or with time series averages of $y_{it}$, becomes invalid unless the cross-sectional dependence is handled properly. For the same reason, other statistics used in practice become invalid as well.²
For each study, controlled and treated groups may be defined and then the treatment effects can be measured. For all cases, the null hypothesis of interest becomes no treatment effect or the same overall outcomes given by
$$H_0: E[\pi_c] = E[\pi_\tau], \quad (3)$$
where π c and π τ are controlled and treated overall outcomes, respectively. Under the null hypothesis, the overall outcomes become the same regardless of the number of rounds.
The overall outcomes can be measured by the area below the cross-sectional averages. Let $\pi_{s,t}$ be the true but unknown decay function for the $s$th experiment. Then, the overall outcome can be defined as $\pi_s = \int_0^{\infty}\pi_{s,t}\,dt$.⁴ When the treated experimental outcome at time $t$ is always greater than or equal to the controlled one for all $t$, that is, $\pi_{\tau,t} \geq \pi_{c,t}$, then the overall treatment effect, $\Pi = \pi_\tau - \pi_c$, is positive. However, when $\pi_{c,t}$ crosses over $\pi_{\tau,t}$, it is hard to judge whether the treatment effect is positive or negative. Figure 2 demonstrates such an artificial case in which the null hypothesis is not easy to test.⁵ Panel A in Figure 2 shows the hypothetical average of the controlled and treated outcomes for each round. The decay rate for the controlled group is faster than that for the treated group, but the initial average outcome for the controlled group is higher than that for the treated group. The overall effects can be measured by the areas below the averages of the outcomes, which are displayed in Panel B of Figure 2. Evidently, the treatment effect depends on the number of rounds: it is negative for $T < 11$, zero at $T = 11$, and positive for $T > 11$.
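As a numerical check, the crossing-over case of Figure 2 can be reproduced from the decay functions given in footnote 5; the cumulative treatment effect is negative before round 11, roughly zero at round 11, and positive afterwards. A short sketch:

```python
import numpy as np

t = np.arange(1, 31)
pi_c = 0.6 * 0.8 ** (t - 1)    # controlled: higher initial outcome, faster decay
pi_tau = 0.4 * 0.9 ** (t - 1)  # treated: lower initial outcome, slower decay

cum_c, cum_tau = np.cumsum(pi_c), np.cumsum(pi_tau)
for T in (10, 11, 12):
    # treatment effect Pi = pi_tau - pi_c accumulated up to round T
    print(T, round(cum_tau[T - 1] - cum_c[T - 1], 4))
```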

3. Statistical Model

There is one important and distinctive feature of repeated lab-experimental data: a nonstationary initial condition. This unique feature is rarely seen in conventional data and, more importantly, estimation that does not account for it is invalid, which consequently leads to wrong judgements. This section provides a new statistical model for this nonstationary initial condition.

3.1. Statistical Modelling of the Nonstationary Initial Condition

Here, we statistically approximate the decay patterns in Figure 1 and Figure 2 by an exponential decay function. We do not aim to model or explain the unknown decay function theoretically, but rather to approximate it; in other words, we model it statistically. In repeated experiments, subjects are randomly selected but exhibit common behavior. Typically, responses decrease over successive rounds, as shown in Figure 1. Interestingly, such common behavior, or highly cross-sectionally correlated behavior, can be generated if the true mean of the initial contributions is different from the mean of the contributions in the long run. Note that, in this subsection, we ignore the subscript $s$ for notational convenience.
Define $y_{N,t}$ to be the cross-sectional average of $y_{it}$ at time $t$. When $t = 1$, we let
$$y_{N,1} = \mu + \epsilon_{N,1}. \quad (4)$$
Meanwhile, for $t > 1$, we assume that the cross-sectionally aggregated series follows an AR(1) process:
$$y_{N,t} = a(1-\rho) + \rho\, y_{N,t-1} + \epsilon_{N,t}, \quad \text{for } t \geq 2,$$
where $a$ can be treated as the long run value of $y_{N,t}$ in the sense that
$$\lim_{t\to\infty} E\, y_{N,t} = a(1-\rho) + \rho \lim_{t\to\infty} E\, y_{N,t-1},$$
so that
$$\lim_{t\to\infty} E\, y_{N,t} = a,$$
since $\lim_{t\to\infty} E\, y_{N,t-1} = \lim_{t\to\infty} E\, y_{N,t}$. Usually, the expected value of the initial outcome is not the same as the long run outcome; that is, $\mu \neq a$. Statistically, the transitory behavior of $y_{N,t}$ due to the nonstationary initial condition in (4) can be modeled by
$$y_{N,t} = a + (\mu - a)\rho^{t-1} + e_{N,t}, \quad \text{for } t = 1, \ldots, T, \quad (8)$$
where the error follows an AR(1) process with the nonstationary initial condition:
$$e_{N,t} = \rho\, e_{N,t-1} + \epsilon_{N,t}, \quad \text{with } e_{N,1} = \epsilon_{N,1}.$$
If the initial mean, $\mu$, is different from the long run mean, $a$, then $y_{N,t}$ has a time decay function $\rho^{t-1}$, as shown in (8).
One of the most interesting features of repeated games is their convergent behavior. As $t$ approaches the last round, the fraction of free riders increases and, for some games, this fraction becomes one.⁶ That is, the average outcome, $y_{N,t}$, converges to the long run equilibrium. This implies that the variance of the random error should be decreasing over time as well; otherwise, the fraction of free riders does not converge to unity as $t$ increases. To reflect this fact, we assume that $e_{N,t}$ shrinks at the rate $\rho^{t-1}$ over time:⁷
$$e_{N,t} = u_{N,t}\,\rho^{t-1}, \quad \text{for } 0 \leq \rho \leq 1. \quad (9)$$
This implies that, at $t = 1$, $\epsilon_{N,1} = u_{N,1}$. Since $y_{N,t}$ is bounded between zero and one, the initial error term, $u_{N,1}$, should be bounded between $-\mu$ and $1-\mu$. We assume that this restriction holds for $t > 1$ as well; that is, $u_{N,t}$ is independently distributed but bounded between $-\mu$ and $1-\mu$. Statistically, we write
$$u_{N,t} \sim id\;\mathcal{B}\left(0, \sigma^2\right)_{-\mu}^{1-\mu}, \quad (10)$$
where $id$ stands for 'independently distributed', $\mathcal{B}$ implies boundedness, and the superscript and subscript show the upper and lower bounds.

3.2. Features of New Statistical Model

In the previous subsection, we assumed that the nonstationary initial condition model approximates the cross-sectional average of individual outcomes well. In this subsection, we consider the possibility that the aggregated model in (8) through (10) holds for all subjects. That is, we further assume that the stochastic process for $y_{it}$ can be written as
$$y_{it} = a_i + (\mu_i - a_i)\rho^{t-1} + e_{it}, \quad \text{for } e_{it} = u_{it}\rho^{t-1} \text{ and } u_{it} \sim id\;\mathcal{B}(0, \sigma_i^2)_{-\mu_i}^{1-\mu_i}, \quad (11)$$
where the experimental outcome $y_{it}$ is normalized by the maximum number of tokens so that $0 \leq y_{it} \leq 1$ for all $i$ and $t$. The long run value of $y_{it}$ is $a_i$, $\mu_i$ is the unknown initial mean of $y_{i1}$, $\rho$ is the decay rate, and $e_{it}$ is the approximation error. Note that $u_{it}$ is a bounded but independent random variable with mean zero and variance $\tilde\sigma_i^2$, or $u_{it} \sim i\mathcal{B}(0, \tilde\sigma_i^2)_{-\mu_i}^{1-\mu_i}$. As $t$ increases, the unconditional mean of the individual outcome, $y_{it}$, converges to $a_i$, which can be interpreted as a long run steady state.

If the assumption in (11) holds, then this simple stochastic process for $y_{it}$ may explain the temporal cross-sectional dependence. When $t = 1$ (the initial round), the individual outcome, $y_{i1}$, is not cross-sectionally dependent, since subjects are randomly selected. However, when $t \geq 2$, the individual outcome becomes cross-sectionally dependent due to the time varying mean part, $(\mu_i - a_i)\rho^{t-1}$. The common factor model captures the kind of behavior described in Ashley et al. (2010): the previous group outcome affects the experimental outcome for each subject. To see this, consider the fact that the group mean excluding the $i$th subject's outcome, $(G-1)^{-1}\sum_{j\neq i} y_{jt}$, is similar to the group mean $G^{-1}\sum_{i=1}^{G} y_{it}$. In addition, the group mean is similar to the cross-sectional average: $G^{-1}\sum_{i=1}^{G} y_{it} \approx N^{-1}\sum_{i=1}^{N} y_{it} = \mu_N\rho^{t-1} + e_{N,t}$, where $\mu_N$ and $e_{N,t}$ are the cross-sectional averages of $\mu_i$ and $e_{it}$. Note that, with large $N$, $e_{N,t}$ is close to zero, so this term can be ignored; statistically, $e_{N,t} = O_p(N^{-1/2})$ for any $t$. Hence, the subject's outcome at time $t$ can be rewritten as
$$y_{it} = \frac{\mu_i}{\mu_N}\rho \times \mu_N\rho^{t-2} + e_{it} + O_p(N^{-1/2}).$$
The first factor represents the marginal rate of the relative contribution of each subject, the second factor is the group average in the previous round, and the last two terms are random errors.

Nonetheless, we do not attempt to model the implicit behaviors of individuals, such as selfish, reciprocal, and altruistic behaviors. Instead, we model the time varying patterns of the outcome (the contribution to the group account) statistically. Following Ambrus and Greiner (2012), we approximate the heterogeneous behaviors of subjects econometrically by three groups: increasing, decreasing, and fluctuating. The increasing group includes two subgroups: subjects who contribute all tokens to the public account in every round, and subjects who contribute more tokens over successive rounds. Similarly, the decreasing group includes two subgroups: subjects who do not contribute at all, and subjects who contribute fewer tokens over rounds. Lastly, a fluctuating or confused subject is in neither the decreasing nor the increasing group.

In public goods games, if the long run Nash equilibrium occurs (or if $a_i = 0$ and $0 < \rho < 1$), the fraction of free riders increases over rounds. Meanwhile, if $a_i = 1$ and $0 < \rho < 1$, then the unconditional mean of the individual outcome, $y_{it}$, converges to the Pareto optimum $a_i = 1$. Furthermore, except for the initial round, the experimental outcomes are cross-sectionally dependent, so that an individual's decision depends on the overall group outcome. Lastly, if the decay rate becomes unity, or $\rho = 1$, then the individual outcome becomes purely random.
To be specific, we can consider the following three dynamic patterns:
$$y_{it} = \begin{cases} \mu_i\rho^{t-1} + e_{1,it} & \text{if } i \in G_1, \text{ or } a_i = 0 \text{ and } 0 < \rho < 1, \\ \mu_i + e_{2,it} & \text{if } i \in G_2, \text{ or } \rho = 1, \\ 1 - (1-\mu_i)\rho^{t-1} + e_{3,it} & \text{if } i \in G_3, \text{ or } a_i = 1 \text{ and } 0 < \rho < 1. \end{cases}$$
For groups $G_1$ and $G_3$, the contribution decays each round at the same rate, $\rho$. If a subject belongs to $G_1$, the contribution decreases over rounds and, if she belongs to $G_3$, the contribution increases. The remaining subjects are classified together in a 'fluctuating' group. For the non-fluctuating groups ($G_1$ and $G_3$), the random errors can be rewritten as $e_{s,it} = u_{s,it}\rho^{t-1}$ for $s = 1, 3$. Hence, as the round increases, the variance of $e_{s,it}$ decreases and eventually converges to zero. Meanwhile, the random error for $G_2$ is not time varying, since $\rho$ is always unity.

Let $N_1$ be the number of subjects in $G_1$ and $n_1 = N_1/N$. Similarly, define $N_2$ and $N_3$ as the numbers of subjects in $G_2$ and $G_3$, and $n_2$ and $n_3$ as their fractions, respectively. The sample cross-sectional averages for each round under homogeneity (only $G_1$ exists) and heterogeneity can be written as
$$\frac{1}{N}\sum_{i=1}^{N} y_{it} = \begin{cases} \mu\rho^{t-1} + e_{Nt} & \text{if all } i \in G_1, \\ \tau + \varphi\rho^{t-1} + e_{Nt}^{*} & \text{if some } i \notin G_1, \end{cases} \quad (13)$$
where $\tau = n_2\mu + n_3$, $\varphi = \mu - \tau$, and
$$e_{Nt} = N^{-1}\sum_{i=1}^{N} e_{it} + \rho^{t-1}N^{-1}\sum_{i=1}^{N}(\mu_i - \mu), \qquad e_{Nt}^{*} = \rho^{t-1}N^{-1}\sum_{i=1}^{N}(\mu_i - \mu) + (1-\rho^{t-1})N^{-1}\sum_{i=1}^{N_2}(\mu_i - \mu) + N^{-1}\sum_{i=1}^{N} e_{it}.$$
Therefore, the time varying behavior of the cross-sectional averages under homogeneity is very different from that under heterogeneity. First, under homogeneity, the average outcome decreases each round and converges to zero, whereas under heterogeneity, depending on the value of $\varphi$, the average outcome may decrease ($\varphi > 0$), increase ($\varphi < 0$), or not change at all ($\varphi = 0$) over rounds. Second, the variance of the random part of the cross-sectional average, $e_{Nt}$, under homogeneity is much smaller than that under heterogeneity; in other words, the cross-sectional averages under heterogeneity are much more volatile than those under homogeneity. Third, both random errors, $e_{Nt}$ and $e_{Nt}^{*}$, are serially correlated. However, the serial correlation of the random errors under homogeneity, $E[e_{Nt}e_{Nt-s}]$, dies out quickly as $s$ increases, whereas the serial correlation under heterogeneity, $E[e_{Nt}^{*}e_{Nt-s}^{*}]$, never goes to zero, even as $s \to \infty$, as long as $n_2 \neq 0$.
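A small simulation makes this contrast concrete. The sketch below draws panels from the three dynamic patterns displayed above (the $G_1$, $G_2$, $G_3$ cases); the group shares and the distributions of $\mu_i$ and $u_{it}$ are illustrative assumptions of ours, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, rho = 500, 25, 0.9
t = np.arange(T)

def simulate(n1, n2, n3):
    """Draw a panel from the three-group mixture; the shares (n1, n2, n3) are assumptions."""
    g = rng.choice([1, 2, 3], size=N, p=[n1, n2, n3])[:, None]
    mu = rng.uniform(0.3, 0.7, size=(N, 1))
    u = rng.uniform(-0.25, 0.25, size=(N, T))
    y = np.where(g == 1, mu * rho ** t + u * rho ** t,          # decreasing group G1
        np.where(g == 2, mu + u,                                # fluctuating group G2
                 1 - (1 - mu) * rho ** t + u * rho ** t))       # increasing group G3
    return np.clip(y, 0.0, 1.0)

for shares in [(1.0, 0.0, 0.0), (0.6, 0.2, 0.2)]:
    v = simulate(*shares).var(axis=0)
    print(shares, "first-round var:", v[0].round(3), "last-round var:", v[-1].round(3))
```

Under homogeneity the cross-sectional variance shrinks toward zero over rounds, while under the mixture it stays large or grows, which is exactly the divergence the homogeneity test below is designed to detect.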
It is not hard to show that, under heterogeneity, the cross-sectional variance of $y_{it}$ diverges over rounds. In this case, the estimation of the treatment effects is not straightforward, since $\tau$ and $\varphi$ in (13) cannot be identified in some cases.⁸ We leave the case of heterogeneity for future work and focus mainly on the estimation of the treatment effects under homogeneity. The homogeneity assumption is testable: one may test for convergence directly by using Phillips and Sul (2007) or Kong et al. (2018). In particular, the weak $\sigma$-convergence test proposed by Kong et al. (2018) is more suitable here, since the relative convergence test of Phillips and Sul (2007) requires somewhat more restrictive conditions.

The contributions to the public account up to the last round $T$ can be measured by the following statistic:
$$\sum_{t=1}^{T}\frac{1}{N}\sum_{i=1}^{N} y_{it} = \mu_N\,\frac{1-\rho^T}{1-\rho} + e_{N,T},$$
where
$$e_{N,T} = \sum_{t=1}^{T}\frac{1}{N}\sum_{i=1}^{N} e_{it} = \sum_{t=1}^{T}\frac{1}{N}\sum_{i=1}^{N} u_{it}\rho^{t-1} = O_p(N^{-1/2}),$$
since the sum of $\rho^{2t-2}$ over $T$ is $O(1)$.⁹ The estimation of the overall treatment effect can be done simply by comparing the two means, provided the subjects in the two experiments are independently and randomly selected. Let $N = \min(N_c, N_\tau)$, where $N_c$ and $N_\tau$ are the total numbers of subjects in the controlled and treated experiments, respectively. Then, from (11), the probability limit of the average difference becomes the treatment effect (TE) given by
$$TE = \operatorname*{plim}_{N\to\infty}\left(\sum_{t=1}^{T}\frac{1}{N_c}\sum_{i=1}^{N_c} y_{c,it} - \sum_{t=1}^{T}\frac{1}{N_\tau}\sum_{i=1}^{N_\tau} y_{\tau,it}\right) = \mu_c\,\frac{1-\rho_c^T}{1-\rho_c} - \mu_\tau\,\frac{1-\rho_\tau^T}{1-\rho_\tau},$$
where $\mu_c$ and $\mu_\tau$ are the cross-sectional means of the initial contributions. When $\mu_c = \mu_\tau$, the overall outcome with the slower decay rate is higher than that with the faster decay rate. When the two experimental outcomes cross over each other, the overall effect becomes ambiguous; mathematically, this is the case where $\mu_c > \mu_\tau$ but $\rho_c < \rho_\tau$. Suppose that, with a fixed $T$ (say $T = 10$), the overall effect in the crossing-over case is identical between the two experiments. Such a result is not robust, since, with more repetitions, the experimental outcome with the slower decay rate must eventually exceed that with the faster decay rate. Hence, for such a case, the following asymptotic treatment effect can compare the two outcomes robustly and effectively:
$$\text{Asy.TE} = \Pi = \lim_{T\to\infty} TE = \frac{\mu_c}{1-\rho_c} - \frac{\mu_\tau}{1-\rho_\tau} = \pi_c - \pi_\tau, \ \text{say}.$$

3.3. Estimation of Overall Treatment Effects

The experimental outcomes will converge to zero when zero contribution is the dominant strategy.¹⁰ It is better to use this convergence result in the estimation of the asymptotic treatment effect; that is, the percentage of free riders is assumed to approach one as the total number of rounds increases.¹¹ If there are multiple dominant strategies in a game, then the experimental outcomes need not converge to a certain value. In this case, the outcome for each round may be volatile and, more importantly, the cross-sectional variance will increase over rounds.

The estimation of the asymptotic treatment effects requires the estimation of two unknown parameters: $\mu$ and $\rho$. There are two ways to estimate both parameters jointly: nonlinear least squares (NLS) estimation and log-transformed least squares estimation with the cross-sectionally averaged data. Below, we first discuss the potential problems associated with NLS estimation and then turn to the simple log trend regression.
Nonlinear Least Squares (NLS) Estimation
To estimate $\mu$, one may consider estimating $\mu_i$ first by running the following NLS regression for each $i$:
$$y_{it} = f(\mu_i, \rho) + e_{it} = \mu_i\rho^{t-1} + e_{it}, \ \text{say}.$$
It is well known that the NLS estimators for $\mu_i$ and $\rho$ are inconsistent as $T\to\infty$, because $\sum_{t=1}^{\infty}\left\{f_t(\mu_i,\rho) - f_t(\mu_i^{+},\rho^{+})\right\}^2 = M < \infty$ for $\mu_i \neq \mu_i^{+}$ and $\rho \neq \rho^{+}$. See Malinvaud (1970) and Wu (1981) for more detailed discussion. However, there is a way to estimate $\mu$ and $\rho$ by using cross-sectional averages. Consider the following regression with cross-sectional averages:
$$y_{N,t} = \mu_N\rho^{t-1} + e_{N,t}. \quad (16)$$
For a given $T$, as $N\to\infty$, $\sum_{i=1}^{N}\sum_{t=1}^{T}\left(\mu_i\rho^{t-1} - \mu_i^{+}\rho^{+t-1}\right)^2 \to \infty$ for $\mu_i \neq \mu_i^{+}$ and $\rho \neq \rho^{+}$. Therefore, both $\mu_N$ and $\rho$ can be estimated consistently as $N\to\infty$. Denote the resulting estimators as $\hat\mu_{nls}$ and $\hat\rho_{nls}$. We discuss the underlying asymptotic theory in the next section, but in the meantime we emphasize that the fastest convergence rate for $\hat\rho_{nls}$ is $\sqrt{N}$, even when $T, N\to\infty$ jointly.
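As an illustration, the cross-sectional-average NLS regression in (16) can be fit with a generic curve-fitting routine; the starting values and parameter bounds below are our own choices.

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_nls(ybar):
    """NLS fit of ybar_t = mu * rho**(t-1) on the cross-sectional averages, as in (16)."""
    t = np.arange(len(ybar), dtype=float)          # t - 1
    f = lambda t, mu, rho: mu * rho ** t
    (mu_hat, rho_hat), _ = curve_fit(f, t, ybar, p0=(ybar[0], 0.9),
                                     bounds=([0.0, 0.0], [1.0, 1.0]))
    return mu_hat, rho_hat

rng = np.random.default_rng(1)
ybar = 0.6 * 0.9 ** np.arange(10) + 0.01 * rng.standard_normal(10)
print(fit_nls(ybar))    # approximately (0.6, 0.9)
```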
Logarithm Transformation
Alternatively, the nonlinear regression in (16) can be transformed into a linear regression by taking the logarithm of $y_{N,t}$:¹²
$$\log y_{N,t} = \log\mu_N + (t-1)\log\rho + \log\left(1 + \frac{e_{N,t}}{\mu_N\rho^{t-1}}\right). \quad (17)$$
From (9), the last term can be rewritten as
$$\log\left(1 + \frac{e_{N,t}}{\mu_N\rho^{t-1}}\right) = \log\left(1 + \frac{u_{N,t}}{\mu_N}\right) = \frac{u_{N,t}}{\mu_N} + O_p(N^{-1}) = v_{N,t} + O_p(N^{-1}), \quad (18)$$
where $u_{N,t} = N^{-1}\sum_{i=1}^{N} u_{it}$. Hence, the following simple trend regression can be used to estimate $\log\mu_N$ and $\log\rho$:
$$\log\left(\frac{1}{N}\sum_{i=1}^{N} y_{it}\right) = \log\mu_N + (t-1)\log\rho + v_{N,t}; \quad t = 1, \ldots, T. \quad (19)$$
Taking the exponential of $\widehat{\log\mu_N}$ and $\widehat{\log\rho}$ provides the consistent estimators $\hat\mu_N$ and $\hat\rho$, respectively. The asymptotic properties of the trend regression in (19) will be discussed in detail shortly; here, note that the convergence rate of $\hat\rho$ is much faster than that of $\hat\rho_{nls}$. However, this does not always imply that $\hat\rho$ is more accurate than $\hat\rho_{nls}$, especially with small $N$: the minimum of the cross-sectional averages of $y_{it}$ can be zero or near zero, in which case the dependent variable in (19) is not well defined.

4. Asymptotic Theory

This section provides the asymptotic theory for the proposed estimators in (16) and (19). We make the following assumptions.
Assumption A: Bounded Distributions
(A.1) $\mu_i$ is independent and identically distributed with mean $\mu$ and variance $\sigma_\mu^2$, but it is bounded between 0 and 1: $\mu_i \sim iid\;\mathcal{B}(\mu, \sigma_\mu^2)_{0}^{1}$.
(A.2) $e_{it} = u_{it}\rho^{t-1}$ for $0 \leq \rho \leq 1$, where $u_{it}$ is independently distributed with mean zero and variance $\sigma_i^2$, but it is bounded between $-\mu_i$ and $1-\mu_i$. That is, $u_{it} \sim id\;\mathcal{B}(0, \sigma_i^2)_{-\mu_i}^{1-\mu_i}$.
Assumption B: Data Generating Process
The data generating process is given by $y_{it} = \mu_i\rho^{t-1} + e_{it}$.
Under Assumptions A and B, it is easy to show that, if $0 \leq \rho < 1$, all subjects' outcomes become zero in the long run:
$$\lim_{t\to\infty} y_{it} = 0 \quad \text{for all } i.$$
In other words, $y_{it}$ converges to zero. Assumption A.2 implies no serial dependence in $u_{it}$, for the sake of simplicity. We will show later that we cannot find any empirical evidence suggesting that $u_{it}$ is serially correlated. In addition, see Remarks 5 and 6 for discussion of the ramifications of violating Assumption A.2.
Define the nonlinear least squares estimator in (16) as the minimizer of the following loss function:
$$\operatorname*{arg\,min}_{\mu_N,\,\rho}\sum_{t=1}^{T}\left(y_{N,t} - \mu_N\rho^{t-1}\right)^2.$$
Meanwhile, the ordinary least squares estimator in (19) is defined as
$$\begin{pmatrix}\widehat{\log\mu_N}\\ \widehat{\log\rho}\end{pmatrix} = \begin{pmatrix}\sum_{t=1}^{T} 1 & \sum_{t=0}^{T-1} t\\ \sum_{t=0}^{T-1} t & \sum_{t=0}^{T-1} t^2\end{pmatrix}^{-1}\begin{pmatrix}\sum_{t=1}^{T}\log y_{N,t}\\ \sum_{t=1}^{T}(t-1)\log y_{N,t}\end{pmatrix}.$$
Since the point estimates are obtained by running either a nonlinear or a linear time series regression, the initial mean, $\mu_i$, for each subject is not directly estimated; rather, the cross-sectional mean, $\mu_N$, is estimated. Define $\tilde\mu_N$ as an estimator of $\mu_N$. Then, the deviation from the true mean $\mu$ can be decomposed as
$$\tilde\mu_N - \mu = (\tilde\mu_N - \mu_N) + (\mu_N - \mu). \quad (21)$$
Since $\mu_N - \mu = O_p(N^{-1/2})$, $\tilde\mu_N - \mu$ also becomes $O_p(N^{-1/2})$ as long as $\tilde\mu_N - \mu_N = O_p(N^{-1/2}T^{-\kappa})$ for some $\kappa > 0$. There are two ways to derive the limiting distribution of $\tilde\mu_N$, depending on the definition of the regression errors or of the regression coefficients. First, the error terms can be defined as in (16) and (19). Then the limiting distribution of $\tilde\mu_N - \mu_N$, the first term in (21), can be derived; after that, the limiting distribution of $\tilde\mu_N - \mu$ can be obtained by adding the limiting distribution of $\mu_N - \mu$ to that of $\tilde\mu_N - \mu_N$.
Second, the error terms, or the coefficient terms, may be redefined as follows:
$$y_{N,t} = \mu\rho^{t-1} + \varepsilon_{N,t}, \quad \text{for } \varepsilon_{N,t} = \left\{u_{N,t} + (\mu_N - \mu)\right\}\rho^{t-1}, \quad (22)$$
$$\log y_{N,t} = \log\mu + (t-1)\log\rho + v_{N,t}^{*}, \quad \text{for } v_{N,t}^{*} = v_{N,t} + (\mu_N - \mu)/\mu. \quad (23)$$
Of course, the limiting distributions of the estimators in (16) and (19) are identical to those in (22) and (23). Nonetheless, it is important to address how to estimate the variance of $\mu_i$ consistently. We will discuss this later, but first provide the limiting distributions for the estimators of $\mu$ and $\rho$ in (22) and (23). Let $\hat\mu = \exp(\widehat{\log\mu_N})$ and $\hat\rho = \exp(\widehat{\log\rho})$, where $\widehat{\log\mu_N}$ and $\widehat{\log\rho}$ are the LS estimators of the coefficients in (23).
Theorem 1.
Limiting Distributions
Under Assumptions A and B, as $N, T\to\infty$ jointly,
(i) the limiting distributions of the NLS estimators in (22) are given by
$$\sqrt{N}\begin{pmatrix}\hat\mu_{nls} - \mu\\ \hat\rho_{nls} - \rho\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}\Omega_{11} & \Omega_{12}\\ \Omega_{12} & \Omega_{22}\end{pmatrix}\right), \quad (24)$$
where
$$\Omega_{11} = \frac{3\rho^2 + \rho^4 - 5\rho^6 + 1}{(1+\rho^2)^3}\,\sigma^2 + \frac{1 + 2\rho^2 - 2\rho - \rho^3}{1+\rho^2}\,\sigma_\mu^2, \quad \Omega_{12} = -\frac{\rho(1-\rho^2)^2(1+3\rho^2)}{\mu(1+\rho^2)^3}\,\sigma^2 - \frac{(1+\rho^2-\rho^3)(1-\rho)(1-\rho^2)}{\mu\rho}\,\sigma_\mu^2, \quad \Omega_{22} = \frac{2(1-\rho^2)^3\rho^2}{\mu^2(1+\rho^2)^3}\,\sigma^2 + \frac{(1-\rho^2)^3}{\rho^2}\,\frac{\sigma_\mu^2}{\mu^2}.$$
(ii) The limiting distributions of the LS estimators in (23) are given by
$$\begin{pmatrix}\sqrt{N}(\hat\mu - \mu)\\ \sqrt{N}\,T^{3/2}(\hat\rho - \rho)\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \Sigma\right), \quad \text{where } \Sigma = \begin{pmatrix}\sigma_\mu^2 & 0\\ 0 & 12\rho^2\sigma^2/\mu^2\end{pmatrix}. \quad (25)$$
See Appendix A for the detailed proof of Theorem 1. Here, we provide an intuitive explanation of the results, especially regarding the convergence rates. The convergence rate of the NLS estimators is $\sqrt{N}$ even when $N, T\to\infty$ jointly. The underlying reason is as follows. The first derivative of $\mu\rho^{t-1}$ with respect to $\rho$, $f_{\rho,t}$, is $\mu(t-1)\rho^{t-2}$, so that $\sum_{t=1}^{\infty} f_{\rho,t}^2 = O(1)$, since $\lim_{t\to\infty} t\rho^{t} = 0$. However, the order in probability of the regression errors is $O_p(N^{-1/2})$, and this determines the convergence rate of $\hat\rho_{nls}$. Meanwhile, the LS estimators with the logged cross-sectional averages in (23) are free from this problem; that is why the convergence rate of the estimator of $\log\rho$ is $\sqrt{N}T^{3/2}$. From the delta method, $\hat\rho - \rho = \rho\left(\widehat{\log\rho} - \log\rho\right) + O_p(N^{-1}T^{-3})$, so the convergence rate of $\hat\rho - \rho$ is also $\sqrt{N}T^{3/2}$. As discussed under (21), the convergence rate of $\hat\mu$ is entirely determined by that of $\mu_N - \mu$, which is $\sqrt{N}$. We will show later, by means of Monte Carlo simulation, that the LS estimators have better finite sample performance than the NLS estimators.
Here are some important remarks regarding the properties of the considered estimators.
Remark 1.
Limiting Distributions for Fixed T
Define $\hat\theta_{nls} = (\hat\mu_{nls}, \hat\rho_{nls})'$, $\hat\theta = (\hat\mu, \hat\rho)'$, and $\theta = (\mu, \rho)'$. Under Assumptions A and B, as $N\to\infty$ with fixed $T$, the limiting distributions are given by
$$\sqrt{N}\left(\hat\theta_{nls} - \theta\right) \xrightarrow{d} N(0, \Omega_T), \qquad \sqrt{N}\left(\hat\theta - \theta\right) \xrightarrow{d} N(0, \Sigma_T).$$
The variances $\Omega_T$ and $\Sigma_T$ are finite constants but are not expressed explicitly, since their formulae are very complicated. To evaluate and compare the two variances analytically, we set $T = 10$ and $\rho = 0.9$.¹³ With these values, the relative variance, $\Omega_T/\Sigma_T$, becomes a function of the variance ratio $\sigma_\mu^2/\sigma^2$. Figure 3 displays $\Omega_T/\Sigma_T$ against $\sigma_\mu^2/\sigma^2$. Obviously, the LS estimators are more efficient than the NLS estimators, because the variance of the LS estimator is smaller than that of the NLS estimator for all $\sigma_\mu^2/\sigma^2 \geq 0$. Interestingly, as $\sigma_\mu^2/\sigma^2$ increases, $\Omega_T/\Sigma_T$ also increases.
Remark 2.
Consistent Estimator of σ μ 2
There are several ways to construct consistent estimators for $\sigma_\mu^2$ by utilizing the LS estimator $\hat\rho$. To save space, we do not report all of them; by means of (unreported) Monte Carlo simulations, we find that the following estimator for $\sigma_\mu^2$ has the smallest variance among the candidates. The individual fixed effects, $\mu_i$, can be estimated by regressing $y_{it}$ on $\hat\rho^{t-1}$. Note that, for any $t$,
$$\hat\rho^{t-1} = \rho^{t-1} + (t-1)\rho^{t-2}(\hat\rho - \rho) + O_p(N^{-1}T^{-3}),$$
so that $y_{it}$ can be rewritten as
$$y_{it} = \mu_i\hat\rho^{t-1} + e_{it}^{+}, \quad \text{for } e_{it}^{+} = e_{it} - (t-1)\rho^{t-2}(\hat\rho - \rho)\mu_i + O_p(N^{-1}T^{-3}). \quad (26)$$
Rewrite the LS estimator $\hat\mu_i$ from (26) as
$$\hat\mu_i = \mu_i + \left(\sum_{t=1}^{T}\hat\rho^{2t-2}\right)^{-1}\sum_{t=1}^{T}\hat\rho^{t-1}e_{it}^{+}.$$
By direct calculation, it is easy to show that this estimator is almost unbiased. Specifically,
$$E\left[\sum_{t=1}^{T}\hat\rho^{t-1}e_{it}^{+}\right] = -\frac{\mu_i}{2\mu^2}\,\frac{12\rho^2(\rho^2+\rho^4)}{(1-\rho^2)^3}\,\frac{\sigma^2}{NT^3}.$$
Note that the convergence rate of $\hat\mu_i$ over $T$ is not $\sqrt{T}$ but $O(1)$, since the denominator term, $\sum_{t=1}^{T}\hat\rho^{2t-2}$, is $O_p(1)$. However, the sample variance of $\hat\mu_i$ estimates $\sigma_\mu^2$ consistently. As $N\to\infty$, the probability limit of the sample mean of $\hat\mu_i$ becomes
$$\operatorname*{plim}_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\hat\mu_i = \mu + \left(\frac{1-\rho^{2T}}{1-\rho^2}\right)^{-1}\operatorname*{plim}_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\sum_{t=1}^{T}\hat\rho^{t-1}e_{it}^{+} = \mu.$$
In addition, the probability limit of its sample variance is given by
$$\operatorname*{plim}_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\left(\hat\mu_i - \frac{1}{N}\sum_{i=1}^{N}\hat\mu_i\right)^2 = \sigma_\mu^2.$$
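In code, this estimator of $\sigma_\mu^2$ amounts to a no-intercept regression of each subject's outcomes on $\hat\rho^{t-1}$, followed by the sample variance of the fitted $\hat\mu_i$. A minimal sketch, assuming $\hat\rho$ comes from the log trend regression:

```python
import numpy as np

def sigma_mu2_hat(y, rho_hat):
    """Estimate sigma_mu^2 from an (N, T) panel y and a first-stage estimate rho_hat.

    Each mu_i is the no-intercept LS slope of y_it on rho_hat**(t-1), as in (26).
    """
    x = rho_hat ** np.arange(y.shape[1])      # regressor rho_hat**(t-1)
    mu_i = y @ x / (x @ x)                    # per-subject slope estimates
    return mu_i.var(ddof=1)                   # sample variance of the mu_i
```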
Remark 3.
Limiting Distribution of Asymptotic Treatment Effects
Let the overall average individual outcome be
$$\hat\pi = \frac{\hat\mu}{1-\hat\rho}. \quad (27)$$
To derive its limiting distribution, define $R = \left(\frac{1}{1-\rho},\ \frac{\mu}{(1-\rho)^2}\right)$. Then, by the delta method, the limiting distribution of $\hat\pi$ can be obtained from (25) directly:
$$\sqrt{N}(\hat\pi - \pi) \xrightarrow{d} N(0, \Omega_\pi),$$
where $\Omega_\pi = R\Sigma R'$ and $\Sigma$ is defined in (25). The variance can be estimated by replacing the true parameters with the point estimates, since
$$\hat\Omega_\pi \xrightarrow{p} \Omega_\pi.$$
Further, note that the average treatment effect between the controlled and treated outcomes, $\Pi = \pi_c - \pi_\tau$, can be estimated by $\hat\Pi = \hat\pi_c - \hat\pi_\tau$, and its limiting distribution is
$$\sqrt{N}\left(\hat\Pi - \Pi\right) \xrightarrow{d} N\left(0, \Omega_{\pi,c} + \Omega_{\pi,\tau}\right).$$
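A sketch of the resulting z-test is given below. It treats $\hat\mu$ and $\hat\rho$ as asymptotically independent, as in the diagonal $\Sigma$ of Theorem 1 (ii); the standard errors it produces are therefore an approximation and differ somewhat from the finite-sample values reported in Table 3, although the point estimates of $\pi$ match.

```python
import numpy as np

def asy_te_test(mu_c, se_mu_c, rho_c, se_rho_c, mu_t, se_mu_t, rho_t, se_rho_t):
    """Delta-method z-test of H0: Pi = pi_c - pi_t = 0, with pi = mu / (1 - rho)."""
    def pi_var(mu, se_mu, rho, se_rho):
        d_mu = 1.0 / (1.0 - rho)              # d pi / d mu, the first element of R
        d_rho = mu / (1.0 - rho) ** 2         # d pi / d rho, the second element of R
        return mu / (1.0 - rho), (d_mu * se_mu) ** 2 + (d_rho * se_rho) ** 2
    pi_c, v_c = pi_var(mu_c, se_mu_c, rho_c, se_rho_c)
    pi_t, v_t = pi_var(mu_t, se_mu_t, rho_t, se_rho_t)
    return pi_c - pi_t, (pi_c - pi_t) / np.sqrt(v_c + v_t)

# Croson (1996) point estimates from Table 2, Partners vs. Strangers:
print(asy_te_test(0.614, 0.080, 0.912, 0.021, 0.459, 0.081, 0.884, 0.018))
```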
Remark 4.
Heterogeneity of ρ
If the decay rate is heterogeneous across subjects, both parameters should be estimated for each $i$. However, as $T\to\infty$, the NLS estimators are inconsistent, as discussed above. In addition, the logarithm approximation for each subject's outcome cannot be computed for any $i$ such that $y_{it} = 0$ for some $t$. Moreover, even when $\mu_i$ is assumed to be independent of $\rho_i$, so that the cross-sectional average of $y_{it}$ yields
$$\frac{1}{N}\sum_{i=1}^{N} y_{it} \approx \left(\frac{1}{N}\sum_{i=1}^{N}\mu_i\right)\left(\frac{1}{N}\sum_{i=1}^{N}\rho_i^{t-1}\right) + \frac{1}{N}\sum_{i=1}^{N} e_{it},$$
the average decay term is no longer $\rho^{t-1}$. To see this, let $\rho_i \sim iid(\rho, \sigma_\rho^2)$ and take a Taylor expansion of $\rho_i^{t-1}$ around $\rho$:
$$\rho_i^{t-1} - \rho^{t-1} = (t-1)\rho^{t-2}(\rho_i - \rho) + \tfrac{1}{2}(t-1)(t-2)\rho^{t-3}(\rho_i - \rho)^2 + R_\rho,$$
where $R_\rho$ is a higher order term. Since the second term does not have zero mean, $E[\rho_i^{t-1}] = \rho^{t-1} + \tfrac{1}{2}(t-1)(t-2)\rho^{t-3}\sigma_\rho^2 + E[R_\rho] \neq \rho^{t-1}$ for $t > 2$.
Remark 5.
Violation of Assumption A.2 and Serial Dependence in Error
The error term can be serially dependent if subjects do not form rational expectations. In addition, if Assumption A.2 does not hold, then the logarithm approximation in (17) and (18) fails. To accommodate such situations, we replace Assumption A.2 with the following:
Assumption A.3:
$e_{it} = u_{it}t^{-\beta}$ for $0 \leq \rho \leq 1$ and $\beta > 0$, where $u_{it}$ has a finite fourth moment over $i$ for each $t$ and follows an autoregressive process of order 1. In particular, $u_{it} = \phi u_{it-1} + w_{it}$ for $0 \leq \phi \leq \rho < 1$, where $w_{it} \sim iid(0, \sigma_i^2)$ and $\lim_{N\to\infty} N^{-1}\sum_{i=1}^{N}\sigma_i^2 = \sigma^2$.
Under Assumption A.3, the convergence rate of $e_{it}$ is slower than that of $\mu_i\rho^{t-1}$, since $\rho^{t}t^{\beta} \to 0$ for all $\beta > 0$ and $0 < \rho < 1$. In this case, the error $v_{Nt}$ should be redefined as $u_{N,t}t^{-\beta}\rho^{1-t}/\mu_N$. Therefore, the variance of $v_{Nt}$ diverges to infinity, since $t^{-\beta}\rho^{-(t-1)} \to \infty$ as $t\to\infty$. This problem can be avoided by considering only $N\to\infty$ asymptotics with fixed $T$. Under Assumption A.3, as $N\to\infty$ with fixed $T$, the limiting distribution is given by
$$\sqrt{N}\begin{pmatrix}\widehat{\log\mu} - \log\mu\\ \widehat{\log\rho} - \log\rho\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \Omega^{-}\right), \quad \text{for } \Omega^{-} = \begin{pmatrix}\omega_{11} + \sigma_\mu^2/\mu^2 & \omega_{12}\\ \omega_{12} & \omega_{22}\end{pmatrix}.$$
The elements of $\Omega^{-}$ are defined in Appendix B.
Remark 6.
Violation of Assumption A.2 under Local to Unity
Next, consider the local-to-unity setting, which keeps the decay term from dying out as $t\to\infty$. In this case, the faster convergence rate can be restored. We replace Assumption A.3 with
Assumption A.4:
$u_{it}$ has a finite fourth moment over $i$ for each $t$, and is weakly dependent and stationary over $t$. Define the autocovariance sequence of $u_{it}$ as $E[u_{it}u_{it+k}] = \varphi_i(k)$, where $\sum_{k=1}^{\infty} k|\varphi_i(k)| < \infty$. Partial sums of $u_{it}$ over $t$ satisfy the panel functional limit law
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} u_{it} \Rightarrow B_i(r) \quad \text{as } T\to\infty \text{ for all } i,$$
where the $B_i$ are Brownian motions with variances $\omega_{ii}^2$, independent over $i$. Moreover, the limit $\lim_{N\to\infty} N^{-1}\sum_{i=1}^{N}\omega_{ii}^2 = \omega^2 < \infty$. The decay rate becomes a function of $T$; in particular, $\rho_T = \exp(\eta/T)$, where $-\varepsilon < \eta < 0$ for some small $\varepsilon > 0$.
Under Assumptions A.1, A.4, and B, as $N, T\to\infty$ jointly, the limiting distribution is given by
$$\begin{pmatrix}\sqrt{N}\left(\widehat{\log\mu} - \log\mu\right)\\ \sqrt{NT^{3+2\beta}}\left(\widehat{\log\rho} - \log\rho\right)\end{pmatrix} \xrightarrow{d} N\left(0, \Omega^{+}\right), \quad \text{for } \Omega^{+} = \begin{pmatrix}\Omega_{11}^{+} & 0\\ 0 & \Omega_{22}^{+}\end{pmatrix},$$
where $\Omega_{11}^{+} = \sigma_\mu^2/\mu^2$, $\Omega_{22}^{+} = 36\,\frac{\omega^2}{\mu^2}\int_0^1\frac{r^{-2\beta}(1-2r)^2}{1+2\eta r}\,dr$, and $\omega^2$ is the long run variance of $u_{it}$. See Appendix C for the detailed proof.

5. Return to Empirical Examples

As discussed earlier, the estimation of the treatment effects based on (27) requires pretesting for homogeneity. We use the weak $\sigma$-convergence test proposed by Kong et al. (2018) to examine the homogeneity restriction; here, we briefly describe how the test works. Define $\hat\sigma_t^2$ as the sample cross-sectional variance of $y_{it}$ at time $t$. That is,
$$\hat\sigma_t^2 = \frac{1}{N}\sum_{i=1}^{N}\left(y_{it} - \frac{1}{N}\sum_{i=1}^{N} y_{it}\right)^2.$$
Next, regress $\hat\sigma_t^2$ on a constant and a linear trend:
$$\hat\sigma_t^2 = a + \gamma t + u_t.$$
Denote the t-ratio of $\hat\gamma$ as
$$t_{\hat\gamma} = \hat\gamma / \sqrt{V(\hat\gamma)},$$
where $V(\hat\gamma)$ is the heteroskedasticity and autocorrelation consistent (HAC) estimator of the long run variance of $\hat\gamma$. Kong et al. (2018) suggest using Newey and West (1987)'s HAC estimator with window length $\mathrm{int}(T^{1/3})$, where $\mathrm{int}(\cdot)$ denotes the integer part.
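The test is straightforward to implement. The following sketch computes $t_{\hat\gamma}$ with a Bartlett-kernel (Newey-West) HAC variance and window $\mathrm{int}(T^{1/3})$; it is our own minimal implementation, not Kong et al. (2018)'s code.

```python
import numpy as np

def weak_sigma_convergence_t(y):
    """t-ratio of the trend slope in sigma_hat_t^2 = a + gamma * t + u_t.

    y: (N, T) panel of outcomes. A significantly negative t-ratio supports
    weak sigma-convergence, i.e., the homogeneity restriction.
    """
    N, T = y.shape
    s2 = y.var(axis=0)                            # cross-sectional variance per round
    X = np.column_stack([np.ones(T), np.arange(1.0, T + 1)])
    beta, *_ = np.linalg.lstsq(X, s2, rcond=None)
    Xu = X * (s2 - X @ beta)[:, None]             # moment contributions X_t * u_t
    L = int(T ** (1.0 / 3.0))                     # Newey-West window int(T^(1/3))
    S = Xu.T @ Xu
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1.0)                   # Bartlett kernel weight
        G = Xu[l:].T @ Xu[:-l]
        S += w * (G + G.T)
    V = np.linalg.inv(X.T @ X) @ S @ np.linalg.inv(X.T @ X)   # HAC sandwich
    return beta[1] / np.sqrt(V[1, 1])
```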
Table 2 reports the number of purely altruistic subjects (those who contribute all tokens in every round) for each experimental setting, the homogeneity test, and the estimates from the trend regression. It is important to note that weak $\sigma$-convergence can hold even when a few $y_{it}$ do not converge to the cross-sectional average. For example, in Croson (1996)'s data, one subject in the Partners game always contributes all tokens, so the fraction of purely altruistic subjects is around 4%; in Keser and Van Winden (2000)'s data, one subject in the Strangers game always contributes all tokens as well. Since the fraction of purely altruistic subjects is so small, the homogeneity test is influenced very little, and the estimation of the treatment effects is influenced very little by such outliers. In all cases, the point estimates of $\gamma$ are significantly negative, which implies that there is only a single group of subjects asymptotically.
Table 2 also reports the point estimates of $\mu$ and $\rho$ and their standard errors. These estimates can be used to construct the long run treatment effect, $\Pi$. In addition, the point estimates of $\mu$ and their standard errors can be used to test the treatment effect in the initial round. Here, we show how to calculate it, taking Croson's case as an example. The treatment effect in the initial round is $\hat\mu_p - \hat\mu_s$, where the control group is Strangers and the treated group is Partners. The difference is
$$\hat\mu_p - \hat\mu_s = 0.614 - 0.459 = 0.155,$$
and its standard error is
$$\sqrt{V\left(\hat\mu_p - \hat\mu_s\right)} = \sqrt{0.080^2 + 0.081^2} = 0.114.$$
Hence, the t-ratio, $0.155/0.114 \approx 1.36$, is not large enough to conclude that the Partners game generates more contributions. Meanwhile, the initial treatment effect in Keser and Van Winden (2000)'s case is significant: the resulting t-ratio is 4.853. Hence, the Partners game yields more contributions in the first round.
Next, we estimate the long run treatment effects and test the null hypothesis that the asymptotic outcomes of Strangers and Partners are the same:
$$H_0: \pi_p - \pi_s = \Pi = 0,$$
where $\pi_s$ and $\pi_p$ stand for the asymptotic contributions from the Strangers and Partners groups, respectively. Note that the asymptotic, or long run, contribution can be calculated by summing all contributions across rounds, as shown in Figure 2.
The estimates of $\pi_s$ and $\pi_p$ are given in Table 3. As shown in Figure 1, in both cases the asymptotic contribution from the Partners game ($\hat\pi_p$) is larger than that from the Strangers game ($\hat\pi_s$). The difference between $\hat\pi_p$ and $\hat\pi_s$, $\hat\Pi$, is the treatment effect. In Croson's case, the estimated asymptotic treatment effect is around 3.1, but its standard error is too large for the null hypothesis of no difference to be rejected. In Keser and Van Winden's case, the estimated asymptotic treatment effect is around 15, but again its standard error is very large. Hence, in both cases, the null hypothesis cannot be rejected: Partners appear to contribute more than Strangers in both studies, but the large standard errors render the differences statistically insignificant.

6. Monte Carlo Study

This section examines the finite sample performance of the proposed estimators and tests. The data generating process is given by
$$y_{it} = \mu_i\rho^{t-1} + e_{it}, \quad e_{it} = u_{it}\rho^{t-1}, \quad u_{it} \sim iid\;\mathcal{B}(0, \sigma^2)_{-\mu_i}^{1-\mu_i}, \quad \mu_i \sim iid\;\mathcal{B}(0.5, \sigma_\mu^2)_{0}^{1}.$$
In addition, we impose the restriction $0 \leq y_{it} \leq 1$. The parameters are set based on the results in Table 2, as follows: $\rho \in \{0.85, 0.9, 0.95\}$, $\sigma_\mu^2 \in \{0.15, 0.12, 0.10\}$, $\sigma^2 \in \{0.05, 0.03, 0.01\}$, $T \in \{10, 20\}$, and $N \in \{20, 40, 100, 200\}$. All errors are generated from a truncated normal distribution. The total number of replications is 2000. Note that $u_{it}$ is assumed to be serially independent here, but allowing serial dependence does not alter the main findings of this section. To be specific, we also allow $u_{it} = \phi_i u_{it-1} + \epsilon_{it}$ with $\phi_i \sim iid\;U(-0.99, 0.99)$; under this setting, the Monte Carlo results become slightly better than those under serial independence. To save space, only a few highlighted results under serial independence are reported here; the remaining results are available on the authors' website.
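For replication, one draw from this DGP can be generated as follows; the rejection sampler for the truncated normal errors is an implementation choice of ours.

```python
import numpy as np

rng = np.random.default_rng(42)

def trunc_normal(mean, var, lo, hi, size):
    """Draw from N(mean, var) truncated to [lo, hi] by redrawing out-of-bound values."""
    x = rng.normal(mean, np.sqrt(var), size=size)
    bad = (x < lo) | (x > hi)
    while np.any(bad):
        x[bad] = rng.normal(mean, np.sqrt(var), size=bad.sum())
        bad = (x < lo) | (x > hi)
    return x

def simulate_panel(N=100, T=10, rho=0.9, s2_mu=0.12, s2_u=0.03):
    """One panel draw: y_it = (mu_i + u_it) * rho**(t-1), with bounded mu_i and u_it."""
    mu = trunc_normal(0.5, s2_mu, 0.0, 1.0, N)
    u = np.empty((N, T))
    for i in range(N):
        u[i] = trunc_normal(0.0, s2_u, -mu[i], 1.0 - mu[i], T)
    y = (mu[:, None] + u) * rho ** np.arange(T)
    return np.clip(y, 0.0, 1.0)

y = simulate_panel()
```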
Table 4 reports the finite sample performance of the NLS and LS estimators given in (16) and (19). Neither estimator shows any bias. The variances of $\hat\mu_{nls}$ and $\hat\mu$ are similar, although the variance of $\hat\mu$ is slightly smaller, as our theory predicts. Meanwhile, the variance of $\hat\rho$ is much smaller than that of $\hat\rho_{nls}$, and the efficiency gain of $\hat\rho$ over $\hat\rho_{nls}$ grows as $T$ becomes larger. When $T = 10$, the variance of the NLS estimator is approximately 1.3 times larger than that of the LS estimator; as $T$ reaches 20, it becomes about three times larger. Hence, the asymptotic theory established in Theorem 1 also explains the finite sample properties of both estimators very well.
Table 5 exhibits the size and power of the test. Under the null of no treatment effect, we set $\mu_1 = \mu_2 = 0.5$ and $\rho_1 = \rho_2 = 0.9$, while $\sigma_\mu^2$ and $\sigma^2$ vary. Overall, when $T$ is small or $\sigma^2$ is large, the proposed test suffers from mild size distortion. However, the distortion disappears quickly either as $T$ increases or when $\sigma^2$ is small. Under the alternative, we decrease the decay rate for the second game from 0.9 to 0.85: $\rho_2 = 0.85$. Even with such a small change, the proposed test captures the treatment effects very well. Regardless of the values of $\sigma_\mu^2$, $\sigma^2$, and $T$, the power of the test is essentially perfect as long as $N \geq 100$; even with $N = 50$, the power reaches approximately 90%.

7. Conclusions

This paper deals with repeated public goods games and provides a simple but efficient method for testing overall treatment effects. We assume that the cross-sectional average of the experimental outcomes follows an AR(1) process. In the initial round, subjects do not have any information about the other players; over rounds, subjects learn about the game and the other subjects' behaviors. Hence, the distribution of the initial average differs from that of the remaining averages. When the initial average differs from the long run mean, this nonstationary initial condition generates a transitory time decay function. We use the simple nonstationary initial condition model to approximate the cross-sectional averages over time. The model is a nonlinear function of three key parameters: the initial average ($\mu$), the time decay rate or AR(1) coefficient ($\rho$), and the long run average ($a$). The long run average is not identifiable unless a game is repeated many times; hence, we had to assume that this value is known. Under this restrictive assumption, we provide a logarithm approximation of the nonlinear function, which becomes a simple log trend regression. Compared with previous methods, the newly suggested method takes statistical inference very seriously.
By means of Monte Carlo simulations, we showed that the finite sample performance of the suggested method is reasonably good. We applied the new estimation method to the experiments of Croson (1996) and Keser and Van Winden (2000). The estimates of the overall outcomes in the Partners games are larger than those in the Strangers games. However, due to large standard errors, which stem from the persistence of the time decay functions, the difference between the two games is not statistically significant.

Author Contributions

All authors contributed equally to the paper.

Funding

This research received no external funding.

Acknowledgments

We thank Ryan Greenaway-McGrevy for editorial assistance. In addition, we are grateful to Eunyi Chung, Joon Park and Yoosoon Chang for thoughtful comments, and thank Catherine Eckel, Rachel Croson, Daniel Houser, Sherry Li, James Walker and Arlington Williams for helpful discussions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

Theorem 1 (i)
Let $f(\mu, \rho) = \mu\rho^{t-1}$. Then, the first derivatives with respect to $\mu$ and $\rho$ are given by
$$\begin{pmatrix}f_\mu\\ f_\rho\end{pmatrix} = \begin{pmatrix}\partial f/\partial\mu\\ \partial f/\partial\rho\end{pmatrix} = \begin{pmatrix}\rho^{t-1}\\ \mu(t-1)\rho^{t-2}\end{pmatrix}.$$
Next, consider the following covariance and variance matrix:
$$E\left[\begin{pmatrix}\sum_{t=1}^{T} f_\mu\varepsilon_{N,t}\\ \sum_{t=1}^{T} f_\rho\varepsilon_{N,t}\end{pmatrix}\begin{pmatrix}\sum_{t=1}^{T} f_\mu\varepsilon_{N,t}\\ \sum_{t=1}^{T} f_\rho\varepsilon_{N,t}\end{pmatrix}'\right] = \begin{pmatrix}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{12} & \Sigma_{22}\end{pmatrix},$$
where
$$\Sigma_{11} = \frac{1}{N}\,\frac{1-\rho^{4T}}{1-\rho^{4}}\,\sigma^2 + \left(\frac{1-\rho^{2T}}{1-\rho^{2}}\right)^2\frac{1}{N}\,\sigma_\mu^2 = \frac{1}{N}\left(\frac{\sigma^2}{1-\rho^{4}} + \frac{\sigma_\mu^2}{(1-\rho^{2})^2}\right) \ \text{for large } T,$$
$$\Sigma_{12} = \frac{\mu\rho^{3}}{N}\,\frac{1-\rho^{4T} - T\rho^{4T-4}(1-\rho^{4})}{(1-\rho^{4})^2}\,\sigma^2 + \frac{\mu\rho\,\sigma_\mu^2}{N}\,\frac{1-\rho^{2T}}{1-\rho^{2}}\,\frac{1-\rho^{2T} - T\rho^{2T-2}(1-\rho^{2})}{(1-\rho^{2})^2} = \frac{\mu\rho}{N}\left(\frac{\rho^{2}}{(1-\rho^{4})^2}\,\sigma^2 + \frac{\sigma_\mu^2}{(1-\rho^{2})^3}\right) \ \text{for large } T,$$
$$\Sigma_{22} = \frac{\mu^2}{N}\sum_{t=1}^{T}(t-1)^2\rho^{4t-6}\,\sigma^2 + \frac{\mu^2}{N}\left(\sum_{t=1}^{T}(t-1)\rho^{2t-3}\right)^2\sigma_\mu^2 = \frac{\mu^2}{N}\left(\frac{(1+\rho^{4})\rho^{2}}{(1-\rho^{4})^3}\,\sigma^2 + \frac{\rho^{2}}{(1-\rho^{2})^4}\,\sigma_\mu^2\right) \ \text{for large } T.$$
Hence, it is straightforward to show that, as $N, T\to\infty$ jointly, the limiting distribution of the NLS estimators is given by
$$\sqrt{N}\begin{pmatrix}\hat\mu_{nls} - \mu\\ \hat\rho_{nls} - \rho\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}\Omega_{11} & \Omega_{12}\\ \Omega_{12} & \Omega_{22}\end{pmatrix}\right),$$
where
$$\Omega_{11} = \frac{3\rho^2 + \rho^4 - 5\rho^6 + 1}{(1+\rho^2)^3}\,\sigma^2 + \frac{1 + 2\rho^2 - 2\rho - \rho^3}{1+\rho^2}\,\sigma_\mu^2, \quad \Omega_{12} = -\frac{\rho(1-\rho^2)^2(1+3\rho^2)}{\mu(1+\rho^2)^3}\,\sigma^2 - \frac{(1+\rho^2-\rho^3)(1-\rho)(1-\rho^2)}{\mu\rho}\,\sigma_\mu^2, \quad \Omega_{22} = \frac{2(1-\rho^2)^3\rho^2}{\mu^2(1+\rho^2)^3}\,\sigma^2 + \frac{(1-\rho^2)^3}{\rho^2}\,\frac{\sigma_\mu^2}{\mu^2}.$$
Theorem 1 (ii)
Let
$$\log y_{N,t} = \log\mu + (t-1)\log\rho + v_{N,t}^{*},$$
where
$$v_{N,t}^{*} = \frac{u_{N,t}}{\mu} + \frac{\mu_N - \mu}{\mu} + O_p(N^{-1}).$$
Write the expectation of the covariance and variance matrix as
$$E\left[\begin{pmatrix}\sum_{t=1}^{T} v_{N,t}^{*}\\ \sum_{t=1}^{T}(t-1)v_{N,t}^{*}\end{pmatrix}\begin{pmatrix}\sum_{t=1}^{T} v_{N,t}^{*}\\ \sum_{t=1}^{T}(t-1)v_{N,t}^{*}\end{pmatrix}'\right] = \begin{pmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{12} & \sigma_{22}\end{pmatrix},$$
where
$$\sigma_{11} = \frac{1}{N\mu^2}\left(T\sigma^2 + T^2\sigma_\mu^2\right), \quad \sigma_{12} = \frac{1}{N\mu^2}\left[\left(\tfrac{1}{2}T^2 - \tfrac{1}{2}T\right)\sigma^2 + \left(\tfrac{1}{2}T^2 - \tfrac{1}{2}T\right)T\sigma_\mu^2\right], \quad \sigma_{22} = \frac{1}{N\mu^2}\left[\left(\tfrac{1}{6}T - \tfrac{1}{2}T^2 + \tfrac{1}{3}T^3\right)\sigma^2 + \left(\tfrac{1}{2}T^2 - \tfrac{1}{2}T\right)^2\sigma_\mu^2\right].$$
Therefore, the limiting distribution of the LS estimators in the logged trend regression is given by
$$\begin{pmatrix}\sqrt{N}\left(\widehat{\log\mu} - \log\mu\right)\\ \sqrt{N}\,T^{3/2}\left(\widehat{\log\rho} - \log\rho\right)\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \frac{1}{\mu^2}\begin{pmatrix}\sigma_\mu^2 & 0\\ 0 & 12\sigma^2\end{pmatrix}\right).$$
By using Delta method, it is easy to show that
$$\begin{pmatrix}\sqrt{N}(\hat\mu - \mu)\\ \sqrt{N}\,T^{3/2}(\hat\rho - \rho)\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}\sigma_\mu^2 & 0\\ 0 & 12\rho^2\sigma^2/\mu^2\end{pmatrix}\right).$$
Alternatively, the same limiting distribution can be derived by redefining the regression parameters as
$$\log y_{N,t} = \log\mu_N + (t-1)\log\rho + v_{N,t},$$
where
$$v_{N,t} = \frac{u_{N,t}}{\mu_N} + O_p(N^{-1}).$$
Note that
$$E\left(\sum_{t=1}^{T} v_{N,t}\right)^2 = E\left[\frac{\left(\frac{1}{N}\sum_{i=1}^{N}\sum_{t=1}^{T} u_{it}\right)^2}{\left(\frac{1}{N}\sum_{i=1}^{N}\mu_i\right)^2}\right] = \frac{1}{N\mu^2}\,T\sigma^2,$$
$$E\left[\left(\sum_{t=1}^{T} v_{N,t}\right)\left(\sum_{t=1}^{T}(t-1)v_{N,t}\right)\right] = \frac{1}{N\mu^2}\left(\tfrac{1}{2}T^2 - \tfrac{1}{2}T\right)\sigma^2,$$
$$E\left(\sum_{t=1}^{T}(t-1)v_{N,t}\right)^2 = \frac{1}{N\mu^2}\left(\tfrac{1}{6}T - \tfrac{1}{2}T^2 + \tfrac{1}{3}T^3\right)\sigma^2,$$
so that the limiting distributions can be rewritten as
$$\begin{pmatrix}\sqrt{NT}\left(\widehat{\log\mu_N} - \log\mu_N\right)\\ \sqrt{N}\,T^{3/2}\left(\widehat{\log\rho} - \log\rho\right)\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \frac{1}{\mu^2}\begin{pmatrix}4\sigma^2 & -6\sigma^2\\ -6\sigma^2 & 12\sigma^2\end{pmatrix}\right).$$
By using the delta method, the limiting distributions of $\hat\mu_N - \mu_N$ and $\hat\rho - \rho$ can be rewritten as
$$\begin{pmatrix}\sqrt{NT}(\hat\mu_N - \mu_N)\\ \sqrt{N}\,T^{3/2}(\hat\rho - \rho)\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}4\sigma^2 & -6\rho\sigma^2/\mu\\ -6\rho\sigma^2/\mu & 12\rho^2\sigma^2/\mu^2\end{pmatrix}\right).$$
Finally, since $\hat\mu_N - \mu_N = (\hat\mu_N - \mu) - (\mu_N - \mu)$, the limiting distributions of $\hat\mu_N - \mu$ and $\hat\rho - \rho$ are given by (25).

Appendix B. Proof of Remark 5

Note that the initial condition for the expectation error is nonstationary; that is, the second moments of $u_{it}$ depend on $t$:
$$E[u_{it}^2] = \frac{1-\phi^{2t}}{1-\phi^2}\,\sigma^2, \qquad E[u_{it}u_{it-j}] = \phi^{j}\,\frac{1-\phi^{2(t-j)}}{1-\phi^2}\,\sigma^2.$$
Define
$$\Omega_1 = N\times E\left[\begin{pmatrix}T^{-1}\sum_{t=1}^{T} v_{N,t}\\ T^{-1}\sum_{t=1}^{T}(t-1)v_{N,t}\end{pmatrix}\begin{pmatrix}T^{-1}\sum_{t=1}^{T} v_{N,t}\\ T^{-1}\sum_{t=1}^{T}(t-1)v_{N,t}\end{pmatrix}'\right] = \begin{pmatrix}\omega_{11} & \omega_{12}\\ \omega_{12} & \omega_{22}\end{pmatrix},$$
where $v_{N,t} = u_{N,t}\,t^{-\beta}\rho^{1-t}/\mu_N$.
Each element of the covariance matrix $\Omega_1$ is given as follows:
$$\omega_{11} := N\times E\left(T^{-1}\sum_{t=1}^{T} v_{N,t}\right)^2 = \frac{\sigma^2}{\mu^2 T^2}\sum_{t=1}^{T}\frac{1-\phi^{2t}}{1-\phi^2}\,t^{-2\beta}\rho^{2-2t} + \frac{2\sigma^2}{\mu^2 T^2}\sum_{j=1}^{T}\sum_{t=j+1}^{T}\phi^{t-j}\,\frac{1-\phi^{2j}}{1-\phi^2}\,t^{-\beta}j^{-\beta}\rho^{2-t-j},$$
since
$$E\left(\sum_{t=1}^{T} u_{it}\,t^{-\beta}\rho^{-t}\right)^2 = \sum_{t=1}^{T}\frac{1-\phi^{2t}}{1-\phi^2}\,t^{-2\beta}\rho^{-2t}\,\sigma^2 + 2\sum_{j=1}^{T}\sum_{t=j+1}^{T}\phi^{t-j}\,\frac{1-\phi^{2j}}{1-\phi^2}\,t^{-\beta}j^{-\beta}\rho^{-t-j}\,\sigma^2.$$
In addition,
$$\omega_{12} = N\times E\left[\left(T^{-1}\sum_{t=1}^{T} v_{N,t}\right)\left(T^{-1}\sum_{t=1}^{T}(t-1)v_{N,t}\right)\right] = \frac{\sigma^2}{\mu^2 T^2}\left[\sum_{t=1}^{T}(t-1)\frac{1-\phi^{2t}}{1-\phi^2}\,t^{-2\beta}\rho^{2-2t} + \sum_{j=1}^{T}\sum_{t=j+1}^{T}(t+j-2)\,\phi^{t-j}\,\frac{1-\phi^{2j}}{1-\phi^2}\,t^{-\beta}j^{-\beta}\rho^{2-t-j}\right],$$
$$\omega_{22} = N\times E\left(T^{-1}\sum_{t=1}^{T}(t-1)v_{N,t}\right)^2 = \frac{\sigma^2}{\mu^2 T^2}\left[\sum_{t=1}^{T}(t-1)^2\frac{1-\phi^{2t}}{1-\phi^2}\,t^{-2\beta}\rho^{2-2t} + 2\sum_{j=1}^{T}\sum_{t=j+1}^{T}(t-1)(j-1)\,\phi^{t-j}\,\frac{1-\phi^{2j}}{1-\phi^2}\,t^{-\beta}j^{-\beta}\rho^{2-t-j}\right].$$
Then, the limiting distribution of $\widehat{\log\mu_N}$ is given by
$$\sqrt{N}\left(\widehat{\log\mu_N} - \log\mu\right) \xrightarrow{d} N\left(0,\ \omega_{11} + \sigma_\mu^2/\mu^2\right).$$
Hence, the joint limiting distribution can be written as
$$\sqrt{N}\begin{pmatrix}\widehat{\log\mu_N} - \log\mu\\ \widehat{\log\rho} - \log\rho\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \Omega^{-}\right) \quad \text{for } \Omega^{-} = \begin{pmatrix}\omega_{11} + \sigma_\mu^2/\mu^2 & \omega_{12}\\ \omega_{12} & \omega_{22}\end{pmatrix}.$$

Appendix C. Proof of Remark 6

As $t\to\infty$, the effect of the nonstationary initial condition, the $\rho^{t}$ term, dies out very quickly. Assumption A.4 ensures that the decay rate does not die out even when $t\to\infty$. The following lemma is helpful in deriving the limiting distribution. Let $r \in [0, 1]$, so that $t/T \to r$ as $T\to\infty$.
Lemma A1.
Let $\rho_T = \exp(\eta/T)$ for $-\varepsilon < \eta < 0$ with any small $\varepsilon > 0$. Then, $\rho_T^{2t} = 1 + 2\eta r + o(r^2)$.
Proof of Lemma A1.
By definition, $\ln\rho_T = \eta/T$, so that the following holds:
$$2t\ln\rho_T = 2\eta\,\frac{t}{T} = 2\eta r = \ln(1 + 2\eta r) + o(\eta^2 r^2).$$
Hence,
$$\rho_T^{2t} = 1 + 2\eta r + o(r^2) \quad \text{and} \quad \rho_T^{-2t} = \frac{1}{1+2\eta r} + o(\eta^2 r^2).$$
 ☐
Proof of Remark 6.
Let
$$\begin{pmatrix}\widehat{\log\mu_N} - \log\mu\\ \widehat{\log\rho} - \log\rho\end{pmatrix} = \begin{pmatrix}T^{-1}\sum_{t=1}^{T} 1 & T^{-1}\sum_{t=0}^{T-1} t\\ T^{-1}\sum_{t=0}^{T-1} t & T^{-1}\sum_{t=0}^{T-1} t^2\end{pmatrix}^{-1}\begin{pmatrix}T^{-1}\sum_{t=1}^{T} v_{N,t}\\ T^{-1}\sum_{t=1}^{T}(t-1)v_{N,t}\end{pmatrix}.$$
Since $\mu_i$ is independent of $u_{it}$, the asymptotic variances of the components of the last vector are given by
$$E\left(T^{-1}\sum_{t=1}^{T} v_{N,t}\right)^2 = E\left(\frac{1}{N}\sum_{i=1}^{N}\frac{\mu_i - \mu}{\mu} + \frac{1}{TN}\sum_{i=1}^{N}\sum_{t=1}^{T}\frac{u_{it}\,t^{-\beta}\rho_T^{1-t}}{\mu}\right)^2 = \frac{1}{N}\,\frac{\sigma_\mu^2}{\mu^2} + \frac{1}{T^2 N^2}\sum_{i=1}^{N}\frac{\omega_{ii}^2}{\mu^2}\sum_{t=0}^{T-1}\left(t^{-\beta}\rho_T^{-t}\right)^2 \approx \frac{1}{N}\,\frac{\sigma_\mu^2}{\mu^2} + \frac{T^{-1-2\beta}}{N}\,\frac{\omega^2}{\mu^2}\int_0^1\frac{r^{-2\beta}}{1+2\eta r}\,dr,$$
since
$$\frac{1}{T^2}\sum_{t=0}^{T-1}\left(t^{-\beta}\rho_T^{-t}\right)^2 = T^{-1-2\beta}\,\frac{1}{T}\sum_{t=0}^{T-1}\left(\frac{t}{T}\right)^{-2\beta}\rho_T^{-2t} \to T^{-1-2\beta}\int_0^1\frac{r^{-2\beta}}{1+2\eta r}\,dr.$$
In addition,
$$E\left[\left(T^{-1}\sum_{t=1}^{T} v_{N,t}\right)\left(T^{-1}\sum_{t=1}^{T}(t-1)v_{N,t}\right)\right] = \frac{T}{2N}\,\frac{\sigma_\mu^2}{\mu^2} + \frac{1}{T^2 N^2}\sum_{i=1}^{N}\frac{\omega_{ii}^2}{\mu^2}\sum_{t=0}^{T-1} t^{1-2\beta}\rho_T^{-2t} \approx \frac{T}{2N}\,\frac{\sigma_\mu^2}{\mu^2} + \frac{T^{-2\beta}}{N}\,\frac{\omega^2}{\mu^2}\int_0^1\frac{r^{1-2\beta}}{1+2\eta r}\,dr,$$
$$E\left(T^{-1}\sum_{t=1}^{T}(t-1)v_{N,t}\right)^2 = \frac{T^2}{4N}\,\frac{\sigma_\mu^2}{\mu^2} + \frac{1}{T^2 N^2}\sum_{i=1}^{N}\frac{\omega_{ii}^2}{\mu^2}\sum_{t=0}^{T-1} t^{2-2\beta}\rho_T^{-2t} \approx \frac{T^2}{4N}\,\frac{\sigma_\mu^2}{\mu^2} + \frac{T^{1-2\beta}}{N}\,\frac{\omega^2}{\mu^2}\int_0^1\frac{r^{2-2\beta}}{1+2\eta r}\,dr.$$
Therefore,
$$\begin{pmatrix}\sqrt{N}\left(\widehat{\log\mu_N} - \log\mu\right)\\ \sqrt{NT^{3+2\beta}}\left(\widehat{\log\rho} - \log\rho\right)\end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}\sigma_\mu^2/\mu^2 & 0\\ 0 & 36\,\frac{\omega^2}{\mu^2}\int_0^1\frac{r^{-2\beta}(1-2r)^2}{1+2\eta r}\,dr\end{pmatrix}\right).$$
 ☐

References

  1. Ambrus, Attila, and Ben Greiner. 2012. Imperfect Public Monitoring with Costly Punishment: An Experimental Study. American Economic Review 102: 3317–32. [Google Scholar] [CrossRef]
  2. Andreoni, James. 1988. Why Free Ride? Strategies and Learning in Public Goods Experiments. Journal of Public Economics 37: 291–304. [Google Scholar] [CrossRef]
  3. Andreoni, James, and Rachel Croson. 2008. Partners versus Strangers: Random Rematching in Public Goods Experiments. In Handbook of Experimental Economics Results. Amsterdam: Elsevier. [Google Scholar]
  4. Ashley, Richard, Sheryl Ball, and Catherine Eckel. 2010. Motives for Giving: A Reanalysis of Two Classic Public Goods Experiments. Southern Economic Journal 77: 15–26. [Google Scholar] [CrossRef]
  5. Chao, John, Myungsup Kim, and Donggyu Sul. 2014. Mean Average Estimation of Dynamic Panel Models with Nonstationary Initial Condition. In Advances in Econometrics. Bingley: Emerald Group Publishing Limited, vol. 33, pp. 241–79. [Google Scholar]
  6. Croson, Rachel. 1996. Partners and Strangers Revisited. Economics Letters 53: 25–32. [Google Scholar] [CrossRef]
  7. Dal Bó, Pedro, and Guillaume R. Fréchette. 2011. The Evolution of Cooperation in Infinitely Repeated Games: Experimental Evidence. American Economic Review 101: 411–29. [Google Scholar] [CrossRef]
  8. Keser, Claudia, and Frans Van Winden. 2000. Conditional Cooperation and Voluntary Contributions to Public Goods. Scandinavian Journal of Economics 102: 23–39. [Google Scholar] [CrossRef] [Green Version]
  9. Kong, Jianning, and Donggyu Sul. 2013. Estimation of Treatment Effects under Multiple Equilibria in Repeated Public Good Experiments. New York: mimeo, Richardson: University of Texas at Dallas. [Google Scholar]
  10. Kong, Jianning, Peter C. B. Phillips, and Donggyu Sul. 2018. Weak σ- Convergence: Theory and Applications. New York: mimeo, Richardson: University of Texas at Dallas. [Google Scholar]
  11. Ledyard, John O. 1995. Public Goods: A Survey of Experimental Research. In The Handbook of Experimental Economics. Edited by John H. Kagel and Alvin E. Roth. Princeton: Princeton University Press. [Google Scholar]
  12. Malinvaud, Edmond. 1970. The Consistency of Nonlinear Regressions. Annals of Mathematical Statistics 41: 956–69. [Google Scholar] [CrossRef]
  13. Newey, Whitney K., and Kenneth D. West. 1987. A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica 55: 703–8. [Google Scholar] [CrossRef]
  14. Phillips, Peter C. B., and Donggyu Sul. 2007. Transition modeling and econometric convergence tests. Econometrica 75: 1771–855. [Google Scholar] [CrossRef]
  15. Wu, Chien-Fu. 1981. Asymptotic Theory of Nonlinear Least Squares Estimation. Annals of Statistics 9: 501–13. [Google Scholar] [CrossRef]
1
When the cross-sectional average is increasing over rounds (but $y_{it}$ is weakly $\sigma$-converging), the trend regression needs to be modified as $\log(1 - y_{Nt}) = \log(1-\mu) + (t-1)\log\rho + \text{error}$. Furthermore, the long run overall outcome becomes $T - (1-\mu)/(1-\rho)$.
2
Note that dynamic panel regressions or dynamic truncated regressions are invalid since usually the decay rate—AR(1) coefficient—is assumed to be homogeneous across different games. In addition, see Chao et al. (2014) for additional issues regarding the estimation of the dynamic panel regression under non-stationary initial conditions.
3
Croson (1996) designed two sequences of the partners and strangers games, with ten rounds in each sequence. In this paper, only the data from the first sequence are used, for consistent comparison.
4
Assume that the sample cross-sectional average estimates the unknown common stochastic function $\pi_{s,t}$ consistently for every $t$ as $N\to\infty$. Then, by using a conventional spline method, we can approximate the unknown $\pi_{s,t}$. The overall effect can be estimated by the sum of the approximated function $\hat\pi_{s,t}$ and evaluated statistically by using an HAC estimator as defined by Newey and West (1987). However, this technical approach does not provide any statistical advantage over the AR(1) fitting, which we discuss in the next section.
5. The decay functions for the controlled and treated experiments are set as $\pi_{c,t} = 0.6 \times 0.8^{t-1}$ and $\pi_{\tau,t} = 0.4 \times 0.9^{t-1}$, respectively.
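Summing these two geometric paths gives the implied long-run overall outcomes, a worked consequence of the chosen parameters (the treated path starts lower but decays more slowly, so its long-run total is larger):
```latex
\sum_{t=1}^{\infty}\pi_{c,t}=\frac{0.6}{1-0.8}=3,
\qquad
\sum_{t=1}^{\infty}\pi_{\tau,t}=\frac{0.4}{1-0.9}=4 .
```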
6. However, this is not always the case. More recent experimental studies show widely heterogeneous, divergent behaviors. See Kong and Sul (2013) for a detailed discussion.
7. Note that if the condition in (9) does not hold, then the variance of $e_{N,t}$ is increasing over time whenever the variance of $\epsilon_{N,t}$ is time invariant. To be specific, let $E\epsilon_{N,t}^{2} = \sigma_{\epsilon}^{2}$ for all $t$. Then $E e_{N,t}^{2} = \sigma_{\epsilon}^{2}(1-\rho^{t})/(1-\rho)$, which is an increasing function of $t$.
8. If $\rho = 0$, $\tau$ can be identified as long as $n_3$ is known, but $\varphi$ cannot be identified. If $\rho = 1$, $\tau$ and $\varphi$ cannot be identified jointly.
9. Since $\lim_{N,T\to\infty} E\big[N^{-1}\sum_{i=1}^{N}\big(\sum_{t=1}^{T} u_{it}\rho^{t-1}\big)^{2}\big] = \sigma^{2}/(1-\rho^{2})$, where $\sigma^{2} = N^{-1}\sum_{i=1}^{N}\sigma_{i}^{2}$, under (11) we have $\sqrt{N}\,e_{N,T} \to_{d} N\big(0,\,\sigma^{2}(1-\rho^{2})^{-1}\big)$ as $N,T\to\infty$ jointly.
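The variance limit in this note follows in one step if the $u_{it}$ are serially uncorrelated with $E u_{it}^{2} = \sigma_i^{2}$ (an assumption made explicit here for the sketch):
```latex
E\Big(\sum_{t=1}^{T}u_{it}\,\rho^{t-1}\Big)^{2}
=\sigma_i^{2}\sum_{t=1}^{T}\rho^{2(t-1)}
\;\longrightarrow\;\frac{\sigma_i^{2}}{1-\rho^{2}}
\quad\text{as } T\to\infty ,
```
and averaging over $i$ then gives $\sigma^{2}/(1-\rho^{2})$.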
10. If subjects do not know the number of repetitions, the dominant strategy could change to the Pareto optimum, as in an infinitely repeated game. See Dal Bó and Fréchette (2011) for a more detailed discussion.
11. More precisely, we assume that the fraction of free riders becomes unity in the long run as the number of subjects goes to infinity. This assumption allows for a few outliers, such as altruists. As long as the number of altruists does not increase with the number of subjects, the asymptotics studied in the next section remain valid.
12. Taking logarithms on both sides of Equation (16) yields $\log y_{N,t} = \log\big(\mu_{N}\rho^{t-1} + e_{N,t}\big) = \log\big[\mu_{N}\rho^{t-1}\{1 + e_{N,t}/(\mu_{N}\rho^{t-1})\}\big] = \log\mu_{N} + (t-1)\log\rho + \log\{1 + e_{N,t}/(\mu_{N}\rho^{t-1})\}$.
13. We will show later that the point estimates of $\rho$ in all three empirical examples are around 0.9. However, the choice of $\rho$ does not matter much when comparing the two variances.
Figure 1. Average contributions over rounds from Croson (1996).
Figure 2. Overall treatment effects.
Figure 3. Overall treatment effects.
Table 1. Data description.

                        Croson (Note 3)          Keser & van Winden
                      Strangers   Partners     Strangers   Partners
Subjects (no.)            24          24           120          40
MPCR                      0.5         0.5          0.5          0.5
G                         4           4            4            4
T                        10          10           25           25
e                        25          25           10           10

Note: MPCR = marginal per capita return; G = group size; T = number of rounds; e = per-round endowment in tokens.
Table 2. Estimation of trend regressions and homogeneity tests.

                                    Croson               Keser & van Winden
                            Partners    Strangers      Partners    Strangers
Total number of subjects       24           24            120          40
Number of pure altruists        1            0              0           1
$\hat{\gamma} \times 10$    −0.098       −0.043         −0.049      −0.016
(s.e. × 10)                  0.007        0.020          0.007       0.009
$\hat{\mu}$                  0.614        0.459          0.618       0.381
(s.e.)                       0.080        0.081          0.033       0.036
$\hat{\rho}$                 0.912        0.884          0.972       0.934
(s.e.)                       0.021        0.018          0.014       0.002
Table 3. Asymptotic treatment effects.

Group/Treatment                Croson          Keser & van Winden
$\hat{\pi}_p$ (s.e.)       7.000 (2.309)        20.74 (12.18)
$\hat{\pi}_s$ (s.e.)       3.951 (1.051)        5.798 (0.617)
$\hat{\Pi}$ (s.e.)         3.050 (2.537)        14.94 (12.20)

Note: $\hat{\Pi} = \hat{\pi}_p - \hat{\pi}_s$.
Table 4. Comparison between NLS and LS estimators ($\sigma^{2} = 0.03$, $\rho = 0.9$, $\mu = 0.5$).

                        E($\tilde{\mu}$)   E($\tilde{\rho}$)   V($\tilde{\mu}$) × 10³   V($\tilde{\rho}$) × 10⁵
$\sigma_\mu^2$  N    T    NLS     LS        NLS     LS           NLS     LS              NLS     LS
0.15           25   10   0.499   0.499     0.900   0.900        4.205   4.188           4.927   3.891
0.15           50   10   0.500   0.500     0.900   0.900        2.262   2.252           2.466   1.910
0.15          100   10   0.500   0.500     0.900   0.900        1.169   1.159           1.150   0.900
0.15          200   10   0.500   0.500     0.900   0.900        0.567   0.557           0.600   0.455
0.15           25   20   0.501   0.500     0.900   0.900        4.518   4.442           1.259   0.502
0.15           50   20   0.501   0.500     0.900   0.900        2.328   2.261           0.588   0.224
0.15          100   20   0.500   0.500     0.900   0.900        1.132   1.109           0.296   0.116
0.15          200   20   0.501   0.500     0.900   0.900        0.549   0.535           0.149   0.056
0.12           25   10   0.500   0.499     0.900   0.900        3.752   3.736           5.030   3.977
0.12           50   10   0.500   0.500     0.900   0.900        2.025   2.014           2.535   1.958
0.12          100   10   0.500   0.500     0.900   0.900        1.050   1.038           1.186   0.931
0.12          200   10   0.500   0.500     0.900   0.900        0.510   0.499           0.620   0.470
0.12           25   20   0.501   0.500     0.900   0.900        4.041   3.964           1.286   0.516
0.12           50   20   0.501   0.500     0.900   0.900        2.076   2.008           0.607   0.231
0.12          100   20   0.500   0.500     0.900   0.900        1.012   0.987           0.307   0.120
0.12          200   20   0.501   0.500     0.900   0.900        0.491   0.476           0.155   0.058
0.10           25   10   0.500   0.499     0.900   0.900        3.392   3.376           5.118   4.056
0.10           50   10   0.500   0.500     0.900   0.900        1.834   1.822           2.591   1.996
0.10          100   10   0.500   0.500     0.900   0.900        0.951   0.939           1.218   0.958
0.10          200   10   0.500   0.500     0.900   0.900        0.463   0.453           0.636   0.482
0.10           25   20   0.501   0.500     0.900   0.900        3.655   3.576           1.311   0.526
0.10           50   20   0.501   0.500     0.900   0.900        1.872   1.802           0.620   0.237
0.10          100   20   0.500   0.500     0.900   0.900        0.915   0.889           0.313   0.122
0.10          200   20   0.501   0.500     0.900   0.900        0.444   0.430           0.158   0.060

Note: NLS columns report the nonlinear least squares estimates ($\hat{\mu}_{\text{nls}}$, $\hat{\rho}_{\text{nls}}$); LS columns report the log-linear trend regression estimates ($\hat{\mu}$, $\hat{\rho}$).
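The patterns in Table 4 can be reproduced in outline by a small Monte Carlo. The sketch below assumes the data-generating process $y_{it} = \mu_i\rho^{t-1} + u_{it}$ with $\mu_i \sim N(\mu, \sigma_\mu^2)$ and $u_{it} \sim N(0, \sigma^2)$; the paper's exact simulation design may differ:
```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
mu, rho, sig2, sig2_mu = 0.5, 0.9, 0.03, 0.15
N, T, reps = 25, 10, 2000
t = np.arange(1, T + 1)

nls_est, ls_est = [], []
for _ in range(reps):
    mu_i = mu + np.sqrt(sig2_mu) * rng.standard_normal((N, 1))
    y = mu_i * rho ** (t - 1) + np.sqrt(sig2) * rng.standard_normal((N, T))
    ybar = y.mean(axis=0)  # cross-sectional averages per round

    # NLS: fit ybar_t = mu * rho**(t-1) directly
    (mu_nls, rho_nls), _ = curve_fit(lambda tt, m, r: m * r ** (tt - 1),
                                     t, ybar, p0=(0.5, 0.9))
    nls_est.append((mu_nls, rho_nls))

    # LS: regress log(ybar_t) on (t-1); slope = log(rho), intercept = log(mu)
    slope, intercept = np.polyfit(t - 1, np.log(ybar), 1)
    ls_est.append((np.exp(intercept), np.exp(slope)))

print("E[mu, rho] NLS:", np.mean(nls_est, axis=0))
print("E[mu, rho] LS: ", np.mean(ls_est, axis=0))
print("V[mu, rho] NLS:", np.var(nls_est, axis=0))
print("V[mu, rho] LS: ", np.var(ls_est, axis=0))
```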
Table 5. Size and power of the test.

Size (5%): $\rho_1 = \rho_2 = 0.9$, $\mu_1 = \mu_2 = 0.5$

                       σ² = 0.05               σ² = 0.03               σ² = 0.01
$\sigma_\mu^2$   N    T=10   T=15   T=20     T=10   T=15   T=20     T=10   T=15   T=20
0.15            25    0.018  0.040  0.042    0.028  0.051  0.050    0.041  0.062  0.058
0.15            50    0.018  0.023  0.035    0.025  0.031  0.044    0.043  0.042  0.052
0.15           100    0.018  0.027  0.040    0.024  0.034  0.045    0.038  0.047  0.051
0.15           200    0.016  0.028  0.036    0.023  0.032  0.043    0.040  0.039  0.051
0.12            25    0.015  0.035  0.041    0.026  0.047  0.048    0.038  0.062  0.056
0.12            50    0.016  0.020  0.034    0.022  0.027  0.041    0.040  0.042  0.052
0.12           100    0.014  0.025  0.038    0.021  0.032  0.045    0.036  0.048  0.051
0.12           200    0.012  0.024  0.033    0.022  0.030  0.038    0.038  0.041  0.052
0.10            25    0.013  0.030  0.039    0.022  0.041  0.046    0.038  0.061  0.054
0.10            50    0.014  0.020  0.031    0.021  0.024  0.037    0.038  0.039  0.049
0.10           100    0.012  0.023  0.033    0.019  0.031  0.042    0.036  0.047  0.050
0.10           200    0.010  0.020  0.031    0.018  0.028  0.036    0.037  0.040  0.049

Power (5%): $\rho_1 = 0.9$, $\rho_2 = 0.85$, $\mu_1 = \mu_2 = 0.5$

0.15            25    0.392  0.476  0.534    0.472  0.517  0.562    0.566  0.558  0.597
0.15            50    0.715  0.801  0.812    0.769  0.828  0.829    0.841  0.853  0.845
0.15           100    0.960  0.980  0.983    0.971  0.986  0.986    0.985  0.989  0.990
0.15           200    1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000
0.12            25    0.422  0.516  0.571    0.509  0.556  0.604    0.612  0.606  0.632
0.12            50    0.743  0.835  0.845    0.806  0.860  0.859    0.872  0.893  0.877
0.12           100    0.967  0.990  0.990    0.982  0.993  0.993    0.993  0.994  0.996
0.12           200    1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000
0.10            25    0.446  0.546  0.603    0.537  0.599  0.642    0.647  0.656  0.677
0.10            50    0.776  0.865  0.873    0.839  0.895  0.890    0.904  0.929  0.909
0.10           100    0.979  0.994  0.996    0.988  0.995  0.998    0.998  0.999  0.998
0.10           200    1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000