3.1. Methodology
As mentioned above, numerous techniques, both parametric and nonparametric, have emerged in operational research for the measurement of efficiency. Among the nonparametric techniques used to estimate production frontiers and evaluate efficiency is data envelopment analysis (DEA), which objectifies the results of an entity by measuring them against the best results achieved by its peers.
DEA was developed for the manufacture of physical goods, where the inputs are those of the production process and the outputs are the units produced or the services provided. In the case of SEs, the "alternative" production process seeks the conversion of economic results into social achievements. Although many SEs do not pursue the optimization of this transformation process as their main objective, it allows this process to be analyzed in terms of efficiency [32].
We applied a semiparametric two-stage double bootstrap DEA approach, specifically Algorithm II developed by Simar and Wilson [3]. In the first stage, both efficiency scores and confidence intervals were calculated by combining the classic DEA model with the bootstrap procedure. In the second stage, the efficiency estimates were regressed on a set of environmental variables using truncated regression with bootstrap.
First stage: DEA efficiency scores
DEA estimates the production frontier by using linear programming techniques that calculate the efficiency score of a DMU (decision-making unit) with respect to homogeneous entities.
Our study focused on the Galician SWs, a homogeneous population of SEs that present similar characteristics and belong to one production frontier: (i) common social mission (employing people with disabilities), (ii) similar activities (the vast majority belong to the services sector and carry out activities with few qualifications and low added value), (iii) similar regulatory environment, and (iv) similar legal form.
There are many DEA models proposed in the literature (static, dynamic, with different returns to scale and orientations), as well as a variety of inputs and outputs proposed for the first-stage analysis and exogenous variables to be included in the second stage; however, there is no established guide for choosing the model and the appropriate variables at each stage [33].
We used the CCR (Charnes, Cooper, and Rhodes) [34] and BCC (Banker, Charnes, and Cooper) [35] models. Both are based on radial efficiency measurements and can be applied with either orientation (input or output). The CCR model assumes constant returns to scale and the BCC model variable returns to scale. The latter assumes that the evaluated entity may be operating under the variable-returns-to-scale hypothesis, implying that the relative efficiency of each DMU is obtained by comparing that DMU with those that have produced efficient results and possess similar operational dimensions.
We chose the output orientation, since economic units usually aim to maximize profits with an adequate combination of productive factors (inputs). From the point of view of SWs, the output approach is also the more appropriate one.
DEA, being a nonparametric method, does not require prior knowledge of the production function, supports production units with multiple inputs and outputs, does not depend on parameters that determine a priori the relationship between them, and assumes the data are known with certainty. The linear programming problem must be solved for each of the SWs.
The closer its score is to unity, the more efficient a company is; the most efficient SWs lie on the frontier. Inefficiency is measured by the distance between the SW and the efficient frontier.
For each period t (t = 1, …, T), we considered a group of n DMUs (i = 1, 2, …, n), each of which produces a set of q outputs (r = 1, 2, …, q), Yᵢₜ = {yᵣᵢₜ}, and consumes p inputs (j = 1, 2, …, p), Xᵢₜ = {xⱼᵢₜ}.
The mathematical formulation of the CCR output-oriented model for estimating the efficiency of a decision-making unit 0 (DMU₀) is as follows:

max ϕ
subject to
Σᵢ λᵢ xⱼᵢₜ ≤ xⱼ₀ₜ,  j = 1, …, p,
Σᵢ λᵢ yᵣᵢₜ ≥ ϕ yᵣ₀ₜ,  r = 1, …, q,
λᵢ ≥ 0,  i = 1, …, n,

where ϕ is the efficiency score and λᵢ is the intensity weight assigned to DMU i.
Since the CCR model assumes the hypothesis of constant returns to scale, and in order to avoid the difficulties associated with measuring efficiency in units biased by scale inefficiencies, Banker, Charnes, and Cooper [35] proposed an alternative model (the BCC model), which assumes the variable-returns-to-scale hypothesis by adding the convexity constraint

Σᵢ λᵢ = 1

to the CCR model. The BCC model calculates pure technical efficiency (PTE) scores that take into account the scale of operations of the efficient companies against which each evaluated DMU is compared.
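Both radial models reduce to one linear program per DMU. The following is a minimal sketch, not the implementation used in the study, of the output-oriented CCR and BCC scores using scipy's linprog; the function name and data layout are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def dea_output(X, Y, vrs=False):
    """Output-oriented radial DEA scores (phi >= 1; phi = 1 is efficient).

    X: (n, p) array of inputs, Y: (n, q) array of outputs.
    vrs=False gives the CCR (constant returns) score; vrs=True adds the
    BCC convexity constraint sum(lambda) = 1 (variable returns).
    """
    n, p = X.shape
    q = Y.shape[1]
    scores = []
    for k in range(n):
        # decision variables: [phi, lambda_1, ..., lambda_n]; maximize phi
        c = np.r_[-1.0, np.zeros(n)]
        # input rows: sum_i lambda_i * x_ji <= x_jk
        A_in = np.c_[np.zeros((p, 1)), X.T]
        # output rows: phi * y_rk - sum_i lambda_i * y_ri <= 0
        A_out = np.c_[Y[k][:, None], -Y.T]
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.r_[X[k], np.zeros(q)]
        A_eq = np.r_[0.0, np.ones(n)][None, :] if vrs else None
        b_eq = np.array([1.0]) if vrs else None
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=[(None, None)] + [(0, None)] * n)
        scores.append(res.x[0])
    return np.array(scores)
```

A score of 1 places the SW on the frontier; a score of, say, 1.4 means its outputs could be radially expanded by 40% with the observed inputs.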
In the practical application, we used a windows analysis, proposed by Charnes et al. [36], because the empirical study was carried out on panel data, comparing each DMU with itself in different periods of time. We set the window width to 1 year, which is equivalent to dividing the panel data into as many datasets as years (each containing one year's data) and analyzing each dataset separately.
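With a window width of one year, the windows analysis reduces to running an independent cross-sectional DEA on each year's data. A minimal sketch of that split (the "year" field name is illustrative):

```python
from collections import defaultdict

def split_by_year(records):
    """Group panel records (dicts carrying a 'year' field) into yearly
    datasets, one per DEA window of width 1."""
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["year"]].append(rec)
    return dict(by_year)
```

Each yearly dataset is then passed to the DEA model on its own, so every SW is benchmarked only against the same-year cross-section.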
DEA offers several advantages when examining the performance of SWs. In the context of SWs, social and economic indicators are expressed in different measurement units, both nonfinancial and financial. For instance, the number of disadvantaged employees expresses the social performance of SWs, while their economic performance is expressed using monetized financial accounting variables. Measuring economic and social performance together can cause aggregation problems, related to the weightings given to the two aspects, and standardization problems, when handling different objectives with different units of measurement [7].
DEA can be particularly useful in these cases since it does not require prior judgment on the weighting granted to social and economic aspects: DEA is characterized by the "benefit of the doubt" principle, which allows weightings to be specified endogenously. Furthermore, it is invariant to the measurement units of both inputs and outputs, which facilitates combining these important measures in a unified performance index.
In addition to efficiency indicators, DEA provides information for improving the management of inefficient DMUs (reducing inputs, increasing outputs) and for identifying the set of efficient SWs and the slack variables. This information allows us to recommend actions to increase efficiency and competitiveness from both an economic-financial and a social point of view.
DEA methodology also has limitations, the most important being that it tends to generate biased estimates. To correct the problems associated with sampling noise in the resulting DEA efficiency estimators, and within the first stage, we used the procedure proposed by Simar and Wilson [37] to bootstrap the initial efficiency scores and obtain bias-corrected efficiency estimates.
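The bias correction follows the usual bootstrap logic: the bias is estimated as the mean of the bootstrap replicates minus the original estimate, and is then subtracted from that estimate. A minimal sketch with illustrative names:

```python
import numpy as np

def bias_corrected(score, boot_scores):
    """Bootstrap bias correction of a DEA efficiency score.

    bias = mean(bootstrap replicates) - original score, so the
    corrected score equals 2 * original - mean(bootstrap replicates).
    """
    boot_scores = np.asarray(boot_scores, dtype=float)
    bias = boot_scores.mean() - score
    return score - bias
```

Because DEA frontiers are estimated from the sample itself, the raw scores are biased toward the frontier; the correction pushes each score back by the estimated amount of that bias.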
Second stage: Truncated Regression
Next, we applied the Simar and Wilson [3] truncated regression model (Algorithm II), developed to determine the explanatory power that certain exogenous variables have over efficiency levels. This second stage is based on the idea that the efficiency levels of the analyzed SWs depend on a number of environmental factors that are essentially outside their control.
The second-stage regression is given by:

ϕ̃ᵢₜ = Zᵢₜβ + Dₜδ + εᵢₜ

where ϕ̃ᵢₜ is the dependent variable, the bootstrapped bias-corrected efficiency score for DMU i in year t; Zᵢₜ is a vector of environmental variables which is expected to explain the efficiency variations; Dₜ is a vector of year dummies (from 2009 to 2017, with 2008 being the reference year); β and δ are the parameters to be estimated in the second stage, establishing, respectively, the relationship between the independent variables and efficiency and the annual effects on efficiency; and εᵢₜ is an independent error term that follows a normal distribution with zero mean and variance σε², with left truncation at (1 − Zᵢₜβ − Dₜδ).
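The left truncation of the error term keeps every simulated efficiency score at or above 1 (the output-oriented frontier value). For simulation purposes this draw can be sketched with scipy's truncated normal, where zb stands for the fitted value Zᵢₜβ + Dₜδ (names are illustrative):

```python
import numpy as np
from scipy.stats import truncnorm

def draw_truncated_residual(zb, sigma, rng):
    """Draw eps ~ N(0, sigma^2) left-truncated at (1 - zb), so that the
    implied efficiency score zb + eps stays >= 1 (output orientation)."""
    a = (1.0 - zb) / sigma  # lower truncation point in standard-normal units
    return truncnorm.rvs(a, np.inf, loc=0.0, scale=sigma, random_state=rng)
```

Every draw therefore satisfies zb + eps ≥ 1 by construction, which is exactly the support of an output-oriented efficiency score.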
Algorithm II by Simar and Wilson [3] has been applied to the estimation of the regression model using the double bootstrap procedure. The steps of the algorithm are presented below:

1. For each period t = 1, …, T, we computed the estimated efficiency score ϕ̂ᵢₜ for each SW and year t.
2. We used the method of maximum likelihood to obtain estimates β̂ of β, δ̂ of δ, and σ̂ε of σε in the truncated regression of ϕ̂ᵢₜ on Zᵢₜ and Dₜ.
3. For each (i = 1, …, nₜ) and (t = 1, …, T), we replicated the following steps B₁ times to obtain B₁ bootstrap estimates {ϕ̂*ᵢₜ,b, b = 1, …, B₁}:
(a) Generate the residual εᵢₜ from the normal distribution N(0, σ̂ε²) with left truncation at (1 − Zᵢₜβ̂ − Dₜδ̂).
(b) Estimate ϕ*ᵢₜ = Zᵢₜβ̂ + Dₜδ̂ + εᵢₜ.
(c) Produce a pseudo-dataset, where x*ᵢₜ = xᵢₜ and y*ᵢₜ = yᵢₜ(ϕ̂ᵢₜ/ϕ*ᵢₜ).
(d) Use the pseudo-dataset to estimate the pseudo DEA efficiency score ϕ̂*ᵢₜ,b.
4. For each (i = 1, …, nₜ) and (t = 1, …, T), we calculated the bias-corrected efficiency score ϕ̃ᵢₜ = ϕ̂ᵢₜ − biasᵢₜ, where biasᵢₜ is the bootstrap estimator of bias, obtained following Simar and Wilson [38] as the mean of the B₁ bootstrap replicates minus the original score: biasᵢₜ = (1/B₁) Σ_b ϕ̂*ᵢₜ,b − ϕ̂ᵢₜ.
5. We computed a truncated maximum likelihood estimation to regress the bias-corrected efficiency scores ϕ̃ᵢₜ on the context variables Zᵢₜ and Dₜ to obtain estimates β̃ of β, δ̃ of δ, and σ̃ε of σε.
6. We replicated the following steps B₂ times to obtain a set of bootstrap estimates {(β̂*, δ̂*, σ̂*ε)_b, b = 1, …, B₂}:
(a) For each (i = 1, …, nₜ) and (t = 1, …, T), draw εᵢₜ from the N(0, σ̃ε²) distribution with left truncation at (1 − Zᵢₜβ̃ − Dₜδ̃).
(b) For each (i = 1, …, nₜ) and (t = 1, …, T), compute ϕ**ᵢₜ = Zᵢₜβ̃ + Dₜδ̃ + εᵢₜ.
(c) Use the maximum likelihood method to estimate the truncated regression of ϕ**ᵢₜ on Zᵢₜ and Dₜ to obtain estimates β̂* of β, δ̂* of δ, and σ̂*ε of σε.
7. We calculated confidence intervals and standard errors for β, δ, and σε from the bootstrap distributions of β̂*, δ̂*, and σ̂*ε.
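In the final step, confidence intervals and standard errors come directly from the empirical distribution of the B₂ bootstrap replicates of each coefficient. A percentile-based sketch (one coefficient at a time; names are illustrative):

```python
import numpy as np

def percentile_ci(boot_estimates, alpha=0.05):
    """Percentile bootstrap confidence interval and standard error for one
    coefficient, computed from its B2 bootstrap replicates."""
    b = np.asarray(boot_estimates, dtype=float)
    lo, hi = np.quantile(b, [alpha / 2, 1 - alpha / 2])
    return (lo, hi), b.std(ddof=1)
```

A coefficient whose interval excludes zero is then read as a significant environmental effect on efficiency at the chosen level.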
This analysis was carried out for total efficiency, for economic efficiency, and for social efficiency. In addition, as a robustness check, a sensitivity analysis was performed with different input and output variables in the DEA model and, subsequently, in the second stage, the truncated regression was re-estimated for two alternative models.