The association between socioeconomic position and pregnancy outcomes is well established [1]. A variety of indicators of socioeconomic disadvantage, at both the neighbourhood and individual level, have been found to be significantly associated with adverse outcomes across countries and ethnic groups [3]. The most plausible causal mechanisms and mediating factors include smoking, mostly for intrauterine growth restriction, and genital infections and psychosocial factors, mainly for preterm births [1]. However, none of these behavioural determinants fully explains the observed social inequalities, and it is necessary to better understand the aetiology of these inequalities in pregnancy outcomes [4]. Moreover, a recent European comparative study highlighted educational inequalities in the risk of preterm birth in most, but not all, of the cohorts analysed [5], suggesting the presence of other potential determinants.
Another strand of research focuses on contextual factors, understood as both the socioeconomic characteristics of neighbourhoods and environmental pollution. As for the latter, the evidence produced so far on the association between air pollution and different birth outcomes is incomplete [6]. Recent findings add complexity to the picture, reporting evidence of an interactive effect of maternal behaviours and contextual factors on birthweight in Canada [8]. Systematic reviews pointed out that the heterogeneity or absence of association with air pollution reported in some studies may be due to difficulties in either quantifying exposure or adjusting for residential mobility [7].
Despite the amount of literature on these two key determinants, studies that analyse them jointly, in order to understand the single and combined impact of environmental and individual risk factors, are scant [3]. Furthermore, most published studies rely on North American populations or on selected hospital-based birth cohorts, whose results are not directly generalizable to southern European population-based birth cohorts [10].
In a few Italian cities, population-based longitudinal cohorts are available. Starting from there, we established a network of population-based birth cohorts in the cities of Turin, Reggio Emilia, Modena, Bologna, and Rome (northern and central Italy), to study the role of air pollution exposure and socioeconomic status (SES) on term low birthweight, preterm births, small for gestational age, and pregnancy-induced hypertensive complications. In this first article, we outline the methodology and database of the study and the preliminary descriptive results.
2. Materials and Methods
The Italian network of population-based birth cohorts stems from the larger Italian Network of Longitudinal Metropolitan Studies (IN-LiMeS), a multicentre set of metropolitan population cohorts enrolled in nine Italian cities, namely Turin, Venice, Reggio Emilia, Modena, Bologna, Florence, Leghorn, Prato and Rome, for monitoring socioeconomic inequalities in health [11]. The inclusion of Longitudinal Metropolitan Studies in the National Statistical Programme (NSP) complies with the national legislation on the processing of personal data for statistical and scientific research purposes, for each single longitudinal study (old and new releases of the NSP are available at https://www.sistan.it/index.php?id=52).
In order to identify birth cohorts and analyse the role of socioeconomic and environmental factors on pregnancy outcomes, we established a sub-network of cities with available data, resulting in 5 cities from northern and central Italy: Turin, three cities from the Emilia-Romagna (ER) region (Bologna, Reggio Emilia and Modena) and Rome. The Ministry of Health funded this project (grant RF-2011-02352442), named “Long term Exposure to Air pollution and Pregnancy outcomes”, the LEAP study.
The Italian network of population-based birth cohorts gathered demographic, socioeconomic and clinical information at individual level. The minimum core of population data to participate in the network included the following sources, linked to each other through an individual identification code: (1) birth certificates, (2) the municipal population register and (3) hospital discharge records. In addition, to participate in the LEAP study, we needed both geocoded residential addresses of women and small-scale models for residential air pollution exposures.
Based on the availability of data in all cities at the beginning of the project, we used the 2007–2013 birth cohorts and selected all singleton live births to women aged 15–49 years at delivery who were resident in the 5 cities. From municipal registries, we retrieved the complete residential history of the women during the whole pregnancy, starting from the date of the last menstruation, with the twofold objective of selecting only those who lived in the cities for the entire pregnancy and of assigning them a more accurate level of air pollution exposure.
2.2. Socioeconomic, Demographic and Obstetric Characteristics
The dataset contained several maternal socioeconomic and demographic characteristics taken from the birth certificates, such as date of birth, level of education (university, high school, junior high school, primary school), occupational status (employed, unemployed, looking for the 1st occupation, student, housewife, other), marital status (not-married, married, separated, divorced, widowed), and citizenship (Italian, foreigner). It also included some information on maternal obstetric history and on pregnancy and new-borns: date of the last menstrual period, number of previous pregnancies/abortions, parity, gestational age, type of delivery, sex of the new-born, birthweight (g), length (cm), cranial circumference (cm) and Apgar score.
To account for contextual socioeconomic characteristics, we retrieved a composite indicator of deprivation at the census block level and linked it to the census block of the mother’s residence. This indicator, available for both 2001 and 2011 census data, included five elementary components, expressing both material and social deprivation, namely the standardized percentages of low education, unemployment, home tenancy, lone-parent families and overcrowding [12]. We then classified the continuous index into five categories, based on the percentiles of the resident population within each city, so that each category represented about 20% of the city’s population, ordered by deprivation score (1 = least deprived to 5 = most deprived).
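The classification into population-based quintiles can be illustrated with a minimal sketch. It assumes each census block is described by a hypothetical pair (deprivation score, resident population) and assigns a category according to the midpoint of the block's cumulative population share within the city; the function name and the midpoint rule are illustrative assumptions, not the procedure used to build the published index.

```python
from typing import List, Tuple


def deprivation_quintiles(blocks: List[Tuple[float, int]]) -> List[int]:
    """Assign each census block a category from 1 (least) to 5 (most deprived).

    `blocks` holds (deprivation_score, resident_population) pairs for ONE city.
    Categories are population-weighted, so each holds roughly 20% of residents.
    """
    total = sum(pop for _, pop in blocks)
    if total <= 0:
        raise ValueError("total resident population must be positive")

    # Visit blocks from the least to the most deprived score.
    order = sorted(range(len(blocks)), key=lambda i: blocks[i][0])
    categories = [0] * len(blocks)
    cumulative = 0
    for i in order:
        pop = blocks[i][1]
        # Assumption: a block is placed by the midpoint of its population share.
        midpoint_share = (cumulative + pop / 2) / total
        categories[i] = min(int(midpoint_share * 5) + 1, 5)
        cumulative += pop
    return categories


if __name__ == "__main__":
    city = [(0.1, 100), (0.5, 100), (-1.0, 100), (2.0, 100), (1.0, 100)]
    print(deprivation_quintiles(city))
```

Because the cut-offs are computed within each city, a given category is comparable across cities only in relative terms (position in the local distribution), not in absolute deprivation.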
As pregnancy outcomes to be investigated, we computed the following dummy variables: preterm births (births at <37 gestational weeks); low birthweight (i.e., <2500 g at birth) among term births; and small for gestational age (<10th percentile), based on the distribution of new-borns to Italian mothers, by infant sex, gestational age and parity [14]. Moreover, from the hospital discharge records we gathered information on maternal discharges during pregnancy, searching specifically for diagnoses of hypertension (ICD-9: 642), mild preeclampsia (ICD-9: 642.4), severe preeclampsia (ICD-9: 642.5), eclampsia (ICD-9: 642.6), or any preeclampsia-eclampsia (ICD-9: 642.4–642.7). The results on pregnancy-induced hypertensive complications, however, are outside the scope of this paper and will be fully reported in a subsequent paper.
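As an illustration of the outcome definitions above, the following sketch derives the three dummy variables for a single birth. The `Birth` record and the `sga_cutoff_g` field are hypothetical: the cut-off stands for the 10th-percentile weight that would be looked up in the reference charts by infant sex, gestational age and parity, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Birth:
    gestational_weeks: int
    birthweight_g: int
    # Hypothetical input: 10th-percentile weight for this infant's sex,
    # gestational age and parity, taken from an external reference chart.
    sga_cutoff_g: float


def outcome_flags(birth: Birth) -> Dict[str, Optional[bool]]:
    """Return the dummy variables for preterm birth, term LBW and SGA."""
    preterm = birth.gestational_weeks < 37
    # Low birthweight is defined among term births only, so it is left
    # undefined (None) for preterm births rather than coded as False.
    term_lbw = None if preterm else birth.birthweight_g < 2500
    sga = birth.birthweight_g < birth.sga_cutoff_g
    return {"preterm": preterm, "term_lbw": term_lbw, "sga": sga}


if __name__ == "__main__":
    print(outcome_flags(Birth(39, 2400, 2800.0)))
    print(outcome_flags(Birth(35, 2300, 2000.0)))
```

Coding term low birthweight as missing for preterm births keeps the denominator restricted to term births, as the definition requires.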
2.3. Air Pollution Exposure Assessment
2.3.1. Standard Models
Data for Turin and Rome were derived from the European Study of Cohorts for Air Pollution Effects (ESCAPE), which provided land use regression (LUR) models to estimate 2010 annual average concentrations of particulate matter (PM10, PM-coarse, PM2.5, PM2.5-absorbance) and nitrogen oxides (NO2 and NOx) [15]. Briefly, LUR models are useful to estimate the spatial variability of air pollution in urban areas, under the assumption that the spatial variability of pollutant concentrations does not change over time. In Turin and Rome, particulate matter of varying sizes was measured at 20 sites, and nitrogen oxides were measured at 40 sites, in three separate two-week periods (to cover different seasons) over 2010. The three measurements were averaged to estimate the annual average of each pollutant, adjusting for temporal variation by using a centrally located background reference site, which operated for a whole year. Using several variables (i.e., altitude, population density, industrial land use, green space, and traffic flow variables), city-specific LUR models were developed to explain the spatial variation of each measured pollutant. The R² of the models ranged from 0.70 to 0.84 in Rome, and from 0.70 to 0.88 in Turin.
Using the same ESCAPE methodology, LUR models for nitrogen dioxide (NO2) were also developed in Bologna, within the Strategic Programme Environment and Health (supported by the Italian Ministry of Health), and in Reggio Emilia and Modena within the LEAP study; the R² of the models was 0.79, 0.72 and 0.78, respectively. As for PM10 and PM2.5, we used available European LUR models [17], together with dispersion models provided by the local Environmental Protection Agency, to assign exposure values in Bologna, Modena and Reggio Emilia.
The city-specific models were then used to estimate the concentrations of air pollution at the residential coordinates of all of each woman’s addresses, and to calculate her individual average exposure, weighted by the time of residence at each address.
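The time-weighted average described above can be sketched as follows. The sketch assumes that each residence is represented by a hypothetical triple (start date, end date, modelled concentration), with the end date exclusive and periods that do not overlap; the function name and data layout are illustrative.

```python
from datetime import date
from typing import List, Tuple


def time_weighted_exposure(residences: List[Tuple[date, date, float]]) -> float:
    """Average modelled concentration weighted by days spent at each address.

    Each item is (start, end, concentration); `end` is exclusive.
    Assumes the periods do not overlap and together cover the pregnancy.
    """
    total_days = 0
    weighted_sum = 0.0
    for start, end, concentration in residences:
        days = (end - start).days
        if days <= 0:
            raise ValueError("each residence period must last at least one day")
        total_days += days
        weighted_sum += concentration * days
    if total_days == 0:
        raise ValueError("no residence periods supplied")
    return weighted_sum / total_days


if __name__ == "__main__":
    history = [
        (date(2010, 1, 1), date(2010, 1, 11), 40.0),   # 10 days at 40
        (date(2010, 1, 11), date(2010, 1, 31), 10.0),  # 20 days at 10
    ]
    print(time_weighted_exposure(history))
```

For a woman who never moved during pregnancy, the result reduces to the modelled concentration at her single address.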
2.3.2. Back-Extrapolated Models
Since the time windows of vulnerability to environmental exposures could differ for each specific pregnancy outcome, we calculated back-extrapolated exposures during the entire pregnancy, in each trimester, in the first five months, and in the last week of pregnancy. In all the included cities, the concentrations of air pollutants (gaseous and particulate) have decreased slowly but progressively over the last 15–20 years, whereas LUR models describe spatial variation under the hypothesis that it is stable over time. Therefore, in order to account also for the observed temporal variations, we derived additional estimates of exposure with extrapolation techniques. Following suggestions from the literature [19], we used data from routine background monitoring stations (available for PM10, PM2.5 and NO2 from 2007) to temporally adjust the LUR estimates to the periods corresponding to each individual pregnancy and trimester of pregnancy. Briefly, after collecting daily air pollution data from background monitoring sites covering the period 2007–2013 for each city, we followed these steps:
(1) We calculated the “annual” average concentration for the background monitoring sites during the measurement period used to build the standard models (MLUR).
(2) For each day from 2007 to 2013, we calculated the ratio between the daily concentration (DC) and the annual average covering the LUR-models measurement period: Ratio = DC/MLUR.
(3) For each day, we calculated the extrapolated concentration by multiplying the modelled LUR concentration attributed to each subject (CLEAP) by the Ratio: C extrapolated = CLEAP × Ratio.
(4) We calculated both an average exposure during pregnancy, using the extrapolated, temporally adjusted exposures, and trimester-specific exposures.
In order to account for the nine months of pregnancy, we attributed these back-extrapolated exposures only to births that occurred from 2008 to 2013.
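The back-extrapolation steps above can be sketched in a few lines. The sketch applies Ratio = DC/MLUR day by day and averages CLEAP × Ratio over a chosen window (the whole pregnancy or a trimester). The function and argument names are hypothetical, and skipping days without monitoring data is an assumption made here for illustration, not a rule stated in the text.

```python
from datetime import date, timedelta
from statistics import mean
from typing import Dict


def extrapolated_exposure(
    c_leap: float,
    daily_background: Dict[date, float],
    m_lur: float,
    start: date,
    end: date,
) -> float:
    """Mean of C_LEAP x (DC / M_LUR) over the days in [start, end).

    c_leap: modelled LUR concentration at the woman's residence (CLEAP).
    daily_background: daily concentration (DC) at background monitoring sites.
    m_lur: background average over the LUR measurement period (MLUR).
    """
    if m_lur <= 0:
        raise ValueError("m_lur must be positive")
    values = []
    day = start
    while day < end:
        dc = daily_background.get(day)
        if dc is not None:  # assumption: days with no monitor data are skipped
            values.append(c_leap * dc / m_lur)
        day += timedelta(days=1)
    if not values:
        raise ValueError("no background data available in the window")
    return mean(values)


if __name__ == "__main__":
    background = {date(2009, 1, 1): 20.0, date(2009, 1, 2): 40.0}
    print(extrapolated_exposure(45.0, background, 30.0,
                                date(2009, 1, 1), date(2009, 1, 3)))
```

Calling the function with trimester boundaries instead of the conception and delivery dates gives the trimester-specific exposures; the spatial contrast between addresses is carried entirely by CLEAP, while the ratio carries the temporal variation.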
For brevity, however, unless otherwise specified, in this paper we refer to the estimates derived from the standard models.
2.3.3. Additional Exposure Models
In addition to particulate matter and nitrogen oxides, in Turin and Rome, LUR models for metal components of PM10 and PM2.5 were available from the TRANSPHORM project [20]. The elements were selected based upon evidence for health effects (toxicity), a high percentage of detected samples (>75%), and representation of major anthropogenic sources. We selected Cu, Fe, and Zn mainly for (non-tailpipe) traffic emissions; Ni and V for mixed oil burning/industry; S for long-range transport; Si for crustal material; and K for biomass burning. The elements chosen do not necessarily represent single sources. The measurements were taken at the same time as those of particulate matter, and the LUR models were developed centrally. The models performed better for PM10 than for PM2.5 components: the average R² for elements in PM10 and PM2.5 was 0.75 and 0.75, respectively, in Rome, and 0.76 and 0.63, respectively, in Turin [20].
Estimates of exposure to particulate matter constituents in Turin and Rome are included in the full database, but they will not be further illustrated here, as we will focus only on data available in all cities.
We ended up with a pooled cohort including more than 200,000 births, covering five Italian cities and seven years of observation. It is the largest Italian population-based birth cohort, built on individual record linkage between current administrative databases, censuses and health information systems, which has the advantage of being economical and sustainable over time. The very high linkage success rates guarantee that the cohorts include all births in the area of interest and are not distorted by selection mechanisms or social desirability bias, which can affect birth cohorts based on voluntary enrolment and/or face-to-face interviews on socioeconomic characteristics. Looking forward, follow-up of mothers and new-borns in the first years of life will also be possible, through linkage with hospitalizations, pharmaceutical prescriptions and outpatient services. Moreover, this cohort is placed in a setting characterised by higher levels of air pollution compared to North America or other European countries, and it will therefore provide estimates of the impact on pregnancy outcomes of less investigated levels of exposure. In addition, the back-extrapolated models will provide time window-specific estimates to account for the outcome-specific windows of vulnerability to environmental exposure during pregnancy.
We have observed small differences in both the determinants and the outcomes across cities, but they can be appropriately taken into account in the multivariable analysis. More interestingly, we have observed that the association between socioeconomic characteristics and air pollution exposure does not go in the same direction everywhere. This is in line with previous literature, which underlines that in big cities (such as Rome), people with a high SES are often highly exposed to air pollution, since they tend to live in the inner and most polluted areas of the city [21]. By contrast, in our smaller cities (such as Turin and Bologna), in addition to the rich central districts, there are many affluent neighbourhoods in the greenest hilly areas. Furthermore, exposure to vehicular traffic can also vary depending on the presence or absence of pedestrian areas in the centre of the cities included in the study. Therefore, results can be correctly interpreted only after a thorough analysis of the orographic and urban characteristics of each city.
From our initial analysis, however, we can draw a first picture that can explain some of the observed differences. The cities in the North (Turin, Modena, Reggio Emilia and Bologna) are located in the Po Valley, the most polluted area of Italy. Because of its orography, high air pollution concentrations, produced by several industries in addition to heating and vehicular traffic, tend to stagnate in the valley. By contrast, Rome is not an industrial city, is not surrounded by mountains, and has a mild climate, with breezes from the Tyrrhenian Sea. Given these characteristics, air pollution in Rome is mainly due to inner-city vehicular traffic. The differences we found in the distribution of high levels of exposure by SES between the two macro-areas might reflect the main sources of air pollution: in Rome, the source is mainly the same for all pollutants (traffic), while in the northern cities, PM10 can represent heating or industrial emissions, NO2 vehicular traffic, and PM2.5 a mixture of the two. The LUR methodology provides equations that depend on the measured concentrations and on the specific features of each city. Therefore, in each city, the variables that enter the models can differ even in equations for similar pollutants, such as particulate matter (PM10, PM2.5, and PM2.5–10). In Turin, for example, PM2.5 is affected, among other factors, by the traffic load on major roads, producing high concentrations along inner avenues, where more affluent people live, whereas the PM10 equation also includes population density, thus reducing the estimated concentrations in the central area of the city and raising them in the more deprived western area.
The geographical differences between the cities in the Po Valley and Rome suggest the need to perform city-specific analyses and, in a comparative approach, to consider the city as an effect modifier of the association between air pollution exposure and health outcomes. Moreover, the diverse patterns of exposure across the different SES indicators also suggest that socioeconomic position may modify the impact of air pollution on pregnancy outcomes, which must be carefully considered in future analyses. In summary, there are only two cases of consistency across the different SES indicators. First, we observed that almost everywhere the levels of all pollutants were higher among non-Italian women, suggesting a specific area for preventive intervention that might soon be undertaken. Second, maternal occupation showed the most similar pollution values in the two classes considered, and this might indicate the need for a more detailed classification of occupation, which was not possible in this study for the smaller cities.
The differences in exposure by season of conception are larger than the spatial differences within cities (data not shown), thus suggesting that spatial variability of concentrations might be less important than temporal variability in city-specific models. As has been suggested [22], an accurate choice of time windows of exposure is necessary for each single outcome, depending on the specific pollutant and on the hypothesised mechanisms of association. We will therefore perform sensitivity analyses to identify the best exposure assessment for studying the association between air pollution and pregnancy outcomes.
Our study has some limitations. Firstly, the birth cohort represents only three regions in the North and Centre of the country; being based on current information systems, however, the model is easily reproducible and transferable to other areas. Compared to hospital-based birth cohorts, we lack data from medical records and on lifestyle factors potentially related to the outcomes under examination, such as smoking, type of job or body mass index. The socioeconomic characteristics, however, being in turn associated with lifestyle, allow us to analyse the association between pollution and pregnancy outcomes with a partial adjustment also for these risk factors. Moreover, although the linkage success rate and the completeness of the datasets are high overall, there is some variability across cities in the percentages of unlinked records and missing data. This needs to be further investigated in order to understand the mechanisms involved and avoid possible biases [23].
There are also limitations regarding the exposure assessment. The approaches are completely comparable among cities only for NO2, for which we adopted exactly the same kind of model with the same protocol. Different approaches, with different degrees of reliability, have been applied for PM in the cities of Emilia-Romagna, compared to Turin and Rome. Moreover, when using the standard LUR models to provide estimates for each maternal address, we assume that, despite the decreasing levels of air pollution concentrations in recent decades, the spatial contrasts of pollutants do not vary over time within the cities. This was the case for NO2 in Rome [24] and in other settings [25], but we do not have evidence for particulate matter, nor for the cities of the Po Valley. On the other hand, when applying the extrapolation techniques, the high standard deviations of the extrapolated data suggest a high degree of uncertainty in the estimates. This procedure is at present one of the most innovative methodologies, agreed upon at the European level; it is aimed at defining exposure gradients that make data independent of external time trends and comparable over the entire observation period. Overestimation (or underestimation) of the exposure related to the procedure is possible, but we standardized the methodology across cities in order to make this possible bias non-differential, and therefore unlikely to affect the point estimates of the risks.