The Intergovernmental Panel on Climate Change (IPCC) has arguably the most wide-ranging process for assessing the current state of climate change knowledge. Because addressing climate change, its impacts on society, and ways to prevent them is a transdisciplinary effort, IPCC chapters should be scientifically precise, yet also understandable to an expert audience from a wide range of fields, to ensure informed decision-making. It is a widespread belief that graphs are both an effective and efficient means of communicating scientific information [1]: graphs appear well-suited to render complex information easier to understand, to say “more than a thousand words”, and to condense information efficiently to save space. Graphs are also prevalent in the IPCC reports, often depicting key points and major results. However, the popularity of graphs in the IPCC reports contrasts with a neglect of empirical tests of their understandability. In fact, it has been argued before that communicating science requires the systematic feedback of empirical evaluation [2]. Here we put the understandability of three graphs taken from the health chapter of the Fifth Assessment Report [3] to an empirical test (one of the authors of the present article, RS, was a lead author in the 11-member author team of the health chapter). Specifically, we evaluate understandability in a sample of attendees of the United Nations Climate Change Conference in Marrakesh, 2016 (COP22), and in a student sample by estimating: (i) Objective understanding: How well do recipients understand the messages conveyed by the graphs? (ii) Subjective understanding: How confident are recipients that they understood the message conveyed by the graphs? (iii) Calibration: How well-aligned is recipients’ subjective confidence that they understood the graph with their actual, objective understanding?
The health chapter summarizes the direct and indirect impacts of climate change on human health and means for adaptation, and seems particularly apt for assessing the understandability of the condensed scientific information conveyed in chapter graphs. This is because scientific evidence on climate change and human health has increased considerably since the last assessment report [4], while at the same time, space constraints for communicating this novel evidence were tight: the whole chapter was allocated 30 pages, including graphs and tables, and 300 references. Chapter authors may therefore feel tempted to compress information into data-rich graphs [5].
Two types of characteristics influence how well graphs are understood: top-down characteristics of the recipient viewing the graph, and bottom-up characteristics of the graph itself [6]. Concerning characteristics of the graphs, graphs that display relatively small numbers of variables and data points tend to be easier to understand, because recipients are more likely to be able to attend to all the information displayed. In more complex graphs, in contrast, recipients typically need to select relevant information from a larger amount of information displayed [7]. The health chapter graphs display a relatively large number of variables and data (Figure 1). We therefore expect the graphs to be hard to comprehend.
Previous research showed that domain-specific knowledge (expertise), the ability to reason with numbers (numeracy), and the ability to understand information presented graphically (graph literacy) are relevant characteristics of the recipient that affect graph comprehension. Specifically, expertise is known to facilitate graph comprehension. Expert meteorologists, for example, tend to cluster features from weather maps on the basis of meaningful causal relationships, whereas novices to meteorology cluster the information on the basis of surface similarity [8]. Given that IPCC graphs are typically information-dense, we expect expertise in the area of climate change to facilitate their understanding. High numeracy is known to be less important for the comprehension of relatively simple graphs [9], but facilitates the comprehension of more complex graphs [10], and should therefore influence comprehension in the present case of relatively complex graphs. Graph literacy in turn was found to influence the comprehension of graphs even among highly numerate and well-educated people [11]. We therefore expect participants’ ability to understand graphically presented information to influence graph comprehension over and above their numeracy skills.
Recipients of IPCC chapters should not only be able to understand the messages conveyed by the graphs. They should also be aware of those aspects they do not understand. Calibration of understanding refers to how well subjective confidence in one’s understanding aligns with actual understanding. Take two people who answer questions on IPCC graphs. Both of them are 80% confident in the accuracy of their answers. However, only one of them, who indeed answered 80% of questions correctly, is well-calibrated. The other one, who answered only 40% of questions correctly, is overconfident. Crucially, research suggests that lacking insight into the accuracy of one’s judgment can impair the quality of subsequent decision-making. Physicians who are overconfident about their diagnosis being correct tend to prematurely narrow down the choice of diagnostic hypotheses, make more diagnostic errors [12], and request fewer additional tests [13]. In the area of political decision-making, it was found that bureaucrats who are overconfident in their expertise tend to choose more risk-taking policies [14].
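The two-person example above can be written out as a small calculation. The following is a minimal sketch, assuming the common definition of signed over-/underconfidence as mean confidence minus proportion correct; the function name `overconfidence` is hypothetical and chosen only for illustration.

```python
def overconfidence(mean_confidence: float, proportion_correct: float) -> float:
    """Signed over-/underconfidence (assumed definition).

    Positive values indicate overconfidence, negative values
    underconfidence, and zero indicates that confidence matches accuracy.
    """
    return mean_confidence - proportion_correct


# Both people state 80% confidence; they differ only in actual accuracy.
well_calibrated = overconfidence(0.80, 0.80)  # answered 80% correctly
overconfident = overconfidence(0.80, 0.40)    # answered 40% correctly

print(round(well_calibrated, 2), round(overconfident, 2))
```

Under this definition, the first person's score is zero and the second person's score is positive, which matches the verbal description of the second person as overconfident.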
Prior research has shown that experts’ calibration, that is, the correspondence of their subjective accuracy with objective accuracy, varies widely. Experts in some areas are very well-calibrated in judgments related to their area of expertise. For example, the subjective and objective accuracy of meteorologists’ predictions are comparatively well-aligned ([15]; see [16] for similar results with bridge players). Other experts tend to be poorly calibrated (for example, physicians: [13]; and lawyers: [17]). What is largely unknown, however, is how well-calibrated expert audiences are in assessing their understanding of scientific climate change information. Here we estimate how well experts’ subjective confidence in their understanding of the graphs aligns with their objective understanding.
Attendees of the United Nations Climate Change Conference in Marrakesh, 2016 (COP22) answered questions on three graphs from the health chapter. This allowed us to gather data on how relevant audiences understand these graphs. The presented results should be considered pilot results, since our sample of COP22 attendees constitutes a relevant, but not representative, sample of the entire audience of the IPCC reports. For comparison, a lay sample without particular interest or expertise in climate change, namely mathematics students at Heidelberg University, answered the same questions. The mathematics students contrasted with the climate experts in that they (i) likely possess less top-down knowledge on climate change and health; but (ii) are likely to be highly numerate and more used to reasoning with complex graphs. This comparison can help estimate the extent to which domain-specific knowledge versus proficiency in reasoning with information-dense graphs and numbers facilitates understanding of the IPCC health chapter graphs. Furthermore, the comparison can help estimate the degree to which experts, compared to novices, have insight into which aspects of the health chapter graphs they do, and which they do not, understand.
3.1. Objective and Subjective Understanding by Graph
Mean accuracy did not differ between the COP22 (M = 0.33, SD = 0.19) and the student sample (M = 0.38, SD = 0.21), t(138) = −1.6, p = 0.12. Descriptive results of subjective and objective understanding per graph are given in Table 2, separately for the COP22 and the student sample. Since questions 4a and 4b test two aspects of one and the same graph feature (the x-axis of Graph 3 displaying health cost effectiveness, Figure 1), we additionally report how often both aspects were answered correctly. Note that for the numerical question 4b, 10,000 is the correct answer, but 1000 was also counted as correct to account for rounding of values on the x-axis.
3.2. Numeracy and Graph Literacy
Mean numeracy was lower for the COP22 (M = 0.48, SD = 0.40) than for the student sample (M = 0.84, SD = 0.29), t(75.6) = −5.15, p < 0.001. Graph literacy, however, did not differ between the COP22 (M = 0.38, SD = 0.37) and the student sample (M = 0.47, SD = 0.30), t(100.3) = −1.31, p = 0.20. Numeracy was positively correlated with mean accuracy in the COP22 sample, r(44) = 0.36, p = 0.016, but unrelated to it in the student sample, r(55) = −0.15, p = 0.29. Graph literacy was unrelated to mean accuracy in both the COP22 sample, r(55) = −0.05, p = 0.70, and the student sample, r(48) = 0.05, p = 0.75.
3.3. Calibration and Over-/Underconfidence
As is typically found, overconfidence was strongly and negatively associated with accuracy for both the COP22, r(57) = −0.66, p < 0.001, and the student sample, r(81) = −0.53, p < 0.001, suggesting that overconfidence in one’s understanding increased as actual understanding decreased. Mean calibration (C-Index) did not differ between the COP22 (M = 0.17, SD = 0.16) and the student sample (M = 0.14, SD = 0.14), t(138) = 1.4, p = 0.16. The signed over-/underconfidence, however, was higher for the COP22 (M = 0.14, SD = 0.29) than for the student sample (M = 0.04, SD = 0.25), t(138) = 1.99, p = 0.048. Moreover, subjective and objective understanding were unrelated in the COP22 sample, r(57) = 0.10, p = 0.45, but positively associated in the student sample, r(82) = 0.29, p = 0.009.
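To make the calibration measure more concrete, the following is a minimal sketch of one widely used form of a calibration index: the weighted mean squared difference between each stated confidence level and the proportion of correct answers given at that level. This formula is an assumption taken from the general calibration literature and is not confirmed as the exact computation behind the C-Index reported here; the function name and the example data are hypothetical.

```python
from collections import defaultdict


def calibration_index(confidences, correct):
    """Weighted mean squared gap between confidence and hit rate.

    confidences: stated confidence per answer, as proportions in [0, 1]
    correct:     1 if the answer was correct, 0 otherwise
    Returns 0.0 for perfect calibration; larger values mean worse calibration.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")

    # Group answer outcomes by the confidence level that was stated for them.
    outcomes_by_level = defaultdict(list)
    for level, outcome in zip(confidences, correct):
        outcomes_by_level[level].append(outcome)

    total = 0.0
    for level, outcomes in outcomes_by_level.items():
        hit_rate = sum(outcomes) / len(outcomes)
        total += len(outcomes) * (level - hit_rate) ** 2
    return total / len(confidences)


# Hypothetical respondent: well calibrated at 50% confidence,
# but only half right when claiming full certainty.
print(calibration_index([0.5, 0.5, 1.0, 1.0], [1, 0, 1, 0]))
```

Unlike the signed over-/underconfidence score, an index of this kind is unsigned: it captures how far confidence deviates from accuracy, but not in which direction.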
Decisions in the context of climate change and health need to be based on the best scientific evidence available in order to be most effective. However, since the human health impacts of climate change generate a large and increasing number of scientific publications per year, it would be close to impossible for decision-makers to keep track of this evidence themselves. The IPCC regularly provides a unified scientific signal to communicate policy-relevant evidence. Here we assessed objective and subjective understanding of health chapter graphs in attendees of the COP22 and a sample of mathematics students. Evidence on how these graphs are interpreted seems necessary, both because previous research found large variation in how decision-makers interpret climate data [23], and because appropriate development of data visualizations is fundamental to guiding adaptation decisions.
With approximately 50% accuracy each, the COP22 sample could best understand Graph 1 (see Figure A1), which depicts health impacts of climate change and was newly developed for the health chapter, and Graph 2 (see Figure A1), which depicts the rather intuitive relationship between temperature and work output. The COP22 sample had the most difficulty understanding Graph 3, which depicts the health cost effectiveness of selected interventions. This graph’s x-axis displaying health cost effectiveness seemed to pose a particular barrier to understanding. Specifically, 22% of the sample were able to pinpoint the most health cost effective intervention, 26% were able to indicate which of two given measures is more health cost effective, and only 15% could read off by how much health cost effectiveness increases from one measure to another.
Results on the low understandability of Graph 3, particularly the scaling of its x-axis, are interesting for two reasons. First, they underline the importance of finding suitable and easy-to-understand means of communicating key findings. It seems hardly satisfying when only one quarter of recipients correctly read off the main message meant to be conveyed by the graph. Second, the consistent pattern of errors that recipients made is revealing. When asked which of two given measures is more health cost effective, approximately three quarters of the sample (incorrectly) picked the one displayed to the right along the x-axis, rather than (correctly) picking the one to the left. This consistent mistake probably reflects recipients’ intuition that quantities increase from left to right, rather than from right to left. Similarly, when asked to indicate the factor by which health cost effectiveness increases or decreases from one of the measures to the other, the COP22 sample in particular (and, to a lesser degree, the student sample) seemed to be confused by the logarithmic scale. The most common answer was “2”, which would align approximately with the distance between those measures if the scale were linear (Figure 1). These typical mistakes suggest that recipients held too strong an expectation of a linear increase from left to right when trying to make sense of the graph. It therefore seems advisable to design graphs in a way that exploits these expectations, rather than to rely on recipients overcoming them.
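The difference between a logarithmic and a linear reading of the same axis can be illustrated with a short calculation. The following is a minimal sketch; the axis positions used are hypothetical, chosen only for illustration, and are not read off Figure 1. On a base-10 logarithmic axis, each tick step multiplies the value by 10, so a horizontal distance of four ticks corresponds to a factor of 10,000, whereas a reader who treats the tick positions as linear values arrives at a far smaller factor.

```python
def factor_on_log_axis(pos_a: float, pos_b: float, base: float = 10.0) -> float:
    """Ratio of the values at two positions on a logarithmic axis.

    Positions are measured in tick units, where one tick is one power of `base`.
    """
    return base ** (pos_b - pos_a)


def factor_on_linear_reading(pos_a: float, pos_b: float) -> float:
    """Ratio obtained when tick positions are (mis)read as linear values."""
    return pos_b / pos_a


# Hypothetical positions of two measures, four ticks apart.
left, right = 1.0, 5.0

print(factor_on_log_axis(left, right))        # correct reading of a log axis
print(factor_on_linear_reading(left, right))  # linear misreading of the same axis
```

The gap between the two results grows exponentially with the distance between the points, which is why an intuitive linear reading can underestimate a difference on a logarithmic axis by several orders of magnitude.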
Although the present study provides potentially useful first results on the understandability of the IPCC health chapter graphs, some limitations of this study must be acknowledged and discussed. First, our sample of COP22 attendees does not constitute a representative sample of either the primary target audience (i.e., governments and policy-makers) or broader audiences (i.e., the scientific community, non-governmental organizations, the business sector, or the wider public). Although all of these groups were represented in the COP22 sample, it remains unclear whether the present results hold for representative samples of each of these audiences. The very common mistake of assuming linearity, however, is arguably unlikely to vary much across audiences, since previous research has shown that linearity constitutes a very fundamental human intuition [24]. Second, we did not track the number of people who were asked during the conference whether they wanted to take part, and therefore cannot give the response rate. Since people were approached by asking whether they wanted to take part in a study on the understandability of IPCC graphs, it seems likely that, through a self-selection effect, our participants were interested in this topic. Future studies should therefore aim for representative samples of relevant audiences to estimate understanding of the graphs. Third, we did not incentivize correct answers, which might have undermined participants’ motivation to engage with the graphs. Although accuracy of understanding might not necessarily be incentivized in real-world situations, future studies should investigate the extent to which graph comprehension changes when accuracy is incentivized. In sum, the present pilot study does not allow conclusions to be drawn on the understandability of IPCC health graphs for a representative sample of the IPCC audience. It does, however, show the need to study the understandability of IPCC graphs in more depth in future studies.
Concerning characteristics of the respondents that influenced graph comprehension, we found a substantial association with numeracy (but not graph literacy) in the COP22 sample, in that more numerate attendees were better able to understand the graphs. This result underlines the rather scientific style of the health chapter graphs, which make ample use of mathematical concepts such as the geometry of circles (Graph 1), non-linear relationships displayed on more than two axes (Graph 2), or logarithmic scaling (Graph 3). It is important to note that the use of complex, numerical elements is not specific to those graphs that were selected for the present research, but appears rather typical of the health chapter. The present results suggest that relying less on strong numerical skills could be an effective means of increasing understanding of the key messages conveyed by the health chapter graphs.
Objective understanding was comparable for the COP22 and the student sample. Interestingly, though, the COP22 sample overestimated their actual understanding more strongly than the students did. That is, COP22 attendees were more confident that they understood the graphs than was justified by their actual understanding. Similarly, subjective confidence in understanding was associated with objective understanding for the student sample, but not for the COP22 sample. These results suggest that COP22 attendees tended to have a feeling of understanding even if they did not in fact understand the graphs, and that they lacked insight into which health chapter graphs they did, and which they did not, understand.