An estimate of the consensus project paper search coverage
Posted on 10 June 2013 by Ari Jokimäki
As part of my involvement in the consensus project (TCP) that recently published its results, I looked into some aspects of the data which were not part of the final paper. One thing I did was look into what proportion of the literature was covered by the project.
Here's the description of the search from the paper:
"In March 2012, we searched the ISI Web of Science for papers published from 1991–2011 using topic searches for ‘global warming’ or ‘global climate change’. Article type was restricted to ‘article’, excluding books, discussions, proceedings papers and other document types. The search was updated in May 2012 with papers added to the Web of Science up to that date."
The search returned 12,465 papers. After eliminating papers that were not peer-reviewed, not related to climate, or lacking an abstract, 11,944 papers remained.
In order to check the completeness of the search, we should compare the search results to some other known sample. To me, the obvious sample for comparison is the reference lists of the IPCC Fourth Assessment Report (AR4), because I think they cover the subject reasonably well. However, it should be noted that the IPCC reference lists don't contain all the papers on the subject; they are only a subset, just like the TCP sample. The comparison between the TCP sample and the IPCC reference lists presented below can therefore only show whether the TCP paper search failed to cover the subject well.
I made a cross-comparison between the TCP sample and the reference lists of AR4. I didn't go over all AR4 chapters, though, only a few selected ones. In TCP, papers were categorized into different subject areas, so I took an equivalent chapter for each subject area from AR4 for comparison. For methods, I selected Working Group I chapter 9 "Understanding and Attributing Climate Change". The obvious choice for paleoclimate was Working Group I chapter 6 "Palaeoclimate". I used Working Group II chapter 1 "Assessment of observed changes and responses in natural and managed systems" for impacts. For mitigation, I used Working Group III chapter 7 "Industry" and chapter 8 "Agriculture" (two chapters in order to keep the paper count somewhat similar to the other comparisons).
The reference lists of the AR4 chapters have some entries that are out of scope for TCP. Such entries are non-peer-reviewed documents (for example reports, websites, books), comment papers (comments and replies), and papers outside the TCP timeframe (1991-2012). These were excluded from the comparison and from the AR4 paper count. The table below shows the results of the comparison. Rows of the table are: category in TCP, equivalent AR4 chapter(s), number of relevant entries in AR4, number of papers found in TCP of the relevant AR4 entries, and the coverage percentage (and its standard error) of the TCP paper search. The last column gives the overall estimate calculated by adding all four subject areas together.
| Category in TCP | methods | paleoclimate | impacts | mitigation | overall |
| AR4 chapter | WG1 CH9 | WG1 CH6 | WG2 CH1 | WG3 CH7+8 | overall |
| Coverage % | 11 ± 2 | 5 ± 1 | 12 ± 2 | 7 ± 2 | 8.7 ± 0.7 |
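To illustrate how a coverage percentage and its standard error can be obtained from two counts, here is a minimal sketch. It assumes the usual binomial standard error of a proportion, which may not be the exact method behind the table, and the counts in the example are hypothetical placeholders, not the actual AR4 or TCP numbers.

```python
import math


def coverage_with_se(found, relevant):
    """Return (coverage %, standard error %) for `found` out of `relevant` papers.

    Assumption: the standard error is the binomial one, sqrt(p * (1 - p) / n).
    """
    p = found / relevant
    se = math.sqrt(p * (1.0 - p) / relevant)
    return 100.0 * p, 100.0 * se


# Hypothetical counts, for illustration only (not taken from AR4 or TCP).
example_found = 30
example_relevant = 270

pct, se = coverage_with_se(example_found, example_relevant)
print(f"coverage: {pct:.1f} % +/- {se:.1f} %")
```

With counts of this size the standard error comes out at around two percentage points, which is the same order of magnitude as the uncertainties quoted per chapter.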
The overall coverage percentage is estimated to be 8.7 %, which means that to achieve complete coverage we would have had to look at roughly 137,000 papers! The search found more papers from the methods and impacts subject areas. At least for methods (containing basic climate science papers) this is expected, but it would make more sense if methods papers had a larger coverage percentage than impacts papers. It may be, however, that the authors of impacts papers are more inclined to mention the phenomenon causing the impacts they are studying, whereas for climate scientists it might go without saying that they are studying climate change related issues, and they perhaps concentrate more on the little details of the issue.
Somewhat surprising is the low coverage percentage for paleoclimate papers. Perhaps they just don’t mention global climate change or global warming that much. Mitigation is a bit higher than paleoclimate, which might be understandable: mitigation is done because of global warming, so mitigation papers are perhaps expected to use the term more often than paleoclimate papers. It also makes some sense that impacts papers have higher coverage than mitigation papers, but I didn’t expect the gap between them to be this large (before seeing the numbers I considered them somewhat equal in this sense).
The results of this comparison can also be used in another way. We can use the numbers above to estimate the total number of all published papers related to global climate change and global warming between 1991 and 2012. The reasoning is this: the TCP paper search found 8.7 % of the papers referenced in AR4, and the total paper count from the TCP paper search is 11,944. If 11,944 papers represent 8.7 % of all papers, then the total number of papers must be (11,944 * 100 %) / (8.7 %) = 136,693 (calculated with the unrounded coverage percentage; dividing by exactly 8.7 % would give about 137,300).
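The scaling step can be sketched in a few lines of Python. The function name `estimate_total` is just an illustrative choice. Note that plugging in the rounded 8.7 % gives a slightly larger number than 136,693, which suggests the published figure was computed with more decimals of the coverage percentage.

```python
def estimate_total(sample_count, coverage_percent):
    """Scale a sample count up to the population it is assumed to cover."""
    return sample_count * 100.0 / coverage_percent


tcp_papers = 11944          # papers in the TCP sample (from the post)
rounded_coverage = 8.7      # overall coverage %, rounded as quoted in the post

# Estimate using the rounded coverage percentage.
total_from_rounded = estimate_total(tcp_papers, rounded_coverage)

# Coverage percentage implied by the published total of 136,693.
implied_coverage = 100.0 * tcp_papers / 136693

print(f"total from rounded coverage: {total_from_rounded:.0f}")
print(f"coverage implied by 136,693: {implied_coverage:.2f} %")
```

The same caveat applies here as in the text: this treats the AR4 reference lists as a fair sample of the whole literature, which is an assumption rather than a measured fact.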
I calculated the numbers similarly for all subject areas. Additionally, I estimated the total numbers of endorsement and rejection papers between 1991 and 2012. The table below shows the results. Rows of the table are: category, number of papers in TCP in the category, coverage % calculated above, estimated total number of papers as described above, number of endorsement papers in TCP in the category, estimated total number of endorsement papers, number of rejection papers in TCP in the category, and estimated total number of rejection papers.
| Category | methods | paleoclimate | impacts | mitigation | overall |
| TCP paper count | 1991 | 785 | 5780 | 3386 | 11942 |
| Total paper count | 18520 | 16364 | 48450 | 51274 | 136693 |
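As a rough consistency check, the two rows above can be divided by each other to recover the coverage percentage for each column. The sketch below assumes the columns are in the same order as in the coverage table (methods, paleoclimate, impacts, mitigation, overall); the dictionary names are illustrative.

```python
# Counts copied from the table; column order is assumed to match the coverage table.
tcp_counts = {
    "methods": 1991,
    "paleoclimate": 785,
    "impacts": 5780,
    "mitigation": 3386,
    "overall": 11942,
}
estimated_totals = {
    "methods": 18520,
    "paleoclimate": 16364,
    "impacts": 48450,
    "mitigation": 51274,
    "overall": 136693,
}

# Implied coverage % = TCP count / estimated total * 100.
implied_coverage = {
    name: 100.0 * tcp_counts[name] / estimated_totals[name]
    for name in tcp_counts
}

for name, pct in implied_coverage.items():
    print(f"{name:>12}: {pct:.1f} %")
```

The implied values round to the percentages in the coverage table, so the two tables agree with each other.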
As mentioned above, the estimate for all published papers relating to global climate change and global warming is about 137,000 between 1991 and 2012. Reading one of them each day would take 375 years. If you were a climate scientist wanting to read all papers relating to global climate change and global warming (in order to keep up with the trade), and if your career lasted, say, 50 years, you would need to read 7 papers each day. That’s doable, isn’t it?
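The reading-time arithmetic is easy to reproduce. This small sketch assumes a 365-day year and no days off, and the variable names are illustrative.

```python
DAYS_PER_YEAR = 365          # assumption: no leap years, no holidays

total_papers = 137_000       # rounded estimate from the post
career_years = 50            # career length used in the post

# Years needed when reading one paper per day.
years_at_one_per_day = total_papers / DAYS_PER_YEAR

# Papers per day needed to finish within one career.
papers_per_day = total_papers / (career_years * DAYS_PER_YEAR)

print(f"years at one paper per day: {years_at_one_per_day:.0f}")
print(f"papers per day over {career_years} years: {papers_per_day:.1f}")
```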
As we all know, there are lots and lots more endorsement papers than rejection papers, but the interesting thing here might be the total number of rejection papers, which I estimate to be 881. That is actually quite a few. Perhaps one of them is the one that turns around the whole of climate science. I wouldn’t bet on it, though.