The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching
- Neal Robert Haddaway,
- Alexandra Mary Collins,
- Deborah Coughlin,
- Stuart Kirk
- Published: September 17, 2015
- https://doi.org/10.1371/journal.pone.0138237
Abstract
Google Scholar (GS), a commonly used web-based academic search engine, catalogues between 2 and 100 million records of both academic and grey literature (articles not formally published by commercial academic publishers). Google Scholar collates results from across the internet and is free to use. As a result it has received considerable attention as a method for searching for literature, particularly in searches for grey literature, as required by systematic reviews. The reliance on GS as a standalone resource has been greatly debated, however, and its efficacy in grey literature searching has not yet been investigated. Using systematic review case studies from environmental science, we investigated the utility of GS in systematic reviews and in searches for grey literature. Our findings show that GS results contain moderate amounts of grey literature, with the majority found on average at page 80. We also found that, when searched for specifically, the majority of literature identified using Web of Science was also found using GS. However, our findings showed moderate/poor overlap in results when similar search strings were used in Web of Science and GS (10–67%), and that GS missed some important literature in five of six case studies. Furthermore, a general GS search failed to find any grey literature from a case study that involved manual searching of organisations' websites. If used in systematic reviews for grey literature, we recommend that searches of article titles focus on the first 200 to 300 results. We conclude that whilst Google Scholar can find much grey literature and specific, known studies, it should not be used alone for systematic review searches. Rather, it forms a powerful addition to other traditional search methods.
In addition, we advocate the use of tools to transparently document and catalogue GS search results to maintain high levels of transparency and the ability to be updated, critical to systematic reviews.
Citation: Haddaway NR, Collins AM, Coughlin D, Kirk S (2015) The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching. PLoS ONE 10(9): e0138237. https://doi.org/10.1371/journal.pone.0138237
Editor: K. Brad Wray, State University of New York, Oswego, United States
Received: June 23, 2015; Accepted: August 26, 2015; Published: September 17, 2015
Copyright: © 2015 Haddaway et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: AMC acknowledges a Policy Placement Fellowship funded by the Natural Environment Research Council, the UK Department for Environment, Food and Rural Affairs and the Environment Agency. Some ideas for this project were prompted by a forthcoming Defra research project (WT1552). NH was hosted at Bangor University (http://www.bangor.ac.uk/).
Competing interests: The authors have declared that no competing interests exist.
Introduction
Searching for information is an integral part of research. Over 11,500 journals are catalogued by Journal Citation Reports (http://thomsonreuters.com/journal-citation-reports/), and the volume of published scientific research is growing at an ever-increasing rate [1,2]. Scientists must sift through this information to find relevant research, and do so today most commonly by using online citation databases (e.g. Web of Science) and search engines (e.g. Google Scholar). But as the number of academic articles and journals is steadily increasing, so too are the number of citation databases.
A citation database is a set of citations that can be searched using an online tool, for example Web of Science (https://webofknowledge.com/). These databases typically charge subscription fees for access to the database that do not encompass the cost of access to the full text of the research articles themselves. Generally these databases selectively catalogue citations according to a predefined list of journals, publishers or subject areas. Several free-to-use services have recently appeared that search for citations on the internet, most notably Google Scholar and Microsoft Academic Search. These search engines do not store citations within a specific database; instead they regularly 'crawl' the internet for information that appears to be a citation. Some key characteristics of databases and search engines are compared in Table 1.
According to Thomson Reuters, the Web of Science Core Collection citation database contains almost 50 million research records (http://wokinfo.com/citationconnection/realfacts/; February 2015), with Microsoft Academic Search reporting to catalogue in excess of 45 million records as of January 2013 (http://academic.research.microsoft.com/About/help.htm#9). Google Scholar does not report the volume of citations identifiable via their search facility, although attempts have been made to estimate this that suggest between 1.8 million [3] and 100 million records [4] are identifiable.
"Grey literature" is the term given to describe documents not published by commercial publishers, and it may form a vital component of evidence reviews such as systematic reviews and systematic maps [5], rapid evidence assessments [6] and synopses [7]. Grey literature includes academic theses, organisation reports, government papers, etc. and may prove highly influential in syntheses, despite not being formally published in the same way as traditional academic literature e.g. [8]. Considerable efforts are typically required within systematic reviews to search for grey literature in an effort to include practitioner-held data and also to account for possible publication bias [5,9]. Publication bias is the tendency for significant, positive research to be more likely to be published than non-significant or negative research, leading to an increased likelihood of overestimating effect sizes in meta-analyses and other syntheses [10]. The inclusion of grey literature is a central tenet of systematic review methodology, which aims to include all available documented evidence and reduce susceptibility to bias.
Academic citation databases are often the first port of call for researchers looking for information. However, access to databases is often expensive; some cost c. £100,000 per annum for organisations of up to 100 employees. Increasingly, researchers are using academic citation search engines to find information (Haddaway, unpublished data). Academic citation search engines appear to represent an attractive alternative to costly citation databases, cataloguing research almost immediately and not restricting results to certain journals, publishers or subject categories. Search engines are particularly attractive to systematic reviewers, since they have the potential to be used to search for grey literature rapidly and using just one search facility rather than a plethora of individual websites [5].
There is on-going debate regarding the utility of Google Scholar as an academic resource e.g. [11,12], but also as a replacement for traditional academic citation databases and in searches for grey literature in systematic reviews [13,14]. Google Scholar represents an attractive resource for researchers, since it is free-to-use, appears to catalogue vast numbers of academic articles, allows citations to be exported individually, and also provides citation tracking (although see criticism of citation tracking by Delgado Lopez-Cozar et al. [15]). Google Scholar is also potentially useful in systematic reviews, since reliance on just one such platform for searches would: i) offer resource efficiency, ii) offer cost efficiency, iii) allow rapid linking to full texts, iv) provide access to a substantial body of grey literature as well as academic literature, and v) be compatible with new methods for downloading citations in bulk that would allow for a very transparent approach to searching [16].
Previous research has shown that articles identified within systematic reviews are identifiable using Google Scholar [13]. However, other authors have suggested that this does not make Google Scholar an appropriate replacement for academic citation databases, as, in practice, there are considerable limitations in the search facility relative to those of academic databases [11], and there is on-going debate about Google Scholar's place in research [12]. Shultz [17] listed many limitations that have been attributed to Google Scholar, including that the service permits use of only basic Boolean operators in search strings, which are limited to 256 characters, and that users cannot sort results (although some of the other cited disadvantages have been corrected in recent updates). Two further limitations to the use of Google Scholar in academic searches are the inability to directly export results in bulk as citations (although a limited number of individual citations can be extracted within a set time period) and the display of just the first 1,000 search records with no details of the means by which they are ordered.
Web-based academic search engines, such as Google Scholar, are often used within secondary syntheses (i.e. literature reviews, meta-analyses and systematic reviews). Systematic reviews typically screen the first 50 to 100 search records within Google Scholar e.g. [18,19,20], sometimes restricting searches to title rather than full-text searches e.g. [21]. Such activities are not themselves evidence-based, however. Little is known about how these results are ordered, or what proportion of search results are traditional academic relative to grey literature. Furthermore, this small degree of screening (50 to 100 records) is a very small proportion of the volume of literature found through other sources (often tens of thousands of records).
Google Scholar has improved greatly in recent iterations; evident from early critiques of the service relative to academic citation databases that cite problems that no longer exist e.g. [22,23]. Whilst the debate on the usefulness of Google Scholar in academic activities has continued in recent years, some improvements to the service offer unequivocal utility; for example, Shariff et al. [24] found that Google Scholar provided access to about three times as many articles free of charge as PubMed (14 and 5%, respectively).
Any recommendations in systematic review guidance that are made regarding the allocation of greater resources to the use of academic search engines, such as Google Scholar, should be based on knowledge that such resources are worthwhile, and that academic search engines provide meaningful sources of evidence and do not represent wasted effort.
Here, we describe a study investigating the use of Google Scholar as a source of research literature to help answer the following questions:
- What proportion of Google Scholar search results is academic literature and what proportion grey literature, and how does this vary between different topics?
- How much overlap is there between the results obtained from Google Scholar and those obtained from Web of Science?
- What proportion of Google Scholar and Web of Science search results are duplicates and what causes this duplication?
- Are articles included in previous environmental systematic reviews identifiable by using Google Scholar alone?
- Is Google Scholar an effective means of finding grey literature relative to that identified from hand searches of organisational websites?
Methods
Seven published systematic reviews were used as case studies [20,25,26,27,28,29,30] (see Table 2). These reviews were chosen as they covered a diverse range of topics in environmental management and conservation, and included interdisciplinary elements relevant to public health, social sciences and molecular biology. The importance and types of grey literature vary between subjects, and a diversity of topics is necessary for any assessment of the utility of a grey literature search tool. The search strings used herein were either taken directly from the string used in Google Scholar in each systematic review's methods or were based on the review's academic search string where Google Scholar was not originally searched. Searches in Google Scholar were performed at both "full text" (i.e. the entire full text of each document was searched for the specified terms) and "title" (i.e. only the title of each document was searched for the specified terms) level using the advanced search facility (see https://scholar.google.se/intl/en/scholar/help.html#searching for further details). Searches included patents and citations. Since Google Scholar displays a maximum of 1,000 search results this was the maximum number of citations that could be extracted using the specially developed method described below.
Table 2. Systematic reviews (SRs) used as case studies and their search strings (along with modifications to WoS search strings necessary to function in the Google Scholar advanced search facility as indicated by strikethrough text).
Searches were performed on 06/02/15. Web of Science includes the following databases as part of the MISTRA EviEM subscription: KCI-Korean Journal Database, SciELO Citation Index and Web of Science Core Collection.
https://doi.org/10.1371/journal.pone.0138237.t002
1. What proportion of Google Scholar search results is grey literature?
A download manager (DownThemAll!; http://www.downthemall.net) and web-scraping program (Import.io; http://www.import.io) were used to download each page of search results (to a maximum of 100 pages; 1,000 results) and then extract citations as patterned data from the locally stored HTML files into a database. Two databases (one for the title only search and one for the full text search) were created for each of the seven systematic reviews, each holding up to 1,000 Google Scholar citations (see S1 File).
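The extract-from-stored-HTML step can be sketched in code. The example below is a minimal illustration, not the authors' actual Import.io configuration: it assumes result titles sit in `<h3 class="gs_rt">` headings (the markup Google Scholar used for result titles at the time of the study) and pulls them from a locally saved results page using only the Python standard library:

```python
from html.parser import HTMLParser

class ScholarTitleParser(HTMLParser):
    """Collect the text of <h3 class="gs_rt"> result headings from a
    locally saved Google Scholar results page (class name assumed)."""
    def __init__(self):
        super().__init__()
        self.titles = []
        self._depth = 0  # >0 while inside a gs_rt heading

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1
        elif tag == "h3" and "gs_rt" in (dict(attrs).get("class") or ""):
            self._depth = 1
            self.titles.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.titles[-1] += data

def extract_titles(page_html):
    """Return the stripped titles found on one stored results page."""
    parser = ScholarTitleParser()
    parser.feed(page_html)
    return [t.strip() for t in parser.titles]
```

Each of the up-to-100 stored pages per search would then be fed through `extract_titles` and the results appended to the per-review database.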
Exported citations were assessed and categorised by NRH and AMC as one of the following types of literature:
- 'Black'–peer-reviewed articles published in academic journals
- 'Book'–monographs or complete books produced by commercial publishers
- 'Book chapter'–chapters within books produced by commercial publishers
- 'Patent'–registered patents and patent applications with the United States Patent and Trademark Office (USPTO)
- 'Thesis'–dissertations from postgraduate degrees (master's and doctorates)
- 'Conference'–presentations, abstracts, posters and proceedings from conferences, workshops, meetings, congresses, symposia and colloquia
- 'Other'–all other literature that may or may not be peer-reviewed, including: reports, working papers, self-published books, etc.
- 'Unclear'–any search record that could not be categorised according to the above classification (ambiguous citations were discussed by the reviewers and classed as 'unclear' if no consensus could be reached due to limited information).
Book chapters are a subcategory of books but have been separated for additional clarity. These categories have been chosen because they reflect the type of information returned by Web of Science ('black' literature) and Google Scholar (all literature). The categories also reflect the emergent classifications that were possible based on information in the citations and any associated descriptions.
For each search type (title or full text) the proportion of literature types across the search results was summarised per page of results to assess the relative location of the types within the results.
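As an illustration of this per-page summary, the proportion of grey literature on each page of results can be computed as below; the grouping of the categories into "grey" is our sketch of the study's approach, and the page size of ten matches Google Scholar's default display:

```python
# Categories treated as grey literature in this sketch; the study's
# own grouping of its eight classes may differ in detail.
GREY = {"Thesis", "Conference", "Other"}
PAGE_SIZE = 10  # Google Scholar shows ten results per page by default

def grey_share_per_page(categories):
    """Given one category label per search record in rank order,
    return the proportion of grey literature on each results page."""
    shares = []
    for start in range(0, len(categories), PAGE_SIZE):
        page = categories[start:start + PAGE_SIZE]
        shares.append(sum(c in GREY for c in page) / len(page))
    return shares
```

Plotting these per-page shares against page number is what reveals where in the ranking grey literature concentrates.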
2. How much overlap is there between Google Scholar and Web of Science?
For each of the seven systematic review case studies title and full text searches were performed in Google Scholar and Web of Science (25/01/2015) and citation records extracted (all records for Web of Science or the first 1,000 for Google Scholar). Full text search results were not extracted for SR4 since over 47,000 records were returned, which was deemed too expansive for this assessment. The search results were then compared using the fuzzy duplicate identification add-in for Excel described below to investigate the degree of overlap between Web of Science and the first 1,000 Google Scholar search results.
3. What proportion of Google Scholar and Web of Science search results are duplicates and what causes this duplication?
Duplicate records are multiple citations that refer to the same article. They are disadvantageous in search results since they do not represent truly unique records and require time and resources for processing. Duplicates also lead to a false interpretation of the size of search results: depending on the level of duplication there may be a significant difference from the true size of search results. The 14 databases from the seven case study systematic reviews described above were screened for Google Scholar duplicates using the Excel Fuzzy Duplicate Finder add-in (https://www.ablebits.com/excel-find-similar/) set to detect up to ten character differences between record titles. Potential duplicates were then manually assessed and reasons for duplication (e.g. spelling mistakes or grammatical differences) were recorded.
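A character-difference criterion like the add-in's can be sketched with a plain Levenshtein edit distance. The ten-edit threshold mirrors the setting above; the case-insensitive comparison is our addition (capitalisation was a known source of duplicates), and the quadratic pairwise loop is only a sketch, not how the commercial add-in works:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,        # deletion
                           cur[j - 1] + 1,     # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def fuzzy_duplicates(titles, max_diff=10):
    """Flag index pairs of titles within max_diff character edits,
    mirroring the add-in's 'up to ten character differences' setting."""
    pairs = []
    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            if levenshtein(titles[i].lower(), titles[j].lower()) <= max_diff:
                pairs.append((i, j))
    return pairs
```

Flagged pairs would then be manually assessed, as in the study, to confirm duplication and record its cause.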
Searches were performed using Web of Science (using Bangor University's subscription, consisting of Biological Abstracts, MEDLINE, SciELO, Web of Science Core Collection and Zoological Record) using the same seven search strings used with the above case studies in Google Scholar for topic words. The first 1,000 search results were extracted and assessed for duplicates on title using the Fuzzy Duplicate Finder as described above. Search results were extracted for records ordered both by relevance and by publication date (newest first), with the exception of SR2, SR5 and SR7, where totals of 230, 1,058 and 1,071 records respectively (all returned) were obtained and extracted in full.
4. Are articles included in previous environmental systematic reviews identifiable using Google Scholar?
In order to examine the coverage of Google Scholar in relation to studies included in environmental management systematic reviews, the lists of included articles following full text assessment were extracted from six reviews (four SRs described in Table 2; SR1, SR4, SR5, SR6 and two additional reviews; [8,31]) and each record's title was searched for using Google Scholar. The option in Google Scholar to include citations was selected. Where titles were not found immediately, quotation marks were used, followed by partial removal of the title where possible typographical errors or punctuation variations might cause a record not to be found. Where records were identified as citations (i.e. Google Scholar found a reference within the reference list of another article) this was also recorded. In addition, references from the final lists of included articles for three systematic reviews (SR1, SR4, SR6) were searched for in Web of Science as described for Google Scholar, above.
5. Is Google Scholar an effective means of finding grey literature identified from hand searches of organisational websites?
For another systematic review search string (SR5, Table 2) the 84 articles that were identified during searches for grey literature in the published review [28] from 16 organisational websites (see S1 Table) were used to test the ability of Google Scholar to find relevant grey literature using a single search string. The 84 articles were checked against the exported search results for both title and full text searches in Google Scholar (see Methods Section 1 above). The 84 articles were then screened in Google Scholar individually to assess whether they were included in the search engine's coverage.
Results
1. What proportion of Google Scholar search results is grey literature?
Between 8 and 39% of full text search results from Google Scholar were classed as grey literature (mean ± SD: 19% ± 11), and between 8 and 64% of title search results (40% ± 17). Fig 1 displays search results by grey literature category, showing a greater percentage of grey literature in title search results (43.0%) than in full text results (18.9%). Conference proceedings, theses and "other" grey literature (i.e. reports and white papers) accounted for the increase in the proportion of grey literature in title searches relative to full text searches. Theses formed a particularly small proportion of the full text search results across all case studies (1.3%), but formed a larger proportion of title search results (6.4%). Similarly, conference proceedings were less common in full text search results (3.2%) than title search results (15.3%). The proportion of patents, book chapters and books was similar in full text and title searches (0.2 and 0.3; 1.7 and 2.5; 4.2 and 2.8% respectively).
When examining the location of literature categories across search results (see S1 Fig) several patterns emerge. "Peak" grey literature content (i.e. the point at which the volume of grey literature per page of search results was at its highest and where the bulk of grey literature is found) occurred on average at page 80 (±15 (SD)) for full text results, whilst it occurred at page 35 (±25 (SD)) for title results. Before these points in the search results grey literature content was low in relative terms. For the majority of the case studies it was not until page 20 to 30 that grey literature formed a majority of each page of search results.
2. How much overlap is there between Google Scholar and Web of Science?
Google Scholar demonstrated small overlap with Web of Science title searches: this overlap ranged from 10 to 67% of the total results in Web of Science (Table 3). The overlap was highly variable between subjects, with reviews on marine protected area efficacy and terrestrial protected area socioeconomic impacts demonstrating the lowest overlap (17.1 and 10.3% respectively). Two case study title searches returned more than the viewable limit of 1,000 search results in Google Scholar (SR1 and SR4) and so only the first 1,000 could be extracted.
Table 3. Overlap between Web of Science (WoS) and Google Scholar (GS) for title searches in Web of Science and the first 1,000 search results from title searches in Google Scholar.
See Table 2 for case study explanations.
https://doi.org/10.1371/journal.pone.0138237.t003
Full text search results from Google Scholar demonstrated low overlap with Web of Science results (Table 4), ranging from 0.2 to 19.8% of the total Web of Science results.
Table 4. Overlap between Web of Science (WoS) and Google Scholar (GS) for topic word searches in Web of Science and the first 1,000 search results from full text searches in Google Scholar.
n/a corresponds to search results that were too voluminous to download in total. See Table 2 for case study explanations.
https://doi.org/10.1371/journal.pone.0138237.t004
3. What proportion of Google Scholar and Web of Science search results are duplicates and how do these duplicates come about?
Duplication rates (i.e. the percentage of total results that are duplicate records) for Google Scholar and Web of Science are shown in Table 5 and range from 0.00 to 2.93%. Rates of duplication are substantially higher within Google Scholar than Web of Science, and rates are far higher in title searches within Google Scholar than full text searches (Table 6), although this is quite variable between the seven case studies (1.0 to 4.8%).
Table 5. Duplication rates (proportion of total results that are duplicates) for Google Scholar and Web of Science for title-level, topic word and full text searches using seven case study systematic review search strings.
Numbers in parentheses correspond to the standard deviations of the individual case study duplication rates. Sample size refers to the number of search records in total, followed by the number of independent search strings (i.e. the number of case studies investigated).
https://doi.org/10.1371/journal.pone.0138237.t005
Table 6. Duplication rates (proportion of total results that are duplicates) in Google Scholar and Web of Science searches across the seven case studies.
Duplication rates are assessed for up to 1,000 search records (or the full number where less than c. 1,300). For Web of Science the full text results were ordered by publication date (newest first) and relevance where more than 1,000 results were returned. Numbers are duplication rate (%) followed by total search records in parentheses.
https://doi.org/10.1371/journal.pone.0138237.t006
Duplicates appear to have arisen for a range of reasons. First, typographical errors introduced by manual transcription were found in both Google Scholar (15% of title records) and Web of Science. For example, the sole example of a duplicate from Web of Science is that of the two records that differ only in the spelling of the word 'Goukamma' (or Goukarmma) in the following title: "A change of the seaward boundary of Goukamma Marine Protected Area could increase conservation and fishery benefits". Differences in formatting and punctuation are a subset of typographical errors and corresponded to 18% of title level duplicates. Second, capitalisation causes duplication in Google Scholar, and was responsible for 36% of title level duplicates. Third, incomplete titles (i.e. some missing words) were responsible for 15% of title level duplicates. Fourth, automated text detection (i.e. when scanning documents digitally) was responsible for 3% of title level duplicates. Fifth, Google Scholar also scans for citations within references of selected included literature, and the presence of both these citations and the original articles themselves was responsible for 13% of title level duplication.
4. Are articles included in previous environmental systematic reviews identifiable using Google Scholar?
Many of the included articles from the six published systematic review case studies were identified when searching for those articles specifically in Google Scholar (Table 7). However, a significant proportion of studies in one review [31] were not found at all using Google Scholar (31.5%). Other reviews were better represented by Google Scholar coverage (94.3 to 100% of studies). Only one review had an included article list that was fully covered by Google Scholar, the review with the smallest evidence base of only 37 studies [31]. For those reviews where studies were not identified by Google Scholar, a further search was performed for these missing studies in Web of Science (Table 7), which demonstrated that some of these studies (6 studies from 2 case study reviews) were catalogued by Web of Science.
Table 7. The ability of Google Scholar to find included articles from six published systematic reviews.
Records identified as citations are found only within reference lists of other articles (their existence is not verified by the presence of a publisher version or full text article, unlike hyperlinked citations).
https://doi.org/10.1371/journal.pone.0138237.t007
Google Scholar search results that were available only as citations (i.e. obtained from the reference lists of other search results) constituted between 0 and 15.2% of identified results. Citations typically do not lead to web pages that provide additional information and cannot therefore be verified manually by users.
When searching specifically for individual articles, Google Scholar catalogued a larger proportion of articles than Web of Science (% of total in Google Scholar / % of total in Web of Science: SR1, 98.3/96.7; SR4, 94.3/83.9; SR6, 99.4/89.7).
5. Is Google Scholar an effective means of finding grey literature identified from hand searches of organisational websites?
None of the 84 grey literature articles identified by SR5 [28] were found within the exported Google Scholar search results (68 total records from title searches and 1,000 of a total 49,700 records from full text searches). However, when searched for specifically, 61 of the 84 articles were identified by Google Scholar.
Discussion
This paper set out to investigate the role of Google Scholar in searches for academic and grey literature in systematic and other literature reviews. There is much interest in Google Scholar due to its free-to-use interface, apparent comprehensiveness e.g. [11,12,13,14], and application within systematic reviews [16]. However, previous studies have disagreed on whether the service could be used as a standalone resource e.g. [11,12]. Our study enables recommendations to be made for the use of Google Scholar in systematic searches for academic and grey literature, particularly in systematic reviews.
1. What proportion of Google Scholar search results is grey literature?
Our results show that Google Scholar is indeed a useful platform for searching for environmental science grey literature that would benefit researchers such as systematic reviewers, agreeing with previous research in medicine [32,33]. Our investigations also demonstrate that more grey literature is returned in title searches than full text searches (43% relative to 19%, respectively), slightly more than previously found in an investigation of full text searching alone in an early version of Google Scholar (13% of total results; [17]). The grey literature returned by Google Scholar may be seen by some as disadvantageous given its perceived lack of verification (through formal academic peer-review), particularly where researchers are looking for purely traditional academic evidence. However, this may be particularly useful for those seeking evidence from across academic and grey literature domains; for example, those wishing to minimise the risk of publication bias (the over-representation of significant research in academic publications [34]).
We found that the greatest volume of grey literature in searches occurs at around page 35 for title searches. This finding indicates that researchers, including systematic reviewers, using Google Scholar as a source of grey literature should revise the current common practice of searching the first 50–100 results (5–10 pages) in favour of a more extensive search that looks further into the records returned. Conversely, those wishing to use title searching for purely academic literature should focus on the first 300 results to reduce the proportion of grey literature in their search results.
The grey literature returned in the seven systematic review case studies examined herein largely consisted of "other" grey literature and conference proceedings; i.e. white papers and organisational reports. Reports and white papers may prove particularly useful for secondary syntheses, since they may often represent resources that are commissioned by policy and practice decision-makers. Conference proceedings typically represent academic works that have not been formally published in commercial academic journals: such articles may also provide useful evidence for reviewers, particularly systematic reviewers. Academic theses were more common in title searches in Google Scholar, whilst books were more common in full text searches. Theses can provide a vital source of grey literature [35], research that never makes it into the public domain through academic publications. It is worth noting that whilst academic peer-review is not a guarantee of rigour, research that has not been through formal academic peer-review should be carefully appraised before being integrated into syntheses such as systematic reviews [5]. Google Scholar may thus prove to be a useful resource in addition to dedicated databases of theses (e.g. DART-Europe; http://www.dart-europe.eu/basic-search.php) and other grey literature repositories (e.g. ProceedingsFirst; https://www.oclc.org/support/services/firstsearch/documentation/dbdetails/details/Proceeding.en.html).
2. How much overlap is there between Google Scholar and Web of Science?
Surprisingly, we found relatively little overlap between Google Scholar and Web of Science (10–67% of WoS results were returned using title searches in Google Scholar). For the largest set of results (SR4) only 17% of WoS records were returned in the viewable results in Google Scholar (restricted to the first 1,000 records). However, the actual number of returned results in Google Scholar was 4,310, with only the first 1,000 being viewable due to the limitations of Google Scholar. Assuming an even distribution of overlapping studies across these results we might expect as much as 73% coverage in full (calculated by applying a consistent rate of 17% from the first 1,000 to the total set of 4,310 search records). The limitations of viewable results in Google Scholar make an assessment of overlap impossible when the number of results is greater than 1,000. The case study SR1 only slightly exceeded the viewable limit of 1,000 studies and identified an overlap of 38%, however.
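The extrapolation above amounts to a one-line calculation; a minimal sketch, using only the SR4 figures reported in this section:

```python
# Extrapolating Web of Science overlap beyond Google Scholar's viewable limit.
# Figures from case study SR4: 17% of WoS records appeared within the first
# 1,000 (viewable) Google Scholar results, out of 4,310 total GS results.
overlap_rate_viewable = 0.17   # share of WoS records found in first 1,000 GS results
viewable = 1_000               # Google Scholar's viewable-results cap
total_results = 4_310          # total GS results reported for SR4

# Assuming overlapping studies are evenly distributed across all results:
estimated_coverage = overlap_rate_viewable * (total_results / viewable)
print(round(estimated_coverage, 2))  # → 0.73
```

The even-distribution assumption is, of course, untestable given the viewing cap, which is precisely the transparency problem the authors identify.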
The relatively low overlap between the two services demonstrates that Google Scholar is not a suitable replacement for traditional academic searches: although its results are greater in number than those in Web of Science, the bulk of Web of Science search results are not returned by Google Scholar. However, Google Scholar is a useful addition to traditional database searching, since a large body of non-overlapping search records was returned for each case study, potentially increasing the coverage of any multi-database search, such as those carried out in systematic reviews.
3. What proportion of Google Scholar and Web of Science search results are duplicates and how do these duplicates come about?
Duplicates within citation databases are disadvantageous because they represent false records. Although the individual reference may be correct, its presence in the database inflates the number of results. Where large numbers of references must be screened manually, as in systematic reviews, duplicates may also represent a waste of resources where they are not automatically detectable. Duplication rates in Web of Science were very low (0–0.05%), but notably higher in Google Scholar (1–5%). Duplication in Google Scholar occurred as a result of differences in formatting, punctuation, capitalisation, incomplete records, and mistakes during automated scanning and population of the search records. The sensitivity of Google Scholar searches comes at a cost, since near-identical records are identified as unique references. This may not be a significant problem for small-scale searches, but a 5% duplication rate represents a substantial waste of resources in a systematic review where tens of thousands of titles must be screened manually.
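The formatting differences described above are exactly what simple normalisation can catch. As an illustration only (the helper names are ours, not part of any tool used in the study), a reviewer might collapse case, punctuation and whitespace before comparing titles:

```python
import re

def normalise_title(title: str) -> str:
    """Reduce a citation title to a canonical form so that records
    differing only in case, punctuation or spacing compare equal."""
    title = title.lower()
    title = re.sub(r"[^a-z0-9\s]", "", title)   # drop punctuation
    return re.sub(r"\s+", " ", title).strip()   # collapse whitespace

def deduplicate(records):
    """Keep the first occurrence of each normalised title."""
    seen, unique = set(), []
    for record in records:
        key = normalise_title(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

records = [
    "Shades of grey: two forms of grey literature",
    "Shades of Grey - Two Forms of Grey Literature",
    "Publication bias in ecology and evolution",
]
print(len(deduplicate(records)))  # → 2
```

Note that this catches only formatting variants; incomplete records and scanning errors, also reported by the authors, would require fuzzier matching.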
4. Are articles included in previous environmental systematic reviews identifiable using Google Scholar?
Gehanno et al. [13] found that Google Scholar was able to identify all 738 articles from across 29 systematic reviews in medicine, and concluded that it could be used as a standalone resource in systematic reviews, stating that "if the authors of the 29 systematic reviews had used only GS, no reference would have been missed". As pointed out by other researchers e.g. [14], this conclusion is incorrect, since the ability to detect specific, known references does not equate to an ability to return these references using a search strategy as might be conducted within a systematic review: most importantly, the relevant articles may be returned outside of the viewable 1,000 records. Giustini and Boulos [14] found that 5% of studies from a systematic review could not be identified using specific searches in Google Scholar, whilst Boeker et al. [11] found that up to 34% of studies from 14 systematic reviews were missed.
Google Scholar was able to find much of the existing literature included within the systematic review case studies in our investigations, and indeed found more than Web of Science in the three case studies examined. As such, Google Scholar provides a powerful tool for identifying articles that are already known to exist (for example, when looking for a citation or access to a full text document). In addition, the search engine was also able to identify large amounts of potentially relevant grey literature. However, some important evidence was not identified at all by Google Scholar (31.5% in one case study), meaning that the review may have come to a very different conclusion if it had relied solely on Google Scholar. Similarly, Web of Science alone is insufficient to identify all relevant literature. As described above, Google Scholar may provide a useful source of evidence in addition to traditional academic databases, but it should not be used as a standalone resource in evidence-gathering exercises such as systematic reviews.
5. Is Google Scholar an effective means of finding grey literature identified from hand searches of organisational websites?
Google Scholar was able to identify a large proportion of the grey literature found in one case study through hand searching of organisational websites (61 of 84 articles). However, 23 articles could not be found using the search engine. Furthermore, the 61 articles found were not returned when using a typical systematic review-style search string. Together, these factors demonstrate that Google Scholar is a useful resource in addition to hand searching of organisational websites, returning a large volume of potentially relevant information, but that it should not be used as a standalone resource for grey literature searching, since some vital information is missed. Hand searching, as recommended by the Collaboration for Environmental Evidence Guidelines for Systematic Reviews [5], is restricted only to those websites included in an a priori protocol. Google Scholar exhaustively searches the internet for studies, however, and whilst it may be more coarse than fine-level hand searching (i.e. missing studies), the addition of a Google Scholar search targeting grey literature would increase comprehensiveness without giving cause for concern in relation to any systematic bias. However, since the algorithms that order search results are not disclosed, a substantial proportion of search results should be examined.
Other Considerations
As mentioned above, only the first 1,000 search results can be viewed in Google Scholar, and the order in which results are returned is not disclosed. Furthermore, the 'advanced' search facility supports only very basic Boolean logic, accepting only one set of 'OR' or 'AND' arguments, not both. In addition, variations in the way that subscript and superscript text, for example with chemical symbols, are displayed and recognised mean that poor matching occurs during searches where these characters form part of article titles. Finally, Google Scholar has a low threshold for repetitive activity that triggers an automated block to a user's IP address (in our experience the export of approximately 180 citations or 180 individual searches). Thankfully this can be readily circumvented with the use of IP-mirroring software such as Hola (https://hola.org/), although care should be taken when systematically accessing Google Scholar to ensure the terms of use are not violated.
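One practical consequence of the single-level Boolean limit is that a nested systematic-review search string, conceptually (peatland OR bog) AND (carbon OR "greenhouse gas"), cannot be submitted directly. A sketch of one possible workaround, running a separate title search per combination of terms, is shown below; the `title_queries` helper is hypothetical, though the `allintitle:` operator itself is standard Google Scholar syntax:

```python
from itertools import product

def title_queries(term_groups):
    """Expand a nested search, conceptually (A OR B) AND (C OR D), into
    separate single-level title queries, one per combination of terms,
    so that each query contains only a single set of AND-ed arguments."""
    return ["allintitle: " + " ".join(combo) for combo in product(*term_groups)]

queries = title_queries([["peatland", "bog"], ["carbon", '"greenhouse gas"']])
for q in queries:
    print(q)
# allintitle: peatland carbon
# allintitle: peatland "greenhouse gas"
# allintitle: bog carbon
# allintitle: bog "greenhouse gas"
```

The number of queries grows multiplicatively with the number of synonym groups, which, combined with the automated IP block described above, constrains how comprehensive such a strategy can be in practice.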
Conclusions
We have provided evidence that Google Scholar is a powerful tool for finding specific literature, but that it cannot be a replacement for traditional academic citation databases, nor can it replace hand searching for grey literature. The limitations on the number of search results displayed, the incomplete Boolean operation of the advanced search facility, and the non-disclosure of the algorithm by which search results are ordered mean that Google Scholar is not a transparent search facility. Moreover, the high proportion of grey literature that is missed by Google Scholar means that it is not a viable alternative to hand searching for grey literature as a stand-alone tool. Despite this, Google Scholar is able to identify a large body of additional grey literature in excess of that found by either traditional academic citation databases or grey literature identification methods. These factors make Google Scholar an attractive supplement to hand searching, further increasing the comprehensiveness of searches for evidence.
We also note that the development of tools to take snapshots of search results from Google Scholar and extract these results as citations can significantly increase the efficiency and transparency of using Google Scholar (i.e. beyond the arbitrary first 50 search results currently favoured in many systematic reviews).
Several recommendations can be made based on our findings for those wishing to use Google Scholar as a resource for research evidence:
- 1. Finding: Google Scholar is capable of identifying the majority of evidence in the systematic review case studies examined when searching specifically for known articles.
- Recommendation: Google Scholar is a powerful, free-to-use tool that can be recommended if looking for specific research studies.
- 2. Finding: Google Scholar is not capable of identifying all relevant evidence identified in the systematic review case studies examined, missing some vital information (as did Web of Science).
- Recommendation: Google Scholar (and Web of Science) should not be used as standalone resources for finding evidence as part of comprehensive searching activities, such as systematic reviews.
- 3. Finding: Substantially more grey literature is found using title searches in Google Scholar than full text searches.
- Recommendation: If looking for grey literature, reviewers should consider using title searches. If looking for academic literature, title searches will yield a great deal of unsuitable information.
- 4. Finding: Title-level searches yield more conference proceedings, theses and 'other' grey literature.
- Recommendation: Title-level searches may be particularly useful in identifying as yet unpublished academic research grey literature as well as organisational reports and government papers [9].
- 5. Finding: The majority of grey literature begins to appear after approximately 20 to 30 pages of results.
- Recommendation: If looking for grey literature, the results should be screened well beyond the 20th page.
In summary, we find Google Scholar to be a useful supplement in searches for evidence, particularly grey literature, so long as its limitations are recognised. We recommend that the arbitrary assessment of the first 50 search results from Google Scholar, frequently undertaken in systematic reviews, should be replaced with the practice of recording snapshots of all viewable search results: i.e. the first 1,000 records. This change in practice could significantly improve both the transparency and coverage of systematic reviews, especially with respect to their grey literature components.
Supporting Information
S1 Fig. Google Scholar search results separated by literature type.
Search results by page for seven case studies (see Table 2 for descriptions), for a) full text and b) title searches. Results displayed are for the total number of extractable records in Google Scholar.
https://doi.org/10.1371/journal.pone.0138237.s001
(XLSX)
Acknowledgments
The authors wish to thank Helen Bayliss and Beth Hall for discussion of the topic. AMC acknowledges a Policy Placement Fellowship funded by the Natural Environment Research Council, the UK Department for Environment, Food and Rural Affairs and the Environment Agency. Some ideas for this project were prompted by a forthcoming Defra research project (WT1552).
Author Contributions
Conceived and designed the experiments: NH. Performed the experiments: NH AC. Analyzed the data: NH. Contributed reagents/materials/analysis tools: NH. Wrote the paper: NH AC DC SK.
References
- 1. Larsen PO, von Ins M. The rate of growth in scientific publication and the decline in coverage provided by Science Citation Index. Scientometrics. 2010;84:575–603. pmid:20700371
- 2. Pautasso M. Publication Growth in Biological Sub-Fields: Patterns, Predictability and Sustainability. Sustainability. 2012;4:3234–3247.
- 3. Noorden RV. Open access: The true cost of science publishing. Nature. 2013;495:426–429. pmid:23538808
- 4. Khabsa M, Giles CL. The number of scholarly documents on the public web. PLOS ONE. 2014;9:e93949. pmid:24817403
- 5. Collaboration for Environmental Evidence (CEE). Guidelines for Systematic Review and Evidence Synthesis in Environmental Management. Version 4.2. 2013. Environmental Evidence: www.environmentalevidence.org/Documents/Guidelines/Guidelines4.2.pdf
- 6. Collins A, Miller J, Coughlin D, Kirk S. The Production of Quick Scoping Reviews and Rapid Evidence Assessments: A How to Guide. Joint Water Evidence Group. 2014: Beta Version 2.
- 7. Conservation Evidence. Synopses Methods. 2015. Available: http://conservationevidence.com/site/page?view=methods. Accessed 2015 February 24.
- 8. Bernes C, Carpenter SR, Gårdmark A, Larsson P, Persson L, Skov C, et al. What is the influence on water quality in temperate eutrophic lakes of a reduction of planktivorous and benthivorous fish? A systematic review. Env Evid. 2015;2:9.
- 9. Haddaway NR, Bayliss HR. Retrieving information for ecological syntheses: a grey area. Biol Cons. 2015.
- 10. Jennions MD, Møller AP. Publication bias in ecology and evolution: an empirical assessment using the 'trim and fill' method. Biol Rev Camb Philos Soc. 2002;77:211–222. pmid:12056747
- 11. Boeker M, Vach W, Motschall E. Google Scholar as replacement for systematic literature searches: good relative recall and precision are not enough. BMC Med Res Methodol. 2013;13:131. pmid:24160679
- 12. De Winter JC, Zadpoor AA, Dodou D. The expansion of Google Scholar versus Web of Science: a longitudinal study. Scientometrics. 2014;98:1547–1565.
- 13. Gehanno JF, Rollin L, Darmoni S. Is the coverage of Google Scholar enough to be used alone for systematic reviews. BMC Med Inform Decis Mak. 2013;13:7. pmid:23302542
- 14. Giustini D, Kamel Boulos MN. Google Scholar is not enough to be used alone for systematic reviews. Online J Public Health Inform. 2013;5:214. pmid:23923099
- 15. Delgado López-Cózar E, Robinson-García N, Torres-Salinas D. The Google Scholar Experiment: how to index false papers and manipulate bibliometric indicators. J Assoc Inf Sci Technol. 2014;65:446–454.
- 16. Haddaway NR. The Use of Web-scraping Software in Searching for Grey Literature. The Grey Journal. In press.
- 17. Shultz M. Comparing test searches in PubMed and Google Scholar. Journal of the Medical Library Association: JMLA. 2007;95:442–445. pmid:17971893
- 18. Reed J, Deakin L, Sunderland T. What are 'Integrated Landscape Approaches' and how effectively have they been implemented in the tropics: a systematic map protocol. 2015;4:2.
- 19. Hughes KM, Kaiser MJ, Jennings S, McConnaughey RA, Pitcher R, Hilborn R, et al. Investigating the effects of mobile bottom fishing on benthic biota: a systematic review protocol. Env Evid. 2014;3:23.
- 20. Roe D, Fancourt M, Sandbrook C, Sibanda M, Giuliani A, Gordon-Maclean A. Which components or attributes of biodiversity influence which dimensions of poverty. Env Evid. 2014;3:3.
- 21. Garcia-Yi J, Lapikanonth T, Vionita H, Vu H, Yang S, Zhong Y, et al. What are the socio-economic impacts of genetically modified crops worldwide? A systematic map protocol. Env Evid. 2014;3:24.
- 22. Jacso P. As we may search: Comparison of major features of the Web of Science, Scopus, and Google Scholar citation-based and citation-enhanced databases. Curr Sci Bangalore. 2005;89:1537.
- 23. Falagas ME, Pitsouni EI, Malietzis GA, Pappas G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. FASEB J. 2008;22:338–342.
- 24. Shariff SZ, Bejaimal SAD, Sontrop JM, Iansavichus AV, Haynes RB, Weir MA, et al. Retrieving clinical evidence: a comparison of PubMed and Google Scholar for quick clinical searches. J Med Internet Res. 2013;15:e164. pmid:23948488
- 25. Haddaway NR, Brunt A, Evans CD, Healey JR, Jones DL, Dalrymple SE, et al. Evaluating effects of land management on greenhouse gas fluxes and carbon balances in boreo-temperate lowland peatland systems. Env Evid. 2014;3:5.
- 26. Savilaakso S, Garcia C, Garcia-Ulloa J, Ghazoul J, Groom M, Guariguata MR, et al. Systematic review of effects on biodiversity from oil palm production. Env Evid. 2014;3:1–21.
- 27. Sciberras M, Jenkins SR, Kaiser MJ, Hawkins SJ, Pullin AS. Evaluating the biological effectiveness of fully and partially protected marine areas. Env Evid. 2013;2:1–31.
- 28. Pullin AS, Bangpan M, Dalrymple S, Dickson K, Haddaway NR, Healey JR, et al. Human well-being impacts of terrestrial protected areas. Env Evid. 2013;2:19.
- 29. Haddaway NR, Styles D, Pullin AS. Evidence on the environmental impacts of farm land abandonment in high altitude/mountain regions: a systematic map. Env Evid. 2014;3:17.
- 30. Whitlock R, Stewart GB, Goodman SJ, Piertney SB, Butlin RK, Pullin AS, et al. A systematic review of phenotypic responses to between-population outbreeding. Env Evid. 2013;2:13.
- 31. Bernes C, Bråthen KA, Forbes BC, Speed JDM, Moen J. What are the impacts of reindeer/caribou (Rangifer tarandus L.) on arctic and alpine vegetation? Env Evid. 2015;4:4.
- 32. Banks MA. The excitement of Google Scholar, the worry of Google Print. Biomed Digit Libr. 2005;2:2. pmid:15784147
- 33. Anders ME, Evans DP. Comparison of PubMed and Google Scholar literature searches. Respir Care. 2010;55:578–83. pmid:20420728
- 34. Gurevitch J, Hedges LV. Statistical issues in ecological meta-analyses. Ecology. 1999;80:1142–1149.
- 35. Haddaway NR, Bayliss HR. Shades of grey: two forms of grey literature important for reviews in conservation. Biol Conserv.
Source: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0138237