Google’s Magic Carpet: the Value of ‘Gray Literature’
(This is the first in a series of posts exploring findings of HighWire’s researcher interviews. These interviews were conducted in 2012-2014, with over 60 researchers at Stanford and other leading institutions. The researchers were from all branches of scholarship.)
When we interviewed scholars about their “research workflow”, one of the first questions we asked was about discovery: how did they find the materials they needed to read on a research topic? The question was specifically about topical search, not ‘known-item’ search – how did they locate materials on a topic new to them, or check for new materials on a topic well-known to them?
At first we heard the answers we expected: they use Google Scholar, and often Web of Science. Depending on the field, they also mentioned search databases such as PubMed, SSRN or ADS.
The surprise to us was the frequency with which Google Web Search (i.e., google.com, not scholar.google.com) was mentioned. This was a surprise because these were scholars looking for scholarly content. So why were they looking in Google, when almost all the research articles would be found in Scholar?
The answer came from one researcher who told us he was able to use Google “to clean around the edges of the carpet.”
What did that mean, we asked. The answer told us something about the value of information sources that are not articles, and thus are not found in curated, primarily article-based databases like Scholar, Web of Science, PubMed, SSRN and ADS.
The “edges of the carpet” referred to ‘gray literature’ such as presentations at scholarly meetings, course materials/lectures, slide decks, blog posts, journal clubs, videos and podcasts, lab pages, news and press reports, etc. that are authored by researchers or scholarly societies, but are not part of the formal journal literature.
Gray literature wasn’t necessarily the first source a researcher might choose, but researchers were telling us that to be complete they needed to look in Google, to go beyond the formal journal literature held in databases. It was also possible (but we didn’t ask directly) that non-journal literature could sometimes be a better starting point for getting into a new topic.
To explain this to others, I suggested this thought experiment: imagine two researchers alone in a room, one the expert on a topic and the other someone who needs to know about the topic. How would the expert convey the topic to the non-expert? It is unlikely that the expert would hand the non-expert a stack of journal articles and be silent. More likely, the expert would start speaking, draw some pictures (possibly ones that are in the stack of journal articles), invite and answer questions, etc. The exchange would be more like a classroom or scientific meeting than a quiet study hall.
Implications for publishers: the fact that a search includes more than the formal literature signals that researchers want information beyond articles about research advances. At times they are looking for synthesized and summarized information (like review articles, but not in journals), and for material in presentation styles that do not follow the strictures of article style, such as YouTube videos and PowerPoint slides. These ‘gray literature’ materials are often directly related to a specific article, and might be written by the same authors to communicate in venues other than online journals. Publishers should consider how to link these materials to and from the respective articles, so that a reader who finds either the article or the ‘non-article’ is led to the other as well.
This connection role might be filled in part by services that capture scholarly citations (e.g., DOIs, references) in the non-scholarly literature. But publishers might also want to provide an author service that lets the authors of the ‘gray’ works attach them to their scholarly works as a type of author-privileged annotation. We have seen something like this developing in the article-comments tools of newspapers such as the New York Times, where comments from the author of a news piece are badged so that readers can quickly spot “authoritative” commentary.