Which Metrics Matter?
So much is being written about metrics that I’m loath to add yet another post to the pile. But this will be a simple and short story.
When HighWire interviewed Stanford authors who had published in PLOS One, among our questions was “what metrics do you pay attention to?” (PLOS One was an early adopter of copious ‘altmetrics’, hence the question.) We got bi-polar answers:
- “I pay attention to everything.”
- “Only citations matter.”
When an interviewer gets definitive but opposite answers, s/he knows there’s more to the story, much as magnets sometimes attract and sometimes repel (literally ‘bi-polar’). Just as with magnets, the solution to our metrics puzzle was simple, and turned out to depend on the role our interviewee assumed in answering our question:
- Author: “I pay attention to everything.”
- Reader: “Only citations matter.”
That is, as an author, I care about Twitter, Facebook, blogs, popular news, etc. But as a reader looking to select readings from a large search-result list, I pay attention to “only citations”.
All authors told us they view the article-level metrics and altmetrics of their articles, but one added, “It’s totally narcissistic. It’s horrible. I love it.”
To academics, “ALMs” means both “article-level metrics” (things like downloads and citations) and “alternative metrics” (typically having more to do with social media, such as blog posts, tweets, Facebook mentions, etc.). The latter are often seen as “popularity” metrics, particularly consumer popularity, rather than scholarly metrics. But scholarly metrics such as downloads and citations are also about “popularity”. Article-level metrics show attention to the article itself, while altmetrics less frequently drive attention to the article object and more often just register awareness of an article about a particular topic. Awareness is no small thing when there is so much competition for attention.
Readers said that ALMs were only one factor in assessment, and a bit less than half used ALMs to evaluate articles. Most seemed well aware of the limitations of metrics. One was clear that ALMs were a shortcut, or a crutch: “It provides me a really easy dumb way to evaluate the quality of the research without actually evaluating the quality of the research. It’s a trap.” This ‘trap’ extends beyond using the ALMs of a particular paper to evaluate that paper, on to the evaluation of specific articles based on the Journal Impact Factor (JIF), which stands in as a metric for “journal brand” or position of the journal in a hierarchy. Everyone seems to recognize that this is not valid, yet it is widely done:
‘In academia, journal brand is everything. I have sat in many committees, read many CVs, and participated in many discussions where candidates for a postdoctoral position, a fellowship, or other roles at various rungs of the academic career ladder have been compared. And very often, the committee members will say something along the lines of “Well, Candidate X has got much better publications than Candidate Y”…without ever having read the papers of either candidate. The judgment of quality is lazily “outsourced” to the brand-name of the journal.’ (from Addicted to the Brand: The hypocrisy of a publishing academic in The Impact Blog of the London School of Economics)
Among our interviewees there was certainly dissent about the value of article-level and alternative metrics for some purposes:
- “I care more about recent, than popular.”
- “ALMs are useful in megajournals because the criteria for acceptance are so low.”
This appropriate use of ALMs for certain purposes and not for others seems relatively less discussed, compared to the more frequent observation that the use of a JIF as a proxy for the quality of an individual article is a misuse. There seems to be a lot of interest in figuring out what ALMs are related to: there are occasional articles in the literature showing that this or that metric is related to citations or something else (and some of these are likely right!). But as one interviewee above noted, some ALMs (citations, usage, tweets) might just show popularity of various types in some populations. Many have noted that research articles about chocolate seem to spike on Twitter around Valentine’s Day.
From the standpoint of design of the reader’s User Experience, it seems clear that readers would like to know citation counts on pages where they are making decisions to read — e.g., on search result pages, as with Google Scholar — and making decisions to read further — e.g., abstract pages, as with PLOS journals. As one interviewee said, “I find articles in databases. I’ve already decided to read before I could see the ALMs.”
From the standpoint of design of the author’s experience, it seems clear that the tab on many sites that displays a page of copious article-level and alternative metrics should give authors the details they seek. Is this just vanity? Wise advice to all of us who build databases with individuals’ names: “Never bet against vanity.” But I think there are at least two additional things going on with authors’ interest in altmetrics:
- “Catalog”: In many domains, authors have to report on their impacts, and altmetrics is certainly one avenue to assess and report on that.
- “Calibrate”: It is hard to know what to make of some numbers until I know where I fit along a spectrum. Some measures are easy to calibrate (e.g., height and weight), while others really need some reference (e.g., BMI and credit scores). While researchers can calibrate citation counts, tweet counts are another matter.
For this reason it is helpful when metrics services give percent or quartile references for a score; e.g., “this article has 4 tweets, which places it in the top 20% of similar articles.”
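As a rough illustration of that kind of calibration, here is a minimal Python sketch that turns a raw count into a “top N%” statement. The function names and the list of tweet counts are made up for the example, and real metrics services may well define “similar articles” and the percentile itself differently (for instance, by excluding the article being scored from its own reference set).

```python
from bisect import bisect_left


def top_percent(score, reference_scores):
    """Percentage of reference articles whose count is at least `score`."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    ordered = sorted(reference_scores)
    # bisect_left gives the number of reference counts strictly below `score`,
    # so the remainder are the articles at or above it.
    at_or_above = len(ordered) - bisect_left(ordered, score)
    return 100.0 * at_or_above / len(ordered)


def describe(score, reference_scores, unit="tweets"):
    """Phrase the calibrated score the way a metrics page might."""
    pct = top_percent(score, reference_scores)
    return (
        f"this article has {score} {unit}, "
        f"which places it in the top {pct:.0f}% of similar articles"
    )


# Hypothetical tweet counts for ten comparable articles (illustrative data only;
# the article being described is assumed to be one of the ten).
similar_articles = [0, 0, 0, 1, 1, 2, 2, 3, 4, 9]

print(describe(4, similar_articles))
```

The same raw number can land in very different places depending on the reference set chosen, which is exactly why a bare count is hard to interpret on its own.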
So the simple answer to the question “What metrics matter?” is “It depends to whom.” But we also know from work done by others that additional answers come from “It depends for what”: the purposes to which some metrics can be put are subject to argument, interpretation and, we hope, evidence.