With this issue, we inaugurate a new service within the Journal of Vision: download reports for individual articles. Citation counts have long been an important measure of the interest in, and impact of, an article. Online journals offer an additional valuable metric: how often an article has been downloaded.
Web servers typically retain a record of each transaction they conduct with users on the Internet. These records are called web logs. At the Journal of Vision, we have preserved our web logs since October 23, 2003. The contents of each record vary with server software and configuration but usually include the date and time of the transaction, the file requested, and information about the client making the request. Beyond the web logs themselves, servers and browsers may record and exchange further information about user requests.
This information may be analyzed in various ways to measure usage of a given article. The approach we have taken is to record, for each article, the date and time at which each unique user downloads the article's PDF for the first time. In this way, we exclude multiple downloads by a single user. We call these “unique downloads.” From this record, we can construct what we call the trace of an article: the graph of cumulative unique downloads as a function of days since publication. An example is shown in Figure 1.
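As a concrete illustration, a trace of this kind could be computed from the logs roughly as sketched below. This is a minimal sketch, not the journal's actual pipeline: the `(timestamp, client_id)` log format, the notion of a client identifier, and the function name are all assumptions made for illustration.

```python
def unique_download_trace(pdf_requests, publication_date):
    """Build the cumulative unique-download trace for one article.

    pdf_requests: iterable of (timestamp, client_id) pairs, one per PDF
        request for this article (an assumed, simplified log format).
    publication_date: a datetime.date giving the article's publication date.
    Returns a list of (days_since_publication, cumulative_unique_downloads).
    """
    seen_clients = set()
    first_download_days = []
    for timestamp, client_id in sorted(pdf_requests):  # chronological order
        if client_id not in seen_clients:              # keep first download only
            seen_clients.add(client_id)
            first_download_days.append((timestamp.date() - publication_date).days)

    # Cumulative count of unique downloads as a function of days since publication.
    return [(day, count)
            for count, day in enumerate(sorted(first_download_days), start=1)]
```

Plotting the returned pairs gives the trace described above: days since publication on the horizontal axis and cumulative unique downloads on the vertical axis.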
This trace is itself an informative graphic, and we now provide it for each published article. It can be accessed either from the home page for each article, under “Downloads,” or from the new “Download Reports” page that is reached from the “Information” link on the upper right of each page.
We have observed that the trace is similar in form for all papers. As in the example in Figure 1, the cumulative number of unique downloads grows rapidly, then settles down to a lower and nearly constant rate. These two phases can be characterized by an initial rate and a current rate. The time at which the rate changes is not precisely defined but lies somewhere in the first 3 months, and we define this “corner” to be at 90 days. We define InitialRate as the rate within the initial interval extending from publication up to the corner and CurrentRate as the rate within a final interval of 30 days.
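In code terms, the two rates might be computed from such a trace as follows; this sketch assumes the trace format of the previous listing, with the 90-day corner and 30-day final interval defined above.

```python
CORNER_DAYS = 90          # location of the "corner," as defined in the text
CURRENT_WINDOW_DAYS = 30  # length of the final interval used for CurrentRate

def initial_rate(trace):
    """Unique downloads per day from publication up to the corner."""
    downloads_by_corner = max(
        (count for day, count in trace if day <= CORNER_DAYS), default=0)
    return downloads_by_corner / CORNER_DAYS

def current_rate(trace):
    """Unique downloads per day within the final 30-day interval of the trace."""
    last_day, last_count = trace[-1]
    window_start = last_day - CURRENT_WINDOW_DAYS
    count_at_window_start = max(
        (count for day, count in trace if day <= window_start), default=0)
    return (last_count - count_at_window_start) / CURRENT_WINDOW_DAYS
```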
Another metric of interest for individual papers is total unique downloads, and we provide this measure for each paper. However, because papers continue to be downloaded indefinitely, this measure favors older papers. For this reason, it is useful to construct a measure that reflects the demand for each paper, independent of its age. To do so, we again note that the trace is similar in form for all papers. As can be seen in Figure 1, there is a rapid initial rise, which slows to a steady, nearly linear pace. We capture this second phase by fitting a polynomial of order 2 to the trace from the corner to its end point. An example of this fit is shown by the red curve in Figure 1. From this curve, we estimate the overall rate of downloads per day within the 1,000 days following publication. We call this the DemandFactor. A table of the top 20 papers based on this metric is given in Table 1; it is also provided on the Download Reports page of the journal.
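For readers who wish to reproduce the calculation, one plausible implementation of this fit is sketched below, assuming NumPy and the trace format used earlier; the journal's actual fitting code may differ in its details.

```python
import numpy as np

def demand_factor(trace, corner_days=90, horizon_days=1000):
    """Estimate unique downloads per day over the first horizon_days.

    Fits an order-2 polynomial to the trace from the corner onward, then
    evaluates the fitted cumulative count at the horizon and converts it
    to an overall daily rate.
    """
    days = np.array([d for d, _ in trace if d >= corner_days], dtype=float)
    counts = np.array([c for d, c in trace if d >= corner_days], dtype=float)
    coefficients = np.polyfit(days, counts, deg=2)        # second-order fit
    predicted_total = np.polyval(coefficients, horizon_days)
    return predicted_total / horizon_days                 # downloads per day
```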
We note that DemandFactor cannot be estimated accurately for papers published before the start of our logs, because we have no record of the early phase of their trace. Accordingly, their DemandFactor values are reported with a “>” symbol, because the estimate is a lower bound. Remarkably, even with this handicap, two papers make it into the top five! We do not compute InitialRate for papers less than 90 days old, we do not compute CurrentRate for papers less than 120 days old, and we do not compute DemandFactor for papers less than 365 days old.
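These reporting rules amount to simple age thresholds on each metric. A hypothetical helper (the function and field names are ours, for illustration only) might express them as:

```python
def reportable_metrics(age_in_days, published_after_logs_began):
    """Return which metrics may be reported for a paper of the given age.

    published_after_logs_began: False for papers that predate our web logs;
    their DemandFactor misses the early phase and is reported as a lower
    bound (with a ">" symbol).
    """
    return {
        "InitialRate": age_in_days >= 90,
        "CurrentRate": age_in_days >= 120,
        "DemandFactor": age_in_days >= 365,
        "DemandFactorIsLowerBound": not published_after_logs_began,
    }
```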
To enable viewing of download reports, we have made a few small modifications to the journal's interface. On the abstract page for each article, under the heading “Downloads,” we provide the total downloads and DemandFactor metrics for the article, as well as links to the trace graph and to further documentation. On a new page titled Download Reports, we provide tables of the top 20 papers sorted by DemandFactor or by total downloads, as well as a browsing interface for finding the results for any single paper. This page also provides some additional details on our methods. The Download Reports page is reached from the set of navigation links at the upper right of each page.
Counts of citations of an article by other articles are a traditional method of evaluating its so-called impact (E. Garfield, 1955, 2006). DemandFactor (and other measures of online usage) is a different and, in some respects, complementary measure. First, DemandFactor reflects interest in an article before it has been read, while citation counts reflect the impact of an article after it has been read. Second, because it takes time for articles to be cited, citation counts are not informative until several years after publication. For example, “impact factors” for journals are typically released as much as 3.5 years after the first cited article is published (H.F. Moed, 2006). In contrast, the DemandFactor metric is available in as little as a year, and the InitialRate metric is available in as little as 90 days. Third, articles may be read by other scientists and provide useful background information but may not be sufficiently relevant to lead to a citation. In this regard, it is well known that different disciplines differ widely in their citation practices (H.F. Moed, 2006). Finally, scientific articles would be of little value to society if they led only to the production of other articles. But of course, articles are put to use in other ways, by practitioners, engineers, or the larger public. Download statistics may reflect these wider uses.
What does DemandFactor measure? As with counts of citations, this is not a simple question. We can be reasonably confident that it is not simply a measure of “quality.” Papers may be of high quality but rarely downloaded or read. Likewise, papers may be frequently downloaded for reasons that are orthogonal to quality. For example, we have observed that papers with intriguing graphics, and papers that interest a community outside of vision science specialists, may be downloaded more frequently than average. Furthermore, papers may experience a surge in interest long after the 1,000-day mark that we use to compute DemandFactor. However, this so-called “Mendel effect” has been shown to be a very rare event (H.F. Moed, 2006). Despite these caveats and shortcomings, we believe that, like citation counts, download counts are an additional, useful source of information about articles. In our new, networked age, we are learning that links among pieces of information, and the number of links leading to one item, are valuable tools for searching and sorting among a daunting universe of possibilities. Download counts are another example of such a tool.
We should caution the reader that the specific statistics we have chosen to compute are not the only ones we could have chosen. Indeed, we may, in the light of experience, choose different statistics. Further, our statistics suffer from shortcomings as noted above. Nevertheless, we feel that they are a useful starting point in a new era of dynamic meta-information about articles.
So far as we are aware, the Journal of Vision is the first publication to make current usage statistics available for individual articles. We believe that this is useful information, for both authors and readers. The Journal of Vision will continue to innovate and evolve. This is one more small step in that process.