Wednesday, 1 April 2009

Open access licence: researcher opinion sought

A learned society has offered the Wellcome Trust an open access, author-pays option for researchers who seek publication in its journal. However, the licence it wishes to attach to these articles is more restrictive than the Trust would normally require when paying an OA fee.

The purpose of this posting is to seek opinion from the research community on whether these restrictions will, in any way, limit a researcher's ability to re-use this content.

Summary of relevant licence conditions

The relevant section of the licence is shown below:

PMC or UKPMC mirror site users may access, download, copy, display and redistribute articles, as well as text and data mine content in articles for non-commercial purposes only, subject to the following conditions:
  • In the case of text-mining, User may incorporate individual words, concepts and quotes up to 100 words per matching sentence, whereas longer paragraphs of text and images cannot be used.

  • Users may not create derivative works (as defined in the U.S. Copyright Act, 17 U.S.C. §101 et seq.) based upon the documents.

The Wellcome Trust is seeking input from the research community to help determine:

A) Whether such a licence would impact on your ability to re-use and re-purpose this open access content.

B) If so, please give some examples of research activities that would be limited by this licence.

If you would like to respond to this issue, please use the comment function below or send an email to r dot kiley at wellcome dot ac dot uk.


  1. This badly dilutes the concept of "open access", and it certainly isn't Open Access.

    The fact that they're offering the deal on these terms, however, does point to the fact that defenders of OA often focus on the "no subscription required to read the paper" angle and not so much on the "reproduce and remix" angle, which is (at least theoretically) a very important and powerful component of OA.

    Let's look at the definition of a derivative work:
    "A work consisting of editorial revisions, annotations, elaborations, or other modifications which, as a whole, represent an original work of authorship, is a “derivative work”."

    Where would a meta-analysis incorporating the results and analysis of a paper fall?

    Also, the conditions don't address data at all. Is data "words"? Would users be limited to republishing 100 data points?

    So does that mean that the data

    Non-use of images seems like a deal-breaker to me. If I publish a paper in an OA format, I want others to be able to put my graphs in their slides, post them on their blogs, etc.

    I certainly wouldn't publish under these conditions.

  2. I guess it is difficult to see what the WT pays for. Either they pay a small fee for some open access and exploitation or the publisher still owns too many rights.

    Almost every sentence is shorter than 100 words => complete sentences can be reused, which is a good thing.

    I think the license is good for indexing of the document (or individual sentences), which is still a big use case. Other information extraction, fact extraction and reuse of facts is not possible.

    Altogether: the fees paid by the WT should be rather in the low range. Apart from this, the license conditions sound reasonable.

    Could be that the publisher does not gain much by being restrictive.


  3. Apologies: the author pays, not the WT. The fees should still be low.

  4. The meaning of the word "incorporate" in the text mining restriction is ambiguous. Does it mean that the text mining database cannot store more than 100 words per sentence, or that the results returned to a single query cannot quote more than 100 words per sentence? The addition of the "whereas" clause is also confusing: does it operate as an additional restriction on the preceding clause (i.e., whole paragraphs cannot be quoted even if they are less than 100 words long)? Or is it just an extraneous clarification? (There is a rule of construction in contract interpretation that courts should not interpret clauses in a way that renders them merely extraneous.)

    Also, if distribution of the whole article is permitted, then why have a limitation on results to a text mining query? There does not seem to be any reason for this other than to limit the utility of text mining software. Could a text mining engine simply display the whole article with hypertext links to the relevant results in it?

    In addition, the restriction against creating derivative works also substantially reduces the value and usefulness of the article, especially in modern, Web-based collaboration. Under 17 USC 101, a derivative work includes "editorial revisions, annotations, elaborations, or other modifications which, as a whole, represent an original work of authorship". The ability to create and share annotations (e.g., tagging) is an important aspect of modern research communities, especially involving collaboration over the Web. This limitation creates an unnecessary barrier to such network-based collaboration and to the development of technologies that can assist in research, such as Semantic Web and other emerging technologies.

  5. The whole point of the OA fee is to provide an alternative source of income for publishers so they don't have to restrict access to articles in order to recover their costs. If an OA fee is paid and the use of the article is still restricted, the OA fee becomes just a subsidy to the publisher and in fact distorts the publishing market. As others have pointed out, indexing and reuse are central to the benefits of open access. The Wellcome Trust should not agree to pay OA fees in any case where the license places restrictions on use.

  6. The only appropriate Open Access license is CC-BY.

  7. The restrictions on text-mining would severely hamper efforts to build recommendation systems or any other kind of tools that would use a full-text index and set of inferencing or other tools to help scholars find content more efficiently. Fully or partially automated meta-analyses, citation analyses, and any other computer-aided text analyses would be illegal.

    What good is being able to read the full article if search engines and other systems designed to help us deal with info-glut can't help us find the full article?

