Tuesday, January 14, 2014

What is Self-plagiarism? (2)

As the discussion continues about what exactly constitutes self-plagiarism, acceptable re-use, and generally 'good publication ethics', it seems a good idea - in the hope of separating opinions from facts - to collect some more authoritative sources on the topic.
  • The Committee on Publication Ethics (COPE) always seems to me to care mostly about combating predatory publishing houses (and they do great work in that arena), but they have also set up some guidelines for authors. On the topic of self-plagiarism they say:

    Multiple publications arising from a single research project should be clearly identified as such and the primary publication should be referenced. Translations and adaptations for different audiences should be clearly identified as such, should acknowledge the original source, and should respect relevant copyright conventions and permission requirements. (art. 4.6)
  • Ron Ritzen pointed me to a 2009 editorial in The Lancet, which gives (in my opinion) a balanced view on self-plagiarism. The central statement is "Deception is the key issue in all forms of self-plagiarism (...)", meaning that whether a similarity constitutes self-plagiarism depends on whether deception is involved.
  • Although individual researchers should not be considered authoritative on the topic of self-plagiarism, Nijkamp himself is probably quickly gaining a vast amount of knowledge on this topic. Since he is also a central element in this discussion, it is interesting to note his defense. As regards the claim of self-plagiarism, the defense is organized around two arguments: first, the constraints of language and topic (see below), and second, the existence of 'halffabrikaten'. ('Halffabrikaten' refers to intermediate products as they are passed from one manufacturer to another, such as components or basic materials such as metal sheets or polymer pellets; is there a good English word for that?) In Nijkamp's defense, the 'halffabrikaten' are a common component of current publication culture, and they will have non-trivial overlap with other 'halffabrikaten' and with the 'final products' as a matter of course.
One question that I have not found a single answer to is the very mechanistic (and mathematical?) question
  • How does one distinguish, in concrete situations, between duplication that is the result of the limits of language and the constraints of the topic, and duplication that indicates a real copying of creative work?

    At one end of the spectrum, any two English articles - any two articles - will share at least 90% of their words. And: This sentence has been written before. I mean it: the sentence "This sentence has been written before" has been written before - Google alone gives five cases. Does my writing it here constitute plagiarism? I can honestly vouch that I didn't copy the sentence - I first wrote it (i.e. created it myself), then copy-pasted it into a Google search box, and indeed five hits came up. There are many obvious situations, like this one, in which multi-word similarities are perfectly compatible with honest creation.

    At the other end, someone who copies a multi-page article lock, stock, and barrel (maybe even the biographical details? :-) obviously cannot claim to have created it themselves.

    (The remark about the bio has a barb to it - Frank van Harmelen pointed out that the Volkskrant list of Nijkamp's 'similarities' includes a biographical description of Nijkamp himself. Is it self-plagiarism to describe your own career twice in the same words? A better proof of the limitations of similarity-detection software is hard to give.)
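
To make the mechanistic side of this question a bit more concrete, here is a rough sketch of one common way to measure textual overlap: compare the sets of word n-grams ('shingles') of two texts at several lengths. Single words are shared by almost any two texts on the same topic, while long runs of identical words are much rarer. Everything in it is an assumption chosen for illustration (the function names, the tokenization, the n-gram sizes, the two sample texts); it is not how the Volkskrant or any particular detection tool works, and a number like this cannot by itself say whether deception was involved.

```python
import re


def shingles(text, n):
    """Return the set of word n-grams in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_profile(text_a, text_b, sizes=(1, 3, 5)):
    """Similarity of two texts at several n-gram lengths (sizes are arbitrary)."""
    return {n: jaccard(shingles(text_a, n), shingles(text_b, n)) for n in sizes}


# Two made-up texts that share one stock sentence and nothing else.
doc_a = "This sentence has been written before, and the rest of this text is my own."
doc_b = "This sentence has been written before, but nothing else here was taken from anyone."

for n, score in overlap_profile(doc_a, doc_b).items():
    print(f"{n}-word overlap: {score:.2f}")
```

The overlap shrinks as the n-grams get longer, which is the pattern one would expect from shared vocabulary and stock phrases. A copied article would instead keep a high score even for long n-grams. Where to draw the line between the two remains a judgment call.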
As always, all comments are welcome.

5 Comments:

Blogger Tamar said...

Hi Mark,

What I miss in the discussion in the newspapers so far is a clear distinction between conference papers and refereed publications. In conference papers everyone knows that the results are not original and have been, or will be, published somewhere else. If you get many invitations for the same topic, I think recycling is unavoidable, although I personally would not write a proceedings paper at all. The newspapers do not mention whether the self-plagiarism was found in proceedings or not. If it is in the refereed literature, I find it objectionable, also because the copied paragraphs come from papers that do not always share the same coauthors.

Jacco

2:37 PM  
Blogger Pasttime (MRJHofstede@gmail.com) said...

The whole Nijkamp self-plagiarism case is an example of initially negligent and afterwards misleading journalism. See my comment (in Dutch) at http://www.voxweb.nl/zelfplagiaat/. Instead of defending themselves, scientists should take offence and demand an explanation from the newspaper involved.

2:34 AM  
Blogger gill1109 said...

What is going on in the Nijkamp case is something more pernicious. It is massive recycling of the same materials in different forms so as to bolster CVs and gain more funds to do more of the same. The same collection of authors have their names on multiple copies of the same work, but often one of the names is missing ... that person is then the editor of the conference proceedings volume, published by Springer and costing 100 Euro to buy. But actually the organisers of the workshop and editors of the conference volume are indeed authors of some of the articles in the volume. Almost the same articles turn up in a special issue of the Roumanian journal of transportation research and the Jagelonian University Journal of Regional University, where one of the authors has a special guest chair. This is not self-plagiarism ... this is an ecosystem.

Of course, the next thing is that one should actually study the content of these papers. There is slow food, fast food .. and junk food. These papers belong at the junk food end of the spectrum. The data is recycled and much of it is clearly just made up.

A result is significant if p < 0.1.

Incredibly complex econometric models with a huge number of parameters are fitted to tiny data sets.

There is an unhealthy synergy with management advice bureaus, management gurus, who have part time jobs at universities and who boost the prestige of their management bureau by their academic credentials.

The NRC did very good and careful work. The Volkskrant jumped on the bandwagon and did some meaningless counting.

12:49 PM  
Blogger gill1109 said...

The Volkskrant indeed did not notice that there is such a thing as a "preprint". If a paper has five authors from five different research institutes then there will be five different "research memoranda". And perhaps one paper in a journal ...

In the old days, these were called "preprints" and they did not appear on your CV. Nowadays they are called "research memoranda" and they do appear on these authors' own lists of publications, e.g. on their personal Google Scholar home pages.

12:53 PM  
