30 Aug 2010

What is the scientific paper? 2: What's wrong?

This is a guest post by Joe Dunckley
Once again, this is a re-post of something I wrote on my old blog a year ago after the Science Online conference, looking at the future of the scientific paper. As I reminded people at the time, these were just my own half-thought-through ideas, not the policy or manifesto of anyone or anything I'm affiliated with.
So in response to the Science Online conference, we've been thinking about the question, "what is the scientific paper?" I already gave my answer to that a couple of weeks ago, but promised to have a go at answering the more interesting question, "what is wrong with the scientific paper?"
I've been thinking through how to sum up the answer all week, and I'm afraid the simple answer is, "the journal". The journal is what's wrong with the scientific paper. Or rather, the journal is what is holding back the development of efficient modern methods of disseminating science. So I thought I'd spend this second post making some observations on what the scientific journal traditionally is and does; what I think the modern journal shouldn't be doing; and a couple of case studies of alternative technologies that disseminate certain kinds of scientific communications better than a journal ever could.

What is the (traditional) scientific journal?
  • The journal is a collection of scientific papers limited to some kind of theme coherent enough to make it worth reading (or rather, buying).
  • The journal is led by a charismatic editor-in-chief and editorial board who attract people to publish in the journal.
  • The journal is printed on pages. It can do text, still pictures, graphs, and small tables.
  • The journal publishes a sufficiently large number of papers to make it worth printing several issues each year, but a sufficiently small number of papers to make each issue manageable.
  • The purpose of the journal is to be read and cited by other scientists.
  • The purpose of the journal is to be purchased by university libraries.
  • The journal provides a peer-review, copy-editing, marketing and media relations service to its scientists.
  • Publishing in a journal provides a way for scientists to be cited and credited for their work, based on the reputation of that journal.
  • The journal decentralises scientific publishing, allowing individual pockets of innovation within the publishing world, but making change overall very slow.
What should the modern journal (not) be doing?
It is perhaps rather foolish for somebody who works for a publisher of journals -- who works developing technologies for a publisher of journals -- to say that the problem with publishing science is the journal. It would be even more foolish for me to say that publishers perhaps shouldn't be trying to fix the problem with technology. Here are a couple of interesting technological advances that the more forward-thinking journals have come up with lately.
  • At Sci Online, Theo Bloom demonstrated iSee, a structural biology visualisation applet for your "supplementary information". In the same category is J. Cell Biol's DataViewer, which is presented to us as a device for visualising raw microscopy data. Did you know that the results that come out of modern microscopes are not just pretty static pictures, but vast datasets full of hidden information? The JCB DataViewer unlocks that hidden information, by providing it and an interface to it as "supplementary information" with a paper.
  • PLoS Currents: all the constraints and benefits of a traditional journal, but without the peer-review. Solves the problem of delays in publication. Publishes items that look just like the traditional paper.
Should publishers and journals be doing these things? When you look more closely at JCB's DataViewer, you find that, useful though it may be, most of its power and potential is currently wasted. The DataViewer is presented to us as a device for visualising the supplementary information of a paper; in fact, it is a potentially important database of microscopy datasets with a handy graphical interface attached. Restricted to a single journal, the database functionality lies unused.
PLoS Currents? This is supposed to be a solution to the problem of delays in publishing special types of science deemed to be important and timely enough to need rapid communication to peers in the field. What have PLoS done? What makes PLoS Currents unique? How does it speed up intra-field communication of those important results? It drops one single aspect of the paper: peer review. In all other respects, PLoS Currents does all it can to make its papers look like the scientific paper, and its "journal" look like the scientific journal. Scientists are still asked to spend hours writing up these important timely results, with an abstract, introduction, methods, results, conclusions and references, with select figures and graphs and tables. Nobody has the imagination to go beyond the paper-journal-publisher model. We would sooner give up peer review than publish science in anything that doesn't look like papers have looked for a century.
Or how about the Journal of Visualized Experiments? JOVE is, for some inexplicable reason, held up as a brilliant example of innovation in publishing science -- of making the most of the new technology provided by the web. Those who point out that, well, it's not really a "journal", is it?, are chastised for their own lack of imagination. But surely it's those who can't conceive of a publishing format branded as anything other than the "Journal of ..." who are lacking the imagination.
Final example: while thinking about this post, PLoS Computational Biology kindly came up with the absurd idea of being a software repository. NO! Software repositories already make perfectly good software repositories, and there are plenty of them. Trying to turn a journal into a software repository is a suboptimal solution to a problem that disappeared long ago -- long before scientific publishers could have imagined that the problem even existed.
Breaking out of the journal
The web makes all sorts of new methods of publishing, communicating, disseminating science possible. It also comes with all sorts of well developed and widely used solutions to the problems of disseminating science. The big old publishers haven't even realised the web has happened, let alone thought about what to do with it. The hip young publishers know what's possible, and they want to be the ones to realise the possibilities. Good on the hip young publishers. But with each new possibility, scientists should be asking whether publishers, even the hip young ones, are really right for the job. Sometimes they are. Sometimes not.
GenBank, the database of gene sequences and genome projects, had to happen. Journals simply can't publish the raw results from a whole genome sequencing project. (Though I don't suppose they gave up without trying.) And GenBank comes with dozens of benefits that papers, when spread across a decentralised system of journals, just can't have. Yes, I know that databases aren't the optimal solution for every variety of data, but they are suitable -- desirable; even required -- for more of them than you might think. The microscopy data in JCB DataViewer (or the structural data in iSee) would, I suspect, be of much greater value were it branded as a standalone public database with a fancy front-end, than as a fancy visualisation applet for some scattered and hidden supplementary files, restricted to a single journal.
Like it or not, science increasingly depends on data being published in public, machine-readable formats. Those who spend their days looking one-at-a-time at the elements of a single cell signalling pathway in every tumour cell line available to them are wasting our money if they bury their data in a fragmented and closed publication record. Nobody reads those papers, and the individual fragments of data don't tell us anything. Journal publishers think they can ensure that data is correctly published, but so far their only great successes are with the likes of GenBank and MIAME, where journals have ensured that data be deposited in public databases outside of the journal format.
arXiv. Does this need any explanation? What does PLoS Currents offer that isn't already solved better by pre-print servers? Just a brand name that makes it look as though it's a journal. If you require rapid dissemination of important timely results and you want to go to the effort of writing a full traditional scientific paper, put it on a pre-print server while it's going through peer review in a real journal. Don't just abandon peer review while making it look like you've just published a real paper in a real journal.
Better yet, don't write a proper traditional paper. If you need rapid communication of important timely results, why waste time with all of the irrelevant trimmings of a scientific paper? The in-depth background and discussion and that list of a hundred references. Put these critical results on a blog with a few lines of explanation, and later submit the full paper for peer review in a real journal.
Credit where it's due
All the real scientists reading -- the ones looking for jobs and grants and promotion and tenure -- have spotted the one great big flaw in all these suggestions: credit. At least a paper in PLoS Currents can be listed in a CV. Nobody even reads blogs, let alone cites them. How can you get a grant on the back of a blog post? Am I suggesting you should be able to get a grant on the back of a blog post?
Maybe. I don't know. I don't think so. At the moment, publishing papers in journals is pretty much all a researcher can get any credit for. Asking researchers to go beyond the paper-in-journal format is going to create problems of assigning credit, and I don't know exactly what the solution to that problem might be. Simply, I haven't put much effort into considering solutions. I'm a consumer rather than creator of science, so that particular problem doesn't keep me awake at night. But there surely are solutions -- plenty of them.
Fact is, it's quite obvious to anyone in or observing science that the current method of ensuring that scientists are credited for their hard work is really quite broken. Trying to cram every new kind of "stuff" into that broken system is hardly helping.
Business models
Meanwhile, the publishers will be asking how we see the business models for these non-journal based methods of publishing working. Frankly, I'm not really interested. But then, JOVE is hardly the beacon of business success anyway. If publishers want science publishing to be a business, they need to find the new business models that work without strangling science. Otherwise, they're liable to find out that, on the web, some institutions and individual scientists can do a better job of disseminating science than the professionals can, and out of their own pocket.
The paper of the future
I don't necessarily think that anybody should stop writing papers -- perhaps not even the ones that nobody reads. The paper solves several problems better than any other proposed solution. A peer reviewed scientific paper, in a journal if you like, is as good a way as any to provide a permanent record of a unit of science done, and of a research group's interpretation of the significance of that unit of science. And it needn't change all that much. Making them shorter and a lot less waffley would be to my taste -- there's no need to put that much effort into words that won't be read. And give them semantic markup, animations, and comment threads, if you like. But don't pretend that those things are anything more than incremental advances. The real revolutions in the dissemination of science can only occur beyond the shackles of the traditional paper and journal. Every new Journal of Stuff is another step back.
Updates for 2010
Peter Murray-Rust has been saying interesting things about domain-specific data repositories, which I am sure are worth paying more attention to than I have yet had time to.
When I originally posted this, I was challenged for not mentioning the problem of closed-access journals at all; that problem is addressed in the subsequent posts.
