The creativity of academic cheaters will amaze you: Mario Biagioli, a UCLA Distinguished Professor of Law and Communication as well as a 2019-20 CASBS fellow, chats with Host John Markoff about the history and recent trends of fraud and gaming in scholarly publishing.
Mario's article in the Los Angeles Review of Books, "Fraud by Numbers: Metrics and the New Academic Misconduct"
Mario’s book “Gaming The Metrics: Misconduct and Manipulation in Academic Research”
2019-20 CASBS fellow Brian Arthur’s paper “All Systems Will Be Gamed: Exploitative Behavior in Economic and Social Systems”
Narrator: From the Center for Advanced Study in the Behavioral Sciences at Stanford University, this is Human Centered. Today on Human Centered, Host John Markoff chats with Mario Biagioli, distinguished professor of law and communication at UCLA. His work focuses on how the increasing reliance on metrics to evaluate scholarly publications has produced new forms of academic misconduct. The two discuss the origin of his studies, the history of scholarly publishing, recent trends in gaming metric-based systems, and whether we can address these issues.
John Markoff: Let me start by asking you what your path was from the study of scientific innovation to the study of scientific fraud.
Mario Biagioli: So now it's a long time ago, in the '90s, I became interested in the problems of scientific authorship. So this is the time when you start seeing articles with hundreds of authors. So a lot of people, both scientists and, you know, university administrators, they really didn't quite know how to attach credit to those articles. You know, I mean, what does it mean to be an author together with 500 other authors, right? And then there were also discussions about, you know, how do you assign responsibility, right? Because if there are 500 authors and it turns out that the piece was fraudulent, there are plenty of authors who are going to say, "Oh, I didn't know, you know, Joe did that," right? So there were, you know, really a lot of discussions about both credit and responsibility in a context of multi-authorship, massive, massive multi-authorship. So, for several years, I was really working on that. And I edited a book with a colleague, with Peter Galison, and then I just, you know, I moved on to something else. And then I got interested in the development of metrics, which really was not on the radar screen in the '90s. And I started seeing that there were all these new problems connected to authorship and fraud that really did not exist in the '90s. So that was the puzzle that kind of drew me in.
John Markoff: It's funny when you talk about intellectual credit and responsibility because when I first arrived at the New York Times in the 1980s, it took an exception from the masthead, from the editors of the paper, to have more than a single byline. And it was specifically for the reasons you enumerate. They wanted to know who was responsible if you had several bylines. And then gradually that broke down, and now we have many bylines. And I never sort of tracked the cultural transition, but it happened while I was at the paper.
Mario Biagioli: Yeah, yeah. I mean, you know, people still complain. So this is at the level of credit, not responsibility, but people complain that the Nobel Prize is completely anachronistic, right? You know, because they have a limit. I don't remember how many, like 4 people, 3 people. And so a lot of people say, you know, that completely misrepresents the nature of science today. But you are right that in the case of the New York Times, probably they were concerned also about responsibility. You know, if you add people, and in most fraud cases that's the standard scenario, right? The senior person says, oh, I didn't know, you know, I'm a busy person, the postdoc did it. So the standard defense is the musical chairs defense, yes.
John Markoff: Before we go farther too, there was another sort of early question that I wanted to ask about your two hats. You have a foot in both law and communications at UCLA, and I was wondering about the relationship between the two departments.
Mario Biagioli: So the reason that now, actually, I am an intellectual property scholar is actually the result of that work that I did in the '90s on scientific authorship. Because at the time I was not in law, but I became interested in the fact that, you know, scientific authorship is really a very, very strange construct. You know, when you start thinking about it, the peculiarities of that construct stand out. So it is really very different from the way that authorship is defined in copyright law, and it's also very different from what inventorship is in patent law. So I felt that I had to basically study up on, you know, intellectual property to understand how strange scientific authorship was in relation to these other notions, legal notions of authorship. So for instance, you know, just an example would be scientific authorship is based on attribution, right? In a sense, you expect, if you're lucky, you get professional credit for your publications. You don't make a career out of collecting royalties on your articles. You collect citations, not royalties, right? So there are a number of, you know, striking differences. So the short story there is that I basically learned intellectual property to understand scientific authorship better. And then I got interested in the fact that scientists, they both publish and, increasingly so, they also patent, right? So you can have, you know, one discovery that ends up described in a text, so it becomes a publication, but also it becomes a patent. So you have what they call, you know, the patent-paper pairs. So anyway, as a result, I became an intellectual property scholar.
John Markoff: And in the field of communications?
Mario Biagioli: In communication, basically, the kind of work I do now, the work that you have read a bit, these things about fraud and metrics, is effectively science communication. So that's why I am in the communication department.
John Markoff: That's very interesting. So your point about scientific authorship and where it differs from other kinds of authorship, you know, is there a trajectory here? And is the, you know, the commercialization of science the driving force that has changed the nature of authorship? Is that what is happening?
Mario Biagioli: So authorship I've worked on because, again, in my really previous life, I was a historian of science working on the 16th and 17th centuries. So I became interested in what authorship meant there, right? Because at the time, you had manuscript publications, you had pseudonymous publications, anonymous publications. You know, it was routine for scientists to publish under other people's names so that they wouldn't get in trouble. So the scene was completely different from what we have today. And then you have the introduction of scientific journals and you begin to see the development of the system, you know, we have today. But the short story is that there is a good continuity in the fact that scientific authorship in the past was really not connected to intellectual property. Let's say, you know, a case that I've written about. So Galileo published books not to make money by selling books, but he had to publish to show his patron that he was worth the money, you know, the substantial chunk of money that the Medici were paying him every year. So you publish as part of your professional requirements. But you don't publish to collect royalties at that time. That has remained throughout history. Today, when we submit things to journals, we actually give away the work to journals. Often we even give them the copyright. We transfer copyright to the journals. That's the interesting continuity: scientific authorship, from the 16th, 17th century to the present, is kind of disconnected from copyright. And there the money has entered problematically, as we see now, because the publishers, like, you know, Elsevier, Springer, and so on, they have returns, annual returns in the 35% range, right? That's why, you know, UC spends over $40 million a year in subscriptions to journals. So publications are big money, but we don't get any. So the money is in the publishers' hands.
So the impact of money is clearly on the patent side. That's not really the same thing as authorship, but there it's obvious that, you know, the context of research has changed immensely in the last 20 years.
John Markoff: And then even more recently, how have the alternative forms of publishing to peer review that have begun to emerge, and publications like PLOS, affected that relationship that you've talked about?
Mario Biagioli: You know, scientists were very naive when they effectively turned their journals over to Elsevier and so on. So I think that somebody really fell asleep at the wheel. And now we have to contend with this big problem. And I think that open access is, you know, the way to go. And I think that, you know, PLOS and similar journals, they have paved the way. So the problem is that this is changing not only the budgeting of libraries, but it's also turning upside down, if you want, the global economy of publishing. Because, say, in the past, you are a scientist in Ghana, you submit a piece to Science, if it gets in, it gets published and you don't pay a penny. Okay, then instead, you will suffer when you and your colleague from Ghana try to read Science, because you'll have to pay the subscription. Okay, and your institution may not have the funds to subscribe. Now instead, with open access, this picture is flipping. You know, the scientist from Ghana has to come up with the processing fee for Science or for PLOS. You know, PLOS is the cheapest, but I think it's still $1,500 per article. So if the author cannot pay the processing fee, they're not going to publish. At the same time, they can read for free, right? So in the old system, you didn't pay for publishing, but you paid for reading. And now instead, we're moving to a situation where you pay for publishing and you don't pay for reading, right? So it's changing the global economy, and we might end up in a situation where now the Global South will be able to read, but they will not be able to publish. So as much as I am a great supporter of open access, you know, the budgeting issue, I mean, who is going to pay for this, is massive.
John Markoff: You know, when I worked as a science journalist at the New York Times, this created a significant problem for me as these alternatives to peer-reviewed publication began to emerge. You know, as a science journalist, you could use peer review as a metric of what was newsworthy, and then in this other situation you could see everything but you had no way of assessing its scientific value, and yet you had a competitive challenge because your journalistic competitors were reading it too. And so, you know, there was pressure to publish, and I don't think that's been sorted out yet.
Mario Biagioli: Yeah, well, so there are really important disciplinary differences. So before we got to PLOS, we had arXiv, you know, so you have these preprint repositories, right, that now everybody talks about because lots of people are publishing, you know, problematic papers about COVID and so on, right? But initially that started among physicists and mathematicians. So the argument there was that if you work on, say, a particle physics experiment with 500 people, okay, if a paper gets posted on arXiv with 500 authors, you can make the argument that it was kind of reviewed in-house because it went through so many revisions within the group that, you know, it would not be silly to say that it was peer-reviewed, right? Instead, when you switch to, you know, some other field where you have, say, 3 scientists who just post something, that's a completely different can of worms. So the reliability of preprints, I think, is specific to fields.
John Markoff: Yeah. In the paper you wrote, Fraud by Numbers: Metrics and the New Academic Misconduct, is that still as yet unpublished? I'm not—
Mario Biagioli: So yes, it is coming out in the Los Angeles Review of Books.
John Markoff: Oh, okay.
Mario Biagioli: So I think it's going to be— I don't know if it's going to be a matter of weeks or a month or two, but it's going to be out in the Los Angeles Review of Books.
John Markoff: Yeah. Well, there was a lot in it. And one of the categories that you brought up that I hadn't heard about was this notion of post-production misconduct. Yep. I wanted to ask, could you sort of generally explain that? And then I wanted to ask you about its significance.
Mario Biagioli: So I coined that term because I was struck by comparing the old traditional fraud with what we see now, right? So the old traditional fraud, or actually the fraud as it is currently defined in the U.S., so the federal definition of misconduct in research is, you know, fabrication, falsification, authorship, right? So it is about making up data, massaging data, or making up authorship. It's clear that these three points, they all have to do with falsehood of some sort, and they concern the claims contained in the publication. Instead, what we see now, we see cases in which the publication is not touched, the publication is not manipulated. What is manipulated is the process through which the publication is published and the process through which the publication is cited, is made visible. So that's why I call it post-production. So first you write, and the research and writing part may be completely kosher, right? And then you start engaging in manipulation. Okay, so that was what I was trying to capture.
John Markoff: And I wonder, to what extent is the shift you're capturing made possible by the emergence of networks and computation as a publishing medium? Is this a new generation of computational fraud? I know it's not exactly the same, but is it made possible by this shift to the digitization of publication?
Mario Biagioli: It definitely is. In a way, you know, really from very banal ways to sophisticated ways, in the sense that most of this stuff would not happen if we still had paper submissions, right? But also, just to give the listeners some example of how this type of fraud is often really close to hacking, right? So an example would be, there was a case in which somebody had managed to hack into a journal database and insert himself as the reviewer of certain manuscripts, and then proceeded to ask the authors to cite his work quite frequently, something like 30 to 40 times per article, in exchange for a positive peer review.
John Markoff: I just, I want to stop you. I want to ask you about that because I read that and I just couldn't— it seems that's like the equivalent of the graffiti artist who signs the graffiti with their name. I mean, it's like, am I missing something?
Mario Biagioli: Yes and no, in the sense that, so, it is amazing what people do, right? I mean, the fact that you hack, you know, but at the same time, it speaks to your question in the sense that this would not have been possible in the past, right? You know, you cannot insert yourself into the review process unless there is a digital platform that you can hack into.
John Markoff: I mean, it's such a brave new world. I also learned from reading your article about this new category called altmetrics. And my reaction to altmetrics was, I understood their rationale, but wouldn't it be easier to game altmetrics than conventional metrics? I mean, it just seems that with the emergence of bots and things like that, that—
Mario Biagioli: Altmetrics was supposed to fill the gap between the moment the article is announced in the media or published and, you know, a few years later when you start having citations. So that was the idea. So I asked the question that you asked me to people from altmetrics. And the response, which by the way is in a book that Alexandra Lippman and I edited, which came out a couple of months ago with MIT Press, Gaming the Metrics. So there is an article there on altmetrics. So they argue that because they focus on more than one indicator, so conventional metrics are just citations, instead they look at Twitter, Reddit, blogs. They claim that it would be too complicated to manipulate so many. Anyway, so that's their argument.
John Markoff: Yeah, I think they underestimate the motivations of the attackers, but I understand their argument. So the other remarkable thing that really jumped out at me is that, you know, your point that a citation, a scientific citation has become a token of economic value in this new world. And it made me think of, I don't know if you know the history of hypertext, but Ted Nelson, who was one of the inventors of hypertext, had this model in his mind of two-way links. You know, the current internet is one-way links, but that was Nelson's original vision of the worldwide, of his World Wide Web, that you could have sort of economic value. And in fact, it's happened in your model of science.
Mario Biagioli: So, but here in the case of science, you know, say in the case of scientific publications, the impact by definition requires a response from the reader. So the thing about the monetary value of citation is fairly recent, right? Because early on, when Eugene Garfield, you know, developed the field of scientometrics, and by the way, the first conference on scientometrics was held at CASBS in 1974, when Robert Merton was a fellow. At that time, you see that Eugene Garfield is trying to use citations to develop maps. So, you know, his point was, how do you map the scientific community? You know, the scientific community, you cannot simply map it by looking at institutions, because you need to look at what people read and what people show to have been important to them in the form of a citation, right? So if you map who cites whom, you see who is related intellectually to whom, right? So citations were meant as a map, but then very quickly they turned into a tool of evaluation. But at the time, money was not part of the picture, it was just really measuring impact, right? And now you start seeing cases, and actually this is quite common, where people get cash bonuses for their publications, and those cash bonuses are indexed on the impact factor of the journal in which those articles are published, which means that effectively those bonuses are indexed on the expected number of citations that the article is going to receive based on the journal in which it has been published. Now, the link between citations and money is obvious. There is a mathematician here at Berkeley, Pachter, I don't remember his first name, who actually has quantified how much a citation is worth. Anyway, it has been a development, so there is nothing inherent in the relationship between citations and money, but it has evolved into a relationship.
John Markoff: So what has this done to the culture of scientific research, and how much is this, to what extent is it mainstream as opposed to outlier behavior in terms of the notion of gaming?
Mario Biagioli: So there are certain forms of gaming that are considered completely legitimate. So for instance, there is an article by Daniele Fanelli, who is now at LSE, and he has argued that the basic form of gaming is multi-authorship. And the argument actually is pretty straightforward, that is, if I publish an article by myself, the visibility of that article is going to depend on the visibility of my name and the visibility of the journal in which it comes out. But, you know, let's say that there are 500 people who know me, okay? So those 500 people may become potential readers of the piece if they see it, okay? Because they know who I am and they might pick it up, okay? But if we have, say, 10 authors, and suppose that those 10 authors are known by fewer people, say 200 people each, still there are 2,000 people who will be able to recognize one of the authors on the byline. So the more authors you add, the more you increase the visibility of the article, which may translate into reading and citations. So the analogy would be this: the byline of the article is a billboard. From there on, you get into, how to put it, more creative forms of gaming, then, you know, obviously trespassing into pretty criminal stuff. Yes.
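Editor's note: the byline arithmetic in the answer above can be sketched in a few lines of Python. This is only an illustration of the figures quoted in the conversation (500 people for a solo author, 200 people for each of 10 co-authors). The function name `potential_readers` is hypothetical, and the sketch assumes that no two authors are known by the same person, so the result is an upper bound on reach, not an estimate of actual readers or citations.

```python
def potential_readers(num_authors: int, known_by_each: int) -> int:
    """Upper bound on people who recognize at least one byline name.

    Assumes the authors' audiences do not overlap (an illustrative
    simplification; real audiences overlap, so true reach is lower).
    """
    return num_authors * known_by_each


# Figures taken from the conversation above.
solo_reach = potential_readers(num_authors=1, known_by_each=500)
team_reach = potential_readers(num_authors=10, known_by_each=200)

print(f"Solo author: {solo_reach} potential readers")
print(f"Ten authors: {team_reach} potential readers")
print(f"Ratio: {team_reach / solo_reach:.1f}x")
```

Under the no-overlap assumption, ten lesser-known authors reach four times as many potential readers as one better-known author, which is the point of the billboard analogy.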
John Markoff: Yeah.
Mario Biagioli: So, but what's important is to see the continuity, you know, that even at the basic level of completely acceptable multi-authorship, there is a direct metrics component behind it. Yeah.
John Markoff: How much of this is an artifact of the decline of what was once referred to as the ivory tower? You know, the academy was once more isolated from the commercial world, and increasingly, particularly at institutions like Stanford, the line is so blurred. The joke about Stanford computer science students is they're just there to do the toe touch on their way to doing their startup. I think that really does describe that part of the engineering culture.
Mario Biagioli: One thing, so this is a somewhat cynical take on things. But just to give you an example that might give us some material to discuss this issue. So recently, a new trend of misconduct has emerged that has to do with the fact that people make up co-authors, which really does not seem to make any economic sense, right? Because why would you want to make up co-authors, since that would kind of dilute your authorship credit, right? So why do you do that? Well, it turns out that typically these co-authors that are made up, they are given fantastic institutional affiliations, you know, Stanford, Caltech, Cambridge, and so on. So obviously you see that this is an issue of branding, but it's pretty interesting that now people are using literally the brand of the university to attract attention to the publication. So in the past, if you were a member of a discipline, you were likely to know a good chunk of the members of the discipline. Now you don't, because the fields are so huge. I mean, we're talking about 3 million articles published every year, right? So this is a completely different game, you know, than, say, 30 years ago. So people don't know people anymore. In the past, say 20 years ago, people would have put their famous PhD advisor on the byline as a way to catch the attention of journals and editors. Now most journal editors are not going to know a lot of people in the field because the field is too big, but they will still be able to notice Caltech, Stanford, Cambridge. And to me, that is related to, if you want, the globalization of the community, and if you want, the end of the little ivory tower.
John Markoff: You know, the emergence of machine learning and the deployment of algorithms widely in society. You know, we're talking about algorithmic legal sentencing now, for example, and that of course brings in the question of bias. But when you look at it from your perspective, now you can turn anything into a metric. They're part of the very fabric of our culture all of a sudden. Have you looked at the emergence of algorithms as a...
Mario Biagioli: Yeah, you know, in law the discussion has been going on for a while, because it started not only with sentencing algorithms, but, you know, various methods for jury selection, right? You know, so there have been a number of things that have been turned algorithmic. You know, the continuities are remarkable, right? Because one of the arguments about metrics of academic evaluation is that they are unbiased, right? Peer review is biased because, you know, the rules are not clear about how the evaluation happens. People know each other, blah, blah. So it's presented as completely unscientific, right? You know, there are many conflicts of interest, blah, blah. Instead, metrics is objective, right? I mean, until you read Gaming the Metrics. And that's the same argument that you see in sentencing algorithms, right? You know, it's done in the name of what Peter Galison and Lorraine Daston called mechanical objectivity, right? It's the triumph of mechanical objectivity, except that the biases are in the code. And those biases often are not easily accessible because they don't let you see the code, right? So it's not like, you know, you don't read the fine print. I mean, they don't even show you the fine print. So, you know, I think they're dangerous. I certainly understand why people have the desire to automate, because I think that probably, you know, computers are better than a lot of unethical judges or biased judges. But the fact is that, you know, often because of intellectual property, we cannot have access to the code. So they turn into black boxes, and that's a huge problem.
John Markoff: You place this within the context of Goodhart's Law, the economist who pointed out that if you highlight a feature of an economy, it will be gamed. Are there solutions to Goodhart's Law?
Mario Biagioli: Nope, nope. I think that Goodhart got it, you know, and also it's interesting that a number of people have come up with the same law at different times, including Robert Merton. In a conversation, in a letter from Robert Merton to Eugene Garfield in the wake of the CASBS conference in 1974, Merton tells Garfield, you know, this work you are doing on citations is fantastic, but you understand that this is going to trigger goal displacement. Meaning that as soon as the scientists understand that citations will count, they will figure out a way to maximize them, right? So no, I don't think so. I do believe that one of the fellows at CASBS this year, Brian Arthur, has a great article titled "All Systems Will Be Gamed," and I agree with him. So actually, I'm really sorry to, in a sense, get more depressing toward the end of our conversation. But, you know, I think the writing is on the wall in the sense that metrics is not going to go away, right? For so many reasons that, you know, at this point we don't have the time to go into, but I take it for granted that, you know, metrics is not going to go away. And that the instances and modalities of misconduct are only going to expand, because the more you expand the metrics and you differentiate them, the more gaming there will be, right? So this is an expanding market. So I have, you know, really nothing cheerful to say about that. However, to give you an example of what I mean by radical change, there is a scholar at Indiana, at the Information School, Johan Bollen. So he has written a little editorial in Nature, maybe a year ago, talking about the insanity of the funding system in the US, right? This is marginally connected to misconduct, but just to give you a sense of the kind of changes we might need to entertain to really change things for the better.
So, you know, there is a big literature about how the grant application system is really too conservative, because in order to get money, you need to put forward projects that are innovative but not too much, otherwise they're going to be too risky and you're not going to get the money. Sometimes you need to write the application almost after you have conducted the work, because you have to show that you really know exactly what you're going to do. So there is a very strong conservative bias built into the system, right? So Bollen basically said, look, you know, forget about all this stuff. He calculated the amount of hours that academics spend either in writing or reviewing grant proposals, which is like insane, right? You know, we're talking about misallocation of resources. So his proposal is each scientist gets a fixed amount, this is like a guaranteed income. Each scientist gets a chunk of money. Okay, everybody gets the same, but each scientist also has to give away, say, 50% of what he or she gets, and has to give it away to other scientists whom he or she thinks are doing interesting work. Okay, so this would basically save all the time that we waste on grants, and it would still create a peer review effect, that is, the peer review of people who actually will have to give away money, and they're not going to be happy to do that, but they will have to give it to people that they think are particularly good, right? So something like this, I have no idea whether this will ever become reality, but you see that this basically just cuts through a huge set of problems. So when we come to misconduct, you know, an example would be that a lot of misconduct has to do with deadlines, right? People, you know, they're coming up for promotion, for review, they don't have enough articles, so they fake it or they buy articles, or, you know, they fake citations to show the dean that they're really good and they deserve the job.
So there are so many things in misconduct that do not just have to do with evaluation but have to do with the timing of the evaluation, with the deadline. So changing deadlines or trying to get rid of deadlines probably would help a lot. So the short story is that unfortunately I don't think metrics is going to go away, but any improvement, I think, needs to be done by really changing things substantially, because otherwise metrics will, I mean, one of the fascinating things about metrics is that if you say, oh, this metric doesn't work, people are going to say, oh, you're right, we'll fix it. So in a sense, the discourse of metrics is amazing at incorporating criticism by producing another metric, right? And then you're going to come up with some problems about the metrics and they will issue a next release, right? So metrics is, if you want, a monster that incorporates criticism very easily.
John Markoff: So, thank you, Mario. This is really good, really fun.
Mario Biagioli: It was really fun to talk to you guys.
Narrator: That was John Markoff in conversation with UCLA Distinguished Professor Mario Biagioli. If you're interested in the history of scientific publishing or the current state of academic misconduct, we've linked to Mario's work in the episode notes. We've also got a link to the paper he mentioned by CASBS fellow Brian Arthur called "All Systems Will Be Gamed." It's a great read too. We've got more great interviews coming to the show and more episodes of the ongoing CASBS webcast series titled Social Science for a World in Crisis. So make sure you're subscribed in your podcast app of choice. You don't want to miss these. And as always, be sure to visit our website, where you can register for upcoming events, subscribe to the fantastic CASBS newsletter, and dig through the rich history of the center. Head over to casbs.stanford.edu, or you can find us on Twitter @casbsstanford. Until next time, from everyone here at CASBS, thanks for listening.