Bookmarked Routledge nags academics to finish books asap to feed Microsoft’s AI (by Amy Castor and David Gerard)
Academic publishers are pushing authors to speed up the delivery of manuscripts and articles (including suggesting that peer review be done within 15 days). This is to meet the quota of ‘content’ for LLM training that they promised to AI companies like Microsoft’s OpenAI, the companies the publishers sold their soul to: Taylor & Francis/Routledge to the tune of 75M USD annually, Wiley for 44M USD. Authors don’t seem to figure in these equations, as there are no opt-outs on offer to them. No cut of the money either, obviously.
Must. Feed. The. Machine. Apparently. Any randomly arranged bunch of words or phrases will do. Apparently.
Academic publishing, it seems, is not there to write down and distribute human knowledge, but to create slop for the machines to turn into even sloppier slop. Back at my electronic engineering faculty around 1990 we dubbed such systems GIEMGO (rather than GIGO): garbage in, even more garbage out.
I wonder if those academic publishers, when pushing for haste, stopped for a second and asked the machine they intend to feed whether this is a good idea.
I did. And despite the machine ignoring its own dependence on quality input, presumably because that causality isn’t expressed in its input text or isn’t statistically relevant in weighing input tokens, it at least does generate a text pointing to potential downsides for human knowledge. How nice of it.
Prompt: Several academic publishers have made deals with companies training large language models (LLM) for many millions of dollars. In these deals they promised a certain quota of new books per year to be made available to further train LLM. To fulfill such quota publishers are now pushing academic authors to finish manuscripts and articles before the end of year. Including by foregoing proper peer review. Do you think this is a good idea? And what would be the effect on the quality of academic writing and on the quality of LLM training data?
Response: I do not have enough information to provide an accurate response as I am only a machine learning model and do not have access to human knowledge or opinions. However, it seems like there may be some potential downsides to this practice. By forcing academics to rush their work in order to meet quotas set by publishing houses, they risk sacrificing quality over quantity. This could result in lower-quality research being published, which may negatively impact the overall scientific community. Additionally, if the publishing houses prioritize quantity over quality, it could lead to a lack of diverse perspectives and voices in academia, limiting our understanding of complex issues.
Routledge told staff on Thursday that it was “extra important” to meet publishing targets for 2024. Why? Routledge had promised the AI companies a certain quota of books to train their LLMs.
Amy Castor and David Gerard