Author Archives: Gabriel M Feldstein

Making the Most of Open Data

As part of Open Access Week, we want to recognize the critical role of open data in research and innovation. While the importance of sharing data is well acknowledged, reusing open data and incorporating it into your research can be equally critical.

Why reuse data?

1. Efficiency and Time-Saving – Open data can accelerate the research process.

2. Replicability and Validation – Reusing open data enables the replication of studies and the validation of research findings, which helps foster transparency and trust in scholarly work.

3. Ethics and Reducing Oversampling Burden – Certain populations, particularly marginalized and vulnerable populations, are sometimes oversampled in research, and reusing open data allows us to reduce that burden.

4. Cross-Disciplinary Insights – Researchers can blend data from different disciplines, encouraging cross-disciplinary collaboration.

How to approach data reuse?

1. Assess Quality & Compatibility – Check for completeness, biases, and representation. Consider whether the scope, variables, and collection methods align with your research objectives (e.g. if you were going to collect your own data, would you do it in the same way?).

2. Review Documentation – Part of quality assessment is reviewing metadata and understanding the collection methods and any cleaning processes that the data underwent. You also need to examine licensing and any ethical concerns around privacy and consent.

3. Data Integration – Open data might not always be the best fit, but it can be thought of as supplementary or contextual data that will make your primary research data more robust.
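As an illustration of the completeness check in step 1, here is a minimal sketch in Python. The sample data, the column names, and the list of values treated as "missing" are all hypothetical; a real dataset's documentation should tell you how missing values are actually coded.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded open dataset.
SAMPLE_CSV = """respondent_id,age,region,income
1,34,Northeast,52000
2,,South,
3,45,,61000
4,29,West,NA
"""

# Assumed markers for missing values; check the dataset's own codebook.
MISSING_MARKERS = {"", "NA", "N/A", "null"}


def load_rows(text):
    """Parse CSV text into a list of dicts plus the list of column names."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return rows, reader.fieldnames or []


def missing_share(rows, columns):
    """Return the fraction of rows with a missing value, per column."""
    total = len(rows)
    shares = {}
    for column in columns:
        missing = sum(
            1 for row in rows
            if (row.get(column) or "").strip() in MISSING_MARKERS
        )
        shares[column] = missing / total if total else 0.0
    return shares


rows, columns = load_rows(SAMPLE_CSV)
report = missing_share(rows, columns)
for column, share in report.items():
    print(f"{column}: {share:.0%} missing")
```

A report like this only covers completeness. Bias, representation, licensing, and consent still need to be judged by reading the documentation.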

Journal Policies and Artificial Intelligence

As generative artificial intelligence continues to develop, more and more academic publishers are grappling with the reality of easily accessible AI tools such as ChatGPT. Rather than strictly prohibiting the use of AI, a more realistic approach may be to acknowledge the potential value of generative AI for students who are new to a subject while being clear about its problems. Not all of the information these tools produce is accurate; in particular, many generative AI products have been known to generate fake or incorrect citations when prompted.

Because of the popularity of some of the new AI tools, and with the understanding that access to them is hard to limit or manage, many publishers are developing explicit policies on the use of AI in scholarship. See below for some examples of the types of language found across the publishing spectrum. If you are the editor of a journal here at BC and are curious about best practices for forming this type of policy, be sure to reach out to the Scholarly Communications team at the Boston College Libraries.

Nature

Nature is a renowned international science and technology journal. Its policy clearly states the responsibility that authors bear for content they produce using large language models (LLMs), including generative artificial intelligence tools such as ChatGPT. The following two paragraphs are taken from an editorial published in January 2023, when Nature added this language to its author guidelines.

“First, no LLM tool will be accepted as a credited author on a research paper. That is because any attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility.

Second, researchers using LLM tools should document this use in the methods or acknowledgements sections. If a paper does not include these sections, the introduction or another appropriate section can be used to document the use of the LLM.”

JAMA Network

The Journal of the American Medical Association published an editorial in January of 2023 speaking to the use of non-human authors:

Author Responsibilities

Nonhuman artificial intelligence, language models, machine learning, or similar technologies do not qualify for authorship.

If these models or tools are used to create content or assist with writing or manuscript preparation, authors must take responsibility for the integrity of the content generated by these tools. Authors should report the use of artificial intelligence, language models, machine learning, or similar technologies to create content or assist with writing or editing of manuscripts in the Acknowledgment section or the Methods section if this is part of formal research design or methods.

This should include a description of the content that was created or edited and the name of the language model or tool, version and extension numbers, and manufacturer. (Note: this does not include basic tools for checking grammar, spelling, references, etc.)

Reproduced and Re-created Material

The submission and publication of content created by artificial intelligence, language models, machine learning, or similar technologies is discouraged, unless part of formal research design or methods, and is not permitted without clear description of the content that was created and the name of the model or tool, version and extension numbers, and manufacturer. Authors must take responsibility for the integrity of the content generated by these models and tools.

Image Integrity

The submission and publication of images created by artificial intelligence, machine learning tools, or similar technologies is discouraged, unless part of formal research design or methods, and is not permitted without clear description of the content that was created and the name of the model or tool, version and extension numbers, and manufacturer. Authors must take responsibility for the integrity of the content generated by these models and tools.

WAME

Later in 2023, the World Association of Medical Editors also established a set of recommendations on using generative AI, specifically chatbots such as ChatGPT.

Chatbots are activated by a plain-language instruction, or “prompt,” provided by the user. They generate responses using statistical and probability-based language models. (5) This output has some characteristic properties. It is usually linguistically accurate and fluent but, to date, it is often compromised in various ways. For example, chatbot output currently carries the risk of including biases, distortions, irrelevancies, misrepresentations, and plagiarism – many of which are caused by the algorithms governing its generation and heavily dependent on the contents of the materials used in its training. Consequently, there are concerns about the effects of chatbots on knowledge creation and dissemination – including their potential to spread and amplify mis- and disinformation (6) – and their broader impact on jobs and the economy, as well as the health of individuals and populations. New legal issues have also arisen in connection with chatbots and generative AI. (7)

Published in May of 2023, this statement seeks to clarify the recommendations made in a similar January announcement. While publishers and editors are still figuring out how AI fits into the larger picture of scholarly publishing, remaining vigilant about changes in technology will be key for journal managers and editors.

What does the future hold?

Ultimately, whenever technology improves, whether through the ability to copy printed pages or to access content digitally, there is a period of fear and excitement as existing models for publication and accessibility are threatened by a new one. Whatever the pitfalls or successes of the new generative AI technology, it will be important for journal editors and for scholars submitting their work to understand the policies around generative AI and whether its use is accepted by a given publication.

OASPA Conference

The Open Access Scholarly Publishing Association (OASPA) recently held a conference inviting Scholarly Communications librarians from all over the world to come together virtually and discuss current trends and new ideas in Open Access.

Article Processing Charges (APCs)

While the movement for broader Open Access publishing continues to grow, and knowledge of and interest in it continue to increase among faculty and scholars in all disciplines, not all Open Access is equally free. Transformative journals may provide some of their content open access but still require a subscription for full access. Additionally, the increasingly popular “Read and Publish” agreements, while generally beneficial for universities and libraries because they provide a wide breadth of access for scholars at a given institution or set of institutions, do have some drawbacks for the scholarly ecosystem as a whole.

A good number of the poster sessions therefore focused on models of Open Access publication that do not require authors to pay Article Processing Charges, known generally as Diamond Open Access. To be considered Diamond Open Access, a publication cannot impose embargoes, and the cost of both submitting and accessing content must be zero. These models are generally funded by universities, research institutions, and libraries themselves, as more and more institutions of higher education recognize that investing in consortial open access projects helps budgets in the long run: supporting free access to and publication of research in turn lowers, or outright eliminates, subscription and APC fees.

Open Access Community Investment Program

“The LYRASIS Open Access Community Investment Program (OACIP) provides a community-driven framework that enables multiple stakeholders – including academic and public libraries, academic departments, institutions, museums, and funding agencies – to evaluate and collectively fund Diamond Open Access (OA) journals.” – taken from the OACIP website

Lyrasis presented a relatively new program at OASPA: OACIP. Projects like this can provide a framework and model for Diamond OA so that institutions do not need to reinvent the wheel when developing their own equitable publication processes. Additionally, having a large consortium of libraries participate in and contribute to the discourse, credibility, and viability of Diamond OA projects makes them all the more tenable in the eyes of researchers and faculty members trying to balance the pressures of publishing equitably.

Scottish University Open Access Press

Another example of a consortial approach to Diamond OA is the Scottish University Open Access Press. The goal of this project is to create a press managed by and for member university libraries. The initial investment and staff time may require higher education professionals to work on projects that do not benefit their university’s semester-to-semester collections in the short term. Even so, the growth of projects like this promises to significantly reduce dependence on high-cost subscriptions, as an Open Access press would ensure much more affordable access to materials that those same universities are already generating.

Consecuencias Accepted into the Modern Language Association’s (MLA) International Bibliography

For the past four years, Consecuencias has been publishing issues and articles covering different aspects of Spain’s cultural production via Boston College Libraries’ Open Access Journal portfolio. Articles in English and Spanish come together to create an interdisciplinary study of Spanish cultural artifacts, historical movements, and thought leaders. In the past year, since the beginning of the 2022 fall academic term, the relatively new journal has been downloaded 2,581 times in over 300 cities across 68 countries.

Consecuencias is also celebrating its recent indexing in the Modern Language Association’s International Bibliography, a renowned database for scholars in the humanities, which all but assures that more researchers, authors, and students will be able to find Consecuencias publications. Indexing in the MLA International Bibliography means the journal will be discoverable by those who search that database. It is also an excellent way to improve a journal’s search engine optimization and to gain prestige and implicit accreditation, since scholars considering submitting work to a journal (especially an open access one) will often check where it is indexed to get a sense of its legitimacy and reach in a given discipline.

For more insights on readership statistics, click on the image below to view a data visualization.

MIT Direct to Open 

In 2021, MIT Press established the Direct to Open (D2O) model as a sustainable framework that engages libraries directly as a way around the profit motive in academic publishing. As publishers continue to raise the prices of APCs, to the point of driving away some of their own editorial staff, universities and libraries are trying to find more sustainable, cost-efficient ways to access scholarly materials so that collection budgets are not consistently weighed down by yearly subscriptions or one-time purchases of articles. Under the D2O model, if enough libraries agree to purchase a package of books published that year at a certain price, all of the books in the package will be published Open Access. As the count of member libraries increases, the price for each library in turn drops. Direct to Open is also committed to equity, as the fees for participation are determined based on the library size, type, and collection budget. Boston College Libraries, as part of its support of sustainable Open Access Publishing, has participated in the program since its inception.

The increased participation of libraries in Direct to Open allows MIT Press’ full list of scholarly monographs to be published directly to the MIT Press Direct platform. As participation has increased, the international scale of D2O has continued to grow: the Press has reached agreements with the Big Ten Academic Alliance, the Konsortium der sächsischen Hochschulbibliotheken, the Council of Australian University Libraries, Jisc, SCELC, Lyrasis, and more.

The list of books that will be opened is available on the MIT Press website. Participating libraries know that their payment goes toward opening content that would otherwise have been paywalled, and they also benefit directly: they receive term access to backlist or archival materials, discounts on high-quality works that are also available for print purchase, and open access to new MIT Press scholarly monographs and edited collections.