Author Archives: Larry E Hibbler

Appeals Court Rules in Internet Archive Controlled Digital Lending Case

The Second Circuit Court of Appeals has ruled against the Internet Archive (IA) and its Controlled Digital Lending program in Hachette Book Group, Inc. v. Internet Archive, holding that the program was not fair use, with every fair use factor supporting Hachette et al. (the Publishers). An appeal is possible, but the future of doing CDL at scale under the fair use doctrine is bleak.

How We Got Here

Controlled Digital Lending (CDL) is the digitization by libraries of lawfully acquired books, and the lending of those copies via technical measures that prevent copying digital files, while ensuring that there are never more total copies (physical and digital combined) in circulation than the number of physical copies owned.
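For readers who think in code, the owned-to-loaned rule in that definition can be sketched as a simple invariant. This is only an illustration under stated assumptions: the class and method names below (TitleHoldings, lend_digital, and so on) are hypothetical and do not describe Internet Archive's actual system, and the sketch ignores the technical measures that prevent copying of the digital files.

```python
from dataclasses import dataclass


@dataclass
class TitleHoldings:
    """Hypothetical record for one title; all names here are illustrative."""
    owned_physical: int        # lawfully acquired print copies
    physical_on_loan: int = 0  # print copies currently checked out
    digital_on_loan: int = 0   # digitized copies currently checked out

    def in_circulation(self) -> int:
        # Physical and digital loans are counted together.
        return self.physical_on_loan + self.digital_on_loan

    def lend_digital(self) -> bool:
        # Refuse the loan if it would put more copies in circulation
        # than the library physically owns.
        if self.in_circulation() >= self.owned_physical:
            return False
        self.digital_on_loan += 1
        return True

    def return_digital(self) -> None:
        # Returning a digital copy frees one slot for the next borrower.
        if self.digital_on_loan > 0:
            self.digital_on_loan -= 1


holdings = TitleHoldings(owned_physical=2)
results = [holdings.lend_digital() for _ in range(3)]
print(results)  # [True, True, False]
```

With two copies owned, the third request is refused until a copy comes back. Removing the cap, as the National Emergency Library did, amounts to dropping the check in lend_digital.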

This case started back in 2020. Internet Archive had a CDL program for years, with their own books and with partners. During the COVID emergency, in response to library closures, they started a “National Emergency Library” where they removed the cap on the number of digital copies of books they circulated. At this point, a number of large publishers sued IA for copyright infringement.

The Internet Archive could not deny they made copies of books, but claimed CDL was allowed under fair use. The district court that heard the case ruled against IA on every major point. Internet Archive appealed to the Second Circuit Court of Appeals, and the case was heard in late June. Importantly, between the District Court opinion and the Appeals Court hearing, the Supreme Court decided Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, interpreting fair use in a way that weighs heavily against a use when the copy can substitute for the original work.

Second Circuit Ruling

The three-judge Appeals Court ruled against the Internet Archive on all four fair use factors.

Purpose and Character of Use

This section examines two important questions. One, was the use of the copy transformative? And two, was the use commercial?

The issue of transformative use really decided the case. Internet Archive pointed to cases that said a use might be transformative if it improved efficiency in delivering the content, as CDL does. CDL also allows people to link directly to the book as a source of information. However, the Court said that the use was “meant to–and does–substitute for the original Works.” (p. 24). This substitution is the antithesis of transformativeness. The Court then tried to articulate the difference between a transformative work and a derivative work. A derivative work is defined by statute as “…work based upon one or more preexisting works…” with many examples and the catch-all “any other form in which a work may be recast, transformed, or adapted.” E-books are not listed in the definition, but the Court stated that changing the medium of a work is by itself a derivative use. (p. 25). The Court did have to distinguish a couple of other cases, most notably the Sony copyright case from the 1980s, which allowed people to use VCRs (technically Betamax) to record television programming to watch it later. It did so in a way that cabins that opinion to a particular period in broadcasting and technology, which raises questions as to whether the court would rule the same way today.

On a side note, in considering the owned-to-loaned ratio of CDL as a basis for the use being transformative, the court said “IA does not perform the traditional functions of a library; it prepares derivatives of Publishers’ Works and delivers those derivatives to its users in full.” (p. 31). I do not think this was a significant part of the Court’s analysis, but it does point to an issue that was more present in the initial stages of the case – is the Internet Archive a library? I hope courts take that question beyond “lending of print books” if called upon to make that specific choice in the future.

The use was found by the Second Circuit to be non-commercial, with the program only providing attenuated financial benefit to IA, but that did not swing this factor in favor of the Internet Archive.

Nature of the Work

While Internet Archive argued that the copying of non-fiction books should at least be neutral in the balance of fair use, as facts are not protected by copyright, the Court still found that overall, non-fiction books have the type of creativity that copyright protects. (p. 41). This is one spot where a specific book-by-book analysis could have made a difference, instead of looking at the aggregate, but this is typically not a decisive factor and probably saved months of time in reaching an opinion that would have come out much the same.

Amount and Substantiality of Portion Used

This factor is tricky in that courts need to emphasize that there is no strict quantitative rule that a certain amount of copying is always fair use, or that copying an entire work is never fair use. The Court here pointed out that IA copied entire works, and also made those works available to the public in their entirety. (p. 42). The Court looked back at the first factor here to say that since the copying was not for a transformative purpose, but to substitute for the work, the copying was too much. This is not usually a decisive factor, and the court here seemed most interested in highlighting why this copying was different from that in its Google Books case, where Google copied entire books, but only made snippets available to the public.

Effect of Use upon Potential Market for Work

The court held that the Publishers did not have to prove harm with evidence, but that the Internet Archive had to meet a near impossible standard – proving that CDL does not harm the market for books. (p. 45). The IA was not helped by the fact that most of the Publishers did not provide monthly sales data. Both IA and the Publishers had expert witnesses, but the Publishers’ expert carried the day in critiquing IA, as the Court was not persuaded by the analyses from IA looking at book sales during and after the National Emergency Library. (pp. 49-52). The court put stock in the idea that if a copy is available for free, the market for the paid version will be affected negatively, and also noted that a couple of phrases from IA’s promotional material strongly suggested it was appealing to other libraries to utilize IA’s CDL program so as not to have to pay for ebooks. It is quite possible the Publishers could have found examples of libraries doing exactly that – libraries deciding not to buy ebooks and instead putting IA’s Open Library books into their catalog for patrons – and used that as evidence. However, the Publishers may have been more worried about opening the door to the idea that the copyright holder should have to provide evidence of market harm, which is a standard they would really want to avoid.

The Court did nominally consider the benefit the public derived from IA’s CDL as a counterbalance to market harm. Here it really discounted the value of providing access to knowledge to the public, and focused on the argument that providing copies of books would disincentivize authors, and disincentivized authors would not create new works, harming the public. The Court cited the Supreme Court on this exact point. Perhaps if the Supreme Court were able to monetize its opinions instead of them being in the public domain, it would take more cases!

The Future of CDL

The most immediate question is what this means for CDL of books that are not available electronically from a publisher. On the one hand, the court says that “[W]e conclude that the challenged practices–IA’s lending of its “own” digital books that are commercially available for sale or license in any electronic text format,’ . . . are not fair use.” (p. 20). On the other hand, it defines the market as the market for “the Works in general, without regard to format” (p. 46) and makes it clear it thinks that creating a digital copy of a book is making a derivative of the original, and not a transformative use. Put together, it is hard to see how any book could be digitized and used for CDL without permission.

That permission may be the key to future CDL. There is nothing saying that a publisher could not allow libraries to copy and lend books with a one-to-one owned-to-loaned ratio, or any ratio. It might not appeal to Hachette, but there could be publishers who do not have the bandwidth to have their own digitization program, nor want to work with platforms like OverDrive, and choose to have library-based CDL, perhaps with some money changing hands.

As for IA’s program, IA is “reviewing the court’s opinion.” They could ask to have the case re-heard by the entire Second Circuit, though this is rare. Longer term, IA does have an option to appeal to the Supreme Court, but the Supreme Court does not have to hear it. There is a hint of a circuit split on the issue of proving a lack of market harm, as the D.C. Circuit ruled in a different direction in the similar case ASTM v. PRO. But the Second Circuit here relied heavily on the very recent Warhol case, which the Supreme Court decided in 2023. I think that makes a grant of cert unlikely. Even if the Supreme Court would be interested in addressing the market harm issue, Internet Archive will have to consider if that would be enough to change the final result. Proponents of CDL may need to focus on legislative changes to copyright law, especially to libraries’ copyright exemptions in 17 U.S.C. 108, to deliver on the benefits of CDL.

Silver lining for non-profits, and maybe Creative Commons

As mentioned above, there was some good news from the decision about non-commercial use. At the district court level, Internet Archive’s CDL program was deemed a commercial use, as IA had a donate button on the same page as the book. There was also language about IA gaining reputational benefits from the program. This was all despite IA being a non-profit organization. The Second Circuit found that this standard would be very damaging to non-profits in general, and would likely prevent them from ever utilizing fair use. (pp. 37-38). Though not mentioned in the case, this language might also provide a little more clarity around the Creative Commons BY-NC license, where commercial is not defined.

What does this case mean for AI?

This case will also interest people looking at the New York Times case against OpenAI and Microsoft, currently in a lower court within the Second Circuit. Many AI companies claim that scraping the internet and using copyrighted works in training a model is fair use, and cite the Google Books case. I would not say this case is likely to swing the result of AI training cases, but two bits stand out.

First, the court positively cites language from several cases that says a copy is not transformative if it just “repackages” or “republishes” a work. Query whether training a Large Language Model is just repackaging content into some set of probabilities and relationships between words. Second, in discussing the third fair use factor, it reiterates its decision in another case to limit the factor to the amount of the material “made available to the public.” (p. 42). Assuming the public cannot extract copyrighted content back out of a model (which is a point of debate, and might depend on the content), it seems like the AI companies will do well on that factor.

Apologies for the citation style to any Bluebook devotees!

Controlled Digital Lending, Round 2

On June 28th, the Second Circuit Court of Appeals in New York heard oral arguments in the Controlled Digital Lending case Hachette v. Internet Archive. The judges probed both sides to find weaknesses in their arguments, positing a number of hypotheticals to push how far each side’s theory of the case went. The hearing lasted almost ninety minutes, well over the scheduled time.

Listening to the hearing, I came away feeling that the Internet Archive’s attorney was pushed a little harder than the Publishers’. Given that the Internet Archive was appealing a negative ruling where they lost on both the nature of the use and the effect on the market fair use factors, I am not surprised.

A few takeaways from the hearing:

We did not get much argument on the issue of “Was the Internet Archive’s use a commercial use or not?” Even if this does not turn out to be determinative in this case, the issue could be very consequential for other non-profit organizations like the Wikimedia Foundation, which hosts Wikipedia.

I could not tell if the judges were being intentionally vague in comparing CDL to making copies for interlibrary loan, or if they did not fully understand the difference between making copies for interlibrary loan under 17 U.S.C. 108 and lending books, including by interlibrary loan, under 17 U.S.C. 109. I am sure the judges’ law clerks will get very acquainted with those sections while the case is being decided.

ASTM v. PRO, a case from the D.C. Circuit about posting online things like building codes that have been incorporated into law by reference, was mentioned as a possible analogy to show the lack of effect on a market for copyrighted material. This case in general is one of the best for Internet Archive, both in looking at lack of market harm, and at how copies can be transformative without adding to a work. The downside is that it is not binding precedent in the Second Circuit.

Keep in mind that oral argument is only part of the case. The judges will also consider briefs filed by both parties, as well as a number of amicus briefs filed by outside groups and scholars. Given the amount of briefing and argument in the case, I would not expect a ruling until fall at the earliest, and possibly not until the first half of next year.

Fall 2024 Electronic Dissertation Workshops

Writing a dissertation takes a lot of work. Submitting a dissertation does not have to! The Libraries’ eTD@BC workshops for graduate students will prepare you to submit your thesis or dissertation. Planning now can save so much time later, right at the end of the process when time becomes really valuable. This fall, there will be three sessions, one in-person and two virtual, all covering the same material.

Dates:
Tuesday, October 8, 6:30 – 7:15 pm, on Zoom.
Thursday, October 10, noon – 12:45 pm, O’Neill Library 307.
Thursday, October 17, noon – 12:45 pm, on Zoom.

To register, go to https://libcal.bc.edu/calendar/workshops. Upon registration for an online workshop, you will receive a confirmation email with the Zoom link.

Topics to be covered in this workshop include:

The submission website, including a walk-through of the submission process
Important decisions and issues, such as eScholarship@BC, embargoes, copyright, etc.
How to ensure that a published eTD can be discovered and accessed by others
Where to get additional help

Graduate students can contact etd-support@bc.edu with any questions about the workshops. There will be additional workshops in the spring.

Open Access Publishing Fund opens soon!

Applications for grants from Boston College’s Open Access Publishing Fund will open on June 3rd! Faculty, students and staff are encouraged to apply. Open access can be expensive, so the fund assists authors in making their new work available via open access when they do not have other grant funding. Applications can be submitted before an article is accepted, but the intended journal needs to be listed on the application.

Last fiscal year, the fund awarded more than $30,000 of grants for twenty-one publications, including an open access monograph. While open access may be most prevalent in the natural sciences, many different disciplines have taken advantage of the fund. Please contact Elliott Hibbler if you have any questions.

A pie chart showing a fairly even distribution of awards between Biology, Communication, Computer Science, Engineering, Envi Studies, Fine Arts, LSEHD, MCAS Core, Psych and Neuro, and School of Social Work. Biology, LSEHD, Fine Arts, and the School of Social Work have the most.

OSTP Federal Research Funding update

The White House Office of Science and Technology Policy (OSTP) has received funding from Congress to continue its implementation of the Nelson Memo. This memo requires any federal agency that awards research grants to implement a policy requiring immediate public access to publications resulting from that research, as well as access to data and the use of persistent digital identifiers in article metadata.

During the lengthy Federal appropriations process, the House Appropriations Committee released a bill that specifically defunded any attempt to implement the memo. No individual or lobbying group ever came forward to take any credit for trying to kill the OSTP memo in the budget, nor was there much explanation of why it might have been included.

The final appropriations bill (technically the explanatory statement accompanying the bill) only included a requirement that OSTP produce a financial analysis of the impact of the memo, “including the policy’s anticipated impact on Federal research investments, research integrity, and the peer review process,” within 100 days of the bill passing. In other positive news, this was the only requirement. There is no trigger stopping development of policy depending on what the report says. This likely means that after the report, there would be a round of Congressional hearings before more action is taken. Because this is an election year, there may not be enough time for a truly adverse legislative action. Overall, this means plans will progress, and there should be some good reading on the state of scholarly publishing sometime in mid-June!