D should follow: "Gentoo Linux Begins Codeberg Migration In Moving Away From GitHub, Avoiding Copilot"
Indraj Gandham
newsgroups at indraj.net
Thu Feb 19 21:23:11 UTC 2026
There are two primary questions when it comes to LLMs and copyright:
(1) Can the training of models on copyrighted works constitute infringement?
(2) Can the output of a model constitute infringement?
The answers to both of these questions cannot always be "no", because
that would enable the development of models built specifically to
launder copyrighted works.
Even if a court rules that (1) is fair use, it has been shown that LLMs
can reproduce portions of copyrighted works verbatim, so I would
speculate that the ordinary threshold test will apply in (2).
The problem is that it is not at all obvious whether a given output
meets the threshold of originality. A simple textual comparison between
the output and the training data is not sufficient to show the absence
of infringement, as non-literal elements (such as a program's structure)
can also be copied. The test applied by courts in such cases is known as
Abstraction-Filtration-Comparison (AFC).
To help mitigate this risk, I would suggest the following:
(a) Reject PRs with the "AI Generated" label if the contribution meets
the threshold of originality; and
(b) Require all contributors to assert that they have the appropriate
legal rights to make the copyright assignment to the D Language
Foundation (DLF).
To determine whether a contribution meets the threshold, you can use the
guidelines set out by the FSF:
https://www.gnu.org/prep/maintain/maintain.html#Legally-Significant
The purpose of (b) is to shift liability from DLF to the contributor
should any concerns regarding provenance arise.
Hope to see you all at BeerConf!
Indraj