AI Preferences Working Group Materials

AIPREF Working Group Interim Meeting - 23 June 2025

Notetaker: Felix Rada

Vocabulary draft - issue discussion, draft review

Relationship between categories

Farzaneh: Is it possible to declare “tdm=no, AI=yes”?

Paul: Yes, being lower in the hierarchy does not mean that a specific instruction cannot overrule the more general category instruction.
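
One way to read Paul's answer is as a "most specific declaration wins" rule. The sketch below is illustrative only: the category labels, the parent links in PARENT, and the resolve function are assumptions made for this example and are not taken from the vocabulary draft.

```python
from typing import Optional

# Hypothetical hierarchy: each category maps to its more general parent.
# Labels and parent links are assumptions for illustration only.
PARENT = {
    "tdm": None,
    "ai": "tdm",
    "genai": "ai",
    "search": "tdm",
}

def resolve(category: str, declared: dict[str, bool]) -> Optional[bool]:
    """Return the preference that applies to `category`.

    Walks from the category up to more general ones and returns the first
    explicit declaration found. Returns None if nothing applicable was
    declared, i.e. no preference was expressed.
    """
    current: Optional[str] = category
    while current is not None:
        if current in declared:
            return declared[current]
        current = PARENT[current]
    return None

# Farzaneh's example: "tdm=no, AI=yes"
declared = {"tdm": False, "ai": True}
print(resolve("ai", declared))      # True  (specific declaration overrules tdm)
print(resolve("genai", declared))   # True  (inherits from ai)
print(resolve("search", declared))  # False (falls back to tdm)
```

Under this reading, a broad "no" on the general category acts as the default, and each more specific "yes" carves out an exception.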

The figure visualizing the hierarchy between categories is unclear.

Stéphane: Hierarchy should only exist if it serves a purpose.

Suresh: From the declaring party's perspective, having a catch-all category is helpful, because it allows you to opt out of everything and only allow uses that you fully understand and endorse.

Paul: If there is no overarching category broadly equivalent to TDM as defined in EU law, EU copyright holders will not be able to use the vocabulary and may resort to another one, meaning that rights holders

Martin: There is demand for the ability to declare a broad preference against all uses and then opt in to selected, specific uses. The EU definition of TDM is helpful, though we don’t want to tie the vocabulary to any given law.

Suresh: Having a hierarchy enables the same signal to be interpreted in the same way in different legal regimes. Is there, or should there be, overlap between the search and inference categories?

Farzaneh, Paul: It is important to clarify that preference signals do not confer rights, and that the vocabulary does not assume that rights holders are the ones making preference declarations.

Farzaneh: Preference signals should consider the end user; this raises problems, particularly with the inference/RAG category.

Paul: To reduce the potential impact on end users, the editors are considering narrowing the inference and search categories.

M Thomson: The AI inference and search categories risk overlapping; search includes AI at various steps.

Felix: Instead of AI inference category, describe how that use-case differs from search and how the information is presented to the user.

Leonard: This category is different from training, because the asset is not used at the training stage, but at the inference stage, so it should be separate from AI training.

Paul: Demand for “search” and “inference” categories comes up consistently; we just need to get the names and definitions right.

Timid Robot: Focus on “what” of the categories, not the “how”, to avoid overlap between them

Felix: The vocabulary describes the behavior of an AI system. We may want to distinguish direct user input from inference, i.e. distinguish between human behavior and AI system behavior.

Paul: The inference category needs to be narrowed down based on the discussion. To remain understandable, we may need to rename the category based on the narrowed-down definition, for example using a term like RAG. What many rights holders care about are cases like AI summaries, translations, style imitations etc. The objective is not to interfere with individual users’ rights.

Leonhard: Terms like “prompt-time” or “RAG” could be workable, but the category predates RAG in the context of asset-based identifiers. If a declaring party says “you cannot use this as a style reference”, they should be able to express that preference and incorporate it into the asset.

Timid Robot: User prompt example is a UX problem, not an AI preference problem. Vocabulary is not an access control mechanism.

Felix: If a TDM category is included, some commercial entities will treat it as a legal directive and restrict downstream uses that end users may actually be allowed to perform based on other copyright exceptions.

Leonhard: The market will address demands for AI systems that don’t restrict users’ rights.

Mark: We should be explicit about these possible effects

Paul: It may be out of scope to tell third parties what consequences to draw from voluntary preferences, but we should improve the definitions to limit unintended consequences.

Suresh: Should preferences be carried forward?

Attachment draft - issue discussion, draft review

Backwards compatibility with existing robots.txt is an important design consideration.

How does the attachment mechanism resolve hierarchy or conflicting information, e.g. robots.txt declares disallow, but content-usage for specific uses is set to y?
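
The question above was left open in the meeting. As a rough illustration of the situation it describes, the sketch below only detects the conflict and does not pick a winner. The line format it parses (a "Content-Usage" line with comma-separated label=y/n pairs inside a robots.txt group) is a simplified assumption for this example and may differ from the attachment draft's actual syntax; find_conflicts is a hypothetical helper.

```python
def find_conflicts(group_lines: list[str]) -> list[str]:
    """Return usage labels set to 'y' in a group that also disallows
    crawling of everything ("Disallow: /").

    Assumes a simplified, hypothetical line format; it does not decide
    which of the two signals should take precedence.
    """
    disallow_all = False
    allowed: list[str] = []
    for raw in group_lines:
        field, _, value = raw.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "disallow" and value == "/":
            disallow_all = True
        elif field == "content-usage":
            for item in value.split(","):
                label, _, pref = item.strip().partition("=")
                if pref.strip() == "y":
                    allowed.append(label.strip())
    # Only report a conflict when both signals are present in the group.
    return allowed if disallow_all else []

group = ["User-agent: *", "Disallow: /", "Content-Usage: ai=y, search=n"]
print(find_conflicts(group))  # ['ai']
```

Whether such a combination should be treated as an error, as "crawling wins", or as "the specific usage preference wins" is exactly the point the attachment draft would need to settle.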

Conclusion & next steps