Legal and Ethical Issues in Human Language Technologies

Full-day workshop at LREC-Coling 2024, Turin, Italy, May 20, 2024

About the Workshop

LEGAL 2024

2023 is likely to be remembered as a year dominated by discussions about Artificial Intelligence (AI) and Large Language Models (LLMs). These technologies require data to be collected and used in unprecedented amounts. Large sets of language data are owned by stakeholders who are not necessarily involved in developing such technologies. To use these sets for AI and LLMs, they must be repackaged and repurposed. Language data, despite their intangible nature, are often subject to legal constraints that must be addressed to guarantee lawful access to and re-use of these data. In recent years, considerable efforts have been made to adapt legal frameworks to technological advances while taking into account the interests of various stakeholders. From the technological perspective, strict consideration of legal aspects raises further questions beyond recording technology and participant consent. Key questions include:

  • What is the intellectual property status of large language data sets, the corresponding Large Language Models, and their potential outputs?
  • How can identifying information used in deep learning be removed or anonymized, and is this mandatory?
  • How reliable are predictions and models based on anonymized data?
  • What impact does this have on usability and computational costs?

The purpose of this full-day workshop is to build bridges between technology and legal frameworks, and to discuss current legal and ethical issues in the human language technology sector.

What to submit?

Initial submissions are extended abstracts of 1500-2000 words, due by March 04, 2024. Full papers will be published as workshop proceedings along with the LREC-Coling main conference and must follow the instructions of the main conference. Submit via Softconf.

Topics of interest include:

  • Impact of statutory exceptions on text and speech data mining practices in the field of Human Language Technologies.
  • Impact of the regulatory environment at the international level (e.g. the EU Data Act, Data Governance Act, Digital Services Act, and AI Act; the Chinese “2023 draft rules on generative AI”; the US Blueprint for an AI Bill of Rights; and other international or national regulations) on the circulation and use of language data.
  • Legal issues related to the production and use of Large Language Models (Intellectual Property, Data Governance and Data Protection aspects).
  • Concrete applications as to how language technologies can help resolve legal issues related to data collection, data sharing and data reuse.
  • Ethical considerations related to personal data collection and re-use.
  • Trust and transparency in language and speech technologies.
  • Efficient anonymization techniques, the responsibilities they entail, and their impact on usability and performance.
  • Re-identification issues and de-anonymization approaches and techniques.
  • Harmonizing the differing perspectives of data scientists and legal experts worldwide.

Important Dates


  • March 04, 2024: Deadline for submission of extended abstracts
  • March 30, 2024: Notification of acceptance
  • April 05, 2024: Submission of final version of accepted papers
  • May 20, 2024: Workshop Day

Organizers and Contact of the LEGAL Workshop:

Ingo Siegert, Otto-von-Guericke-Universität Magdeburg, Germany

Khalid Choukri, ELRA/ELDA, France

Paweł Kamocki, IDS Mannheim, Germany

Program Committee

Khalid Choukri

Mickaël Rigault

Claudia Cevenini

Erik Ketzan

Prodromos Tsiavos

Andreas Witt

Paweł Kamocki

Kim Nayyer

Krister Lindèn

Ingo Siegert

Tom Bäckström

Nicholas Evans

Catherine Jasserand

Isabel Trancoso