
Record


Released

Conference Paper

Simplify Your Law: Using Information Theory to Deduplicate Legal Documents

MPG Authors

Coupette, Corinna
Business and Tax Law, MPI for Tax Law and Public Finance, Max Planck Society

Citation

Coupette, C., Singh, J., & Spamann, H. (2021). Simplify Your Law: Using Information Theory to Deduplicate Legal Documents. In IEEE (Ed.), 2021 International Conference on Data Mining Workshops (ICDMW) (pp. 631-638).


Citation link: https://hdl.handle.net/21.11116/0000-0009-F742-6
Abstract
Textual redundancy is one of the main challenges to ensuring that legal texts remain comprehensible and maintainable. Drawing inspiration from the refactoring literature in software engineering, which has developed methods to expose and eliminate duplicated code, we introduce the duplicated phrase detection problem for legal texts and propose the Dupex algorithm to solve it. Leveraging the Minimum Description Length principle from information theory, Dupex identifies a set of duplicated phrases, called patterns, that together best compress a given input text. Through an extensive set of experiments on the Titles of the United States Code, we confirm that our algorithm works well in practice: Dupex will help you simplify your law.