Research on Judicial Text Summarization Based on Knowledge-Enhanced Pretrained Language Models

Affiliation:

People's Public Security University of China

CLC number:

TP391.1

Fund project:

National Key Research and Development Program of China (2020AAA0107705)

    Abstract:

    With advances in natural language processing, text summarization has been widely applied across everyday life. In the judicial field, artificial intelligence is driving the justice system toward informatization and intelligence, and judicial text summarization plays an important part in that process: condensing a judicial document "reduces its dimensionality", which helps readers grasp case details and extract key case elements quickly. However, when existing generative models are applied to judicial texts, the quality of the generated summaries is unsatisfactory: the output can be repetitive, redundant, or inconsistent with the facts. In particular, when a defendant faces multiple charges and multiple penalties, summaries produced by common generative models may pair charges with the wrong penalties. To address these problems, this paper proposes LCSG-ERNIE (Legal Case Summary Generation Based on Enhanced language Representation with iNformatIve Entities), a judicial text summarization model built on a knowledge-enhanced pre-trained model. The model integrates judicial knowledge into a pre-trained language model and draws on contrastive learning when generating summaries. Experiments show that the proposed model achieves good results.

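Cite this article

裴炳森, 李欣, 胡凯茜, et al. Research on Judicial Text Summarization Based on Knowledge-Enhanced Pretrained Language Models [J]. Science Technology and Engineering, , ():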

History
  • Received: 2023-07-04
  • Last revised: 2023-11-03
  • Accepted: 2023-11-14