Journal of Qujing Normal University ›› 2021, Vol. 40 ›› Issue (6): 36-42.

• Computer Science Research •

An Automatic Generation Model of Multiple-Choice Questions Based on T5

XU Jian (徐坚)1,2, SUN Yu (孙瑜)3, ZHANG Liming (张利明)4

  1. Key Laboratory of Educational Informatization for Nationalities, Ministry of Education, Yunnan Normal University, Kunming, Yunnan 650500, China;
  2. School of Information Engineering, Qujing Normal University, Qujing, Yunnan 655011, China;
  3. School of Information, Yunnan Normal University, Kunming, Yunnan 650500, China;
  4. School of Culture and Tourism, Qujing Normal University, Qujing, Yunnan 655011, China
  • Received: 2021-09-21 Published: 2021-11-26 Online: 2021-12-10
  • Corresponding author: SUN Yu, professor at the School of Information, Yunnan Normal University; main research area: knowledge engineering.
  • About the author: XU Jian, associate professor at the School of Information Engineering, Qujing Normal University; main research areas: knowledge graphs, natural language processing, and smart education.
  • Funding:
    National Natural Science Foundation of China project "Research on Multimodal Knowledge Fusion and Services for Ethnic Education Information Resources" (62166050).

Abstract: Automatic generation of multiple-choice questions is a hard problem in natural language processing. Its subtasks (question generation, answer generation, and distractor generation) are each cutting-edge tasks, and although all three have been studied individually, few studies have integrated them. This paper proposes an automatic multiple-choice question generation model, analyzes its question generation, answer generation, and distractor generation modules theoretically, and reports experiments on automatic question generation and answer generation based on T5. The experiments show that the model can generate multiple-choice questions from user-supplied text; the generated questions score about 1 BLEU point higher than those of a plain T5 model, and other metrics also improve slightly. Because distractor generation remains an open frontier problem, the paper treats it only theoretically.

Key words: multiple-choice questions, question generation, T5 model
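
The abstract describes integrating three subtasks into one model. A minimal Python sketch of how such stages might be chained is given below. It is an illustration only, not the authors' implementation: the names `build_mcq` and `MultipleChoiceQuestion`, the stage order (question, then answer, then distractors), and the stub generators are all assumptions made for this example, and a real system would call fine-tuned T5 models where the stubs are.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MultipleChoiceQuestion:
    stem: str            # the generated question text
    options: List[str]   # correct answer plus distractors, shuffled
    answer_index: int    # position of the correct answer in options


def build_mcq(
    context: str,
    generate_question: Callable[[str], str],
    generate_answer: Callable[[str, str], str],
    generate_distractors: Callable[[str, str, str], List[str]],
    seed: int = 0,
) -> MultipleChoiceQuestion:
    """Chain the three subtasks and assemble one multiple-choice item."""
    question = generate_question(context)
    answer = generate_answer(context, question)
    # Drop distractors that merely repeat the answer (case-insensitive).
    distractors = [
        d for d in generate_distractors(context, question, answer)
        if d.strip().lower() != answer.strip().lower()
    ]
    options = [answer] + distractors
    random.Random(seed).shuffle(options)  # seeded so the order is reproducible
    return MultipleChoiceQuestion(question, options, options.index(answer))


# Hypothetical stand-ins for the three generation modules.
def demo_question(context: str) -> str:
    return "Which model is the system built on?"


def demo_answer(context: str, question: str) -> str:
    return "T5"


def demo_distractors(context: str, question: str, answer: str) -> List[str]:
    return ["BERT", "GPT-2", "t5", "LSTM"]


mcq = build_mcq("The system is built on T5.", demo_question, demo_answer, demo_distractors)
```

Passing the stages in as callables keeps the assembly logic independent of any particular model, so a distractor generator can be swapped in later without touching the rest.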

CLC Number:
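
The abstract reports question quality in BLEU points. As a rough illustration of what that metric measures, the sketch below computes an unsmoothed, single-reference, sentence-level BLEU on a 0-100 scale: the geometric mean of clipped 1- to 4-gram precisions multiplied by a brevity penalty. The abstract does not say which BLEU variant, tokenization, or smoothing the paper uses, and published results are usually corpus-level, so the function name `sentence_bleu` and every setting here are assumptions for illustration.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU on a 0-100 scale."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precision_sum += math.log(overlap / total)
    # The brevity penalty lowers the score of candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return 100.0 * brevity_penalty * math.exp(log_precision_sum / max_n)


reference = "what is the name of the model".split()
exact = sentence_bleu(reference, reference)
partial = sentence_bleu("what is the name of model".split(), reference)
disjoint = sentence_bleu("completely unrelated words here".split(), reference)
```

On this scale, a difference of "about 1 point" is a change of roughly 1 in the returned score, typically averaged or aggregated over a whole test set rather than measured on one sentence.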