Data-to-text generation is an important task in natural language processing. FORGe, a typical rule-based generator, performs very well when mapping RDF triples to text. However, because the generator relies strongly on rules, the generated text, while semantically accurate and faithful to the input RDF triples, can be somewhat rigid in terms of fluency.
This thesis explores a possible way to improve the fluency of the text generated by FORGe. A neural paraphrase method is proposed as a post-processing step to achieve this goal. The method controls the tradeoff between two models, a fluency and semantic-similarity model and a lexical and/or syntactic diversity model, by setting a single parameter. In this way, it not only keeps the output semantically consistent with the input but also diversifies the lexical and syntactic choices in the sentence. To verify this idea, we designed and conducted a series of experiments.
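To make the role of the tradeoff parameter concrete, the following minimal Python sketch (an illustration only, not the thesis implementation; the function names, signatures, and parameter name are hypothetical) shows how candidate paraphrases could be ranked by a weighted combination of a semantic-similarity score and a diversity score.

```python
# Minimal sketch (hypothetical): ranking candidate paraphrases by a weighted
# tradeoff between a fluency/semantic-similarity score and a lexical/syntactic
# diversity score, controlled by a single parameter.

def rank_paraphrases(source, candidates, similarity_score, diversity_score,
                     tradeoff=0.5):
    """Return candidates sorted by a convex combination of the two scores.

    similarity_score(source, cand) and diversity_score(source, cand) are
    assumed to return values in [0, 1]. tradeoff in [0, 1] shifts weight
    from semantic faithfulness (0) toward lexical/syntactic diversity (1).
    """
    def combined(cand):
        return ((1.0 - tradeoff) * similarity_score(source, cand)
                + tradeoff * diversity_score(source, cand))

    # Higher combined score means a more preferred paraphrase.
    return sorted(candidates, key=combined, reverse=True)
```

Under this sketch, setting the parameter close to 0 favors outputs that stay close to the FORGe sentence, while values closer to 1 favor more varied wording and structure.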
Furthermore, the deep-learning-based generator OSU Neural NLG, which also performs well on English D2T tasks, is used as a baseline. Since all the generated text has to be evaluated, an automatic evaluation method is applied to ensure a uniform criterion for semantic accuracy across the outputs. Based on the results of this automatic evaluation, we also carried out manual verification to make the evaluation more reliable. Our experimental results suggest that applying this neural paraphrase method as a post-processing stage is a promising way to improve the fluency of the text generated by FORGe.