Function llamapun::util::data_helpers::ams_normalize_word_range

pub fn ams_normalize_word_range(
    range: &DNMRange<'_>,
    context: &mut Context,
    options: LexicalOptions
) -> Result<String, Box<dyn Error>>

Normalizes a word lexeme for the “AMS paragraph classification” experiment, operating on a DNMRange representation. The following rules apply:

  • numeric literals are replaced by NUM
  • citations become citationelement
  • math is replaced by its lexeme annotation (created by LaTeXML), with a “mathformula” fallback
  • if the word is longer than the max length of 25, an error is returned
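
The rules above can be illustrated with a minimal, standalone sketch. It does not use llamapun's DNMRange, Context, or LexicalOptions types; the WordKind enum, the normalize_word_sketch function, and the MAX_WORD_LENGTH constant are hypothetical names introduced only for this illustration. The test for what counts as a numeric literal, the choice to measure length in characters, and the choice to apply the length limit only to plain words are likewise assumptions, not the library's documented behavior.

```rust
use std::error::Error;

/// Hypothetical stand-in for the content a word range may cover.
/// The real function inspects a DNMRange instead.
enum WordKind<'a> {
    /// Ordinary text of the word.
    Plain(&'a str),
    /// A citation element.
    Citation,
    /// A math element, with its lexeme annotation if one is present.
    Math(Option<&'a str>),
}

/// Assumed limit, taken from the description above (measured in characters here).
const MAX_WORD_LENGTH: usize = 25;

/// Sketch of the four normalization rules; not llamapun's implementation.
fn normalize_word_sketch(word: WordKind<'_>) -> Result<String, Box<dyn Error>> {
    match word {
        // Citations collapse to a fixed placeholder.
        WordKind::Citation => Ok("citationelement".to_string()),
        // Math uses its lexeme annotation when available...
        WordKind::Math(Some(lexemes)) if !lexemes.is_empty() => Ok(lexemes.to_string()),
        // ...and falls back to a fixed placeholder otherwise.
        WordKind::Math(_) => Ok("mathformula".to_string()),
        WordKind::Plain(text) => {
            // Overlong words are rejected with an error.
            if text.chars().count() > MAX_WORD_LENGTH {
                return Err(format!("word longer than {} characters", MAX_WORD_LENGTH).into());
            }
            // Assumption: a numeric literal contains at least one digit and
            // otherwise only digits, '.' or ','.
            let has_digit = text.chars().any(|c| c.is_ascii_digit());
            let only_numeric = text
                .chars()
                .all(|c| c.is_ascii_digit() || c == '.' || c == ',');
            if has_digit && only_numeric {
                Ok("NUM".to_string())
            } else {
                Ok(text.to_string())
            }
        }
    }
}
```

Under these assumptions, a call such as normalize_word_sketch(WordKind::Plain("3.14")) yields NUM, a math element without an annotation yields mathformula, and a plain word of 26 or more characters yields an Err.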