
Utilities for generating a “corpus token model”.

Functions

Parallel traversal of latexml-style HTML5 document corpora, based on jwalk and DNMParameter::llamapun_normalization, with additional subformula lexemes obtained via dnm::node::lexematize_math.
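
The overall shape of such a traversal can be illustrated with a minimal, standard-library-only sketch. It is not this crate's implementation: it uses neither jwalk nor DNMParameter::llamapun_normalization, and it produces no math lexemes. The names `collect_html_files`, `count_tokens` and `corpus_token_counts` are hypothetical, the tokenizer is a plain lowercase whitespace split standing in for the real normalization, and the `.html` extension filter is an assumption about how corpus documents are named.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical helper: recursively collect every `.html` file under `root`.
/// (The real utilities are described as using jwalk for this step.)
fn collect_html_files(root: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_html_files(&path, out)?;
        } else if path.extension().map_or(false, |ext| ext == "html") {
            out.push(path);
        }
    }
    Ok(())
}

/// Stand-in tokenizer (assumption): lowercase, whitespace-delimited words.
/// This is NOT llamapun's normalization and does not lexematize math.
fn count_tokens(text: &str) -> HashMap<String, u64> {
    let mut counts = HashMap::new();
    for token in text.split_whitespace() {
        *counts.entry(token.to_lowercase()).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical driver: split the file list into chunks, count tokens per
/// chunk on scoped worker threads, then merge the per-thread frequency maps.
fn corpus_token_counts(root: &Path, workers: usize) -> io::Result<HashMap<String, u64>> {
    let mut files = Vec::new();
    collect_html_files(root, &mut files)?;

    let workers = workers.max(1);
    // Ceiling division, and never a chunk size of zero (`chunks(0)` panics).
    let chunk_size = ((files.len() + workers - 1) / workers).max(1);

    let mut total: HashMap<String, u64> = HashMap::new();
    thread::scope(|scope| {
        let handles: Vec<_> = files
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    let mut local: HashMap<String, u64> = HashMap::new();
                    for path in chunk {
                        // Files that cannot be read as UTF-8 are skipped.
                        if let Ok(text) = fs::read_to_string(path) {
                            for (token, n) in count_tokens(&text) {
                                *local.entry(token).or_insert(0) += n;
                            }
                        }
                    }
                    local
                })
            })
            .collect();

        // Merge the per-thread maps on the calling thread.
        for handle in handles {
            for (token, n) in handle.join().expect("worker thread panicked") {
                *total.entry(token).or_insert(0) += n;
            }
        }
    });
    Ok(total)
}
```

Calling `corpus_token_counts(Path::new("corpus_dir"), 4)` on a directory of HTML files (the path is a placeholder) would return one merged frequency map. Keeping a local map per thread and merging at the end avoids locking a shared map on every token.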