pub struct Ngrams {
    pub anchor: Option<String>,
    pub window_size: usize,
    pub n: usize,
    pub counts: HashMap<String, usize>,
}

Ngrams are dictionaries with

Fields

anchor: Option<String>

Anchor word that must be present in the context window of every recorded n-gram.

window_size: usize

If an anchor word is given, the window size in words, applied both to the left and to the right of the anchor word.

n: usize

The n in n-gram: the number of consecutive words in each recorded sequence.

counts: HashMap<String, usize>

Hash map holding the occurrence count of each n-gram.
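To make the field layout concrete, here is a minimal sketch that fills in the four fields for an anchored bigram count. The struct is a local stand-in that mirrors the documented fields, not the crate's own type. The derived `Default` and the helper name `anchored_bigrams` are assumptions made for illustration only.

```rust
use std::collections::HashMap;

/// Local stand-in mirroring the documented fields (assumption: not the crate's type).
#[derive(Debug, Default)]
pub struct Ngrams {
    pub anchor: Option<String>,
    pub window_size: usize,
    pub n: usize,
    pub counts: HashMap<String, usize>,
}

/// Hypothetical helper: set up bigram counting restricted to words
/// within `window_size` positions of `anchor`.
pub fn anchored_bigrams(anchor: &str, window_size: usize) -> Ngrams {
    Ngrams {
        anchor: Some(anchor.to_string()),
        window_size,
        n: 2,
        counts: HashMap::new(), // nothing recorded yet
    }
}
```

Setting `anchor` to `None` would presumably disable the window restriction, in which case `window_size` has no effect.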

Implementations

Get the word count

count a newly seen ngram phrase

obtain the ngram report, sorted by descending frequency
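A report sorted by descending frequency can be derived from a `counts`-style map roughly as follows. This is a sketch, not the method's actual implementation. The free function `sorted_report`, its return type, and the alphabetical tie-break between equally frequent n-grams are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical helper: flatten the counts map into rows ordered by
/// descending count, breaking ties alphabetically so the order is stable.
pub fn sorted_report(counts: &HashMap<String, usize>) -> Vec<(String, usize)> {
    let mut rows: Vec<(String, usize)> = counts
        .iter()
        .map(|(ngram, count)| (ngram.clone(), *count))
        .collect();
    // Compare counts in reverse (b against a) for descending order,
    // then fall back to the n-gram text for equal counts.
    rows.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    rows
}
```

Without some tie-break, the order of equally frequent n-grams would follow `HashMap` iteration order, which is not stable between runs.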

get the number of distinct ngrams recorded

add content for ngram analysis, typically a paragraph or a line of text

In essence, for a given window size W, the word at index i is justified to participate in the ngrams if an anchor word occurs among the words in the range [i-W, i+W]. The justified positions can be highly irregular, e.g. “word word anchor word anchor word word”, so the text is scanned for no-justification cutoffs: each continuous sequence of justified words between two cutoffs is recorded for ngram counts.
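The justification rule above can be sketched as a function that splits a word list into maximal runs of justified words. This is an illustration under stated assumptions, not the crate's code. The name `justified_runs` is hypothetical, and the anchor is assumed to match by exact, case-sensitive comparison of whole tokens.

```rust
/// Hypothetical helper: return each maximal run of consecutive words that
/// lie within `window` positions of some occurrence of `anchor`.
pub fn justified_runs<'a>(words: &[&'a str], anchor: &str, window: usize) -> Vec<Vec<&'a str>> {
    // Indices of every anchor occurrence (assumption: exact string match).
    let anchors: Vec<usize> = words
        .iter()
        .enumerate()
        .filter_map(|(i, w)| if *w == anchor { Some(i) } else { None })
        .collect();

    let mut runs = Vec::new();
    let mut current: Vec<&'a str> = Vec::new();
    for (i, word) in words.iter().enumerate() {
        // Word i is justified if an anchor lies in [i - window, i + window].
        let justified = anchors.iter().any(|&a| a.abs_diff(i) <= window);
        if justified {
            current.push(*word);
        } else if !current.is_empty() {
            // No-justification cutoff: close the current run.
            runs.push(std::mem::take(&mut current));
        }
    }
    if !current.is_empty() {
        runs.push(current);
    }
    runs
}
```

With a window of 1, the sequence “word word anchor word anchor word word” yields a single run of the five middle words, because the windows of the two anchors overlap. Each run would then be passed on to the n-gram recording step.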

Take an arbitrarily long vector of words, and record all (overlapping) ngrams obtainable from it
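Recording all overlapping n-grams from a word list maps naturally onto the standard library's `slice::windows`. The sketch below assumes that an n-gram's key is its words joined by single spaces. Both that key format and the free function `record_ngrams` are illustrative assumptions, not confirmed by the source.

```rust
use std::collections::HashMap;

/// Hypothetical helper: count every overlapping n-gram in `words`.
pub fn record_ngrams(words: &[&str], n: usize, counts: &mut HashMap<String, usize>) {
    if n == 0 {
        return; // slice::windows panics when the window size is zero
    }
    // windows(n) yields nothing when words.len() < n, so short inputs are safe.
    for gram in words.windows(n) {
        // Assumption: the map key is the words joined by a single space.
        *counts.entry(gram.join(" ")).or_insert(0) += 1;
    }
}
```

For the words `to be or not to be` with `n = 2`, this records five bigrams, four of them distinct, with `to be` counted twice.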

Trait Implementations

Returns the “default value” for a type.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.
Immutably borrows from an owned value.
Mutably borrows from an owned value.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The alignment of pointer.
The type for initializers.
Initializes a with the given initializer.
Dereferences the given pointer.
Mutably dereferences the given pointer.
Drops the object pointed to by the given pointer.
The type returned in the event of a conversion error.
Performs the conversion.
The type returned in the event of a conversion error.
Performs the conversion.