Stores auxiliary resources required by the tokenizer so that they need to be initialized only once.
Fields
stopwords: HashSet<&'static str>
set of stopwords
abbreviations: Regex
regular expression for abbreviations
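The two fields above are resources that are built once and then reused for every tokenization call. A minimal std-only sketch of that pattern follows. `SketchTokenizer`, its constructor, the helper methods, and the word lists are all hypothetical and are not part of the documented API; the `abbreviations` field is a plain set here instead of a compiled `Regex`, purely to avoid an external dependency.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the documented struct; not the library's type.
struct SketchTokenizer {
    /// Set of stopwords, built once in `new`.
    stopwords: HashSet<&'static str>,
    /// Assumption: a set of literal abbreviations replaces the real `Regex` field.
    abbreviations: HashSet<&'static str>,
}

impl SketchTokenizer {
    /// Builds both resources a single time; the word lists are illustrative only.
    fn new() -> Self {
        SketchTokenizer {
            stopwords: ["the", "a", "of", "and"].iter().copied().collect(),
            abbreviations: ["e.g.", "i.e.", "cf."].iter().copied().collect(),
        }
    }

    /// Case-insensitive stopword lookup against the prebuilt set.
    fn is_stopword(&self, word: &str) -> bool {
        self.stopwords.contains(word.to_lowercase().as_str())
    }

    /// Exact-match abbreviation lookup against the prebuilt set.
    fn is_abbreviation(&self, token: &str) -> bool {
        self.abbreviations.contains(token)
    }
}

fn main() {
    // Construct once, then reuse the same instance for every lookup.
    let tokenizer = SketchTokenizer::new();
    for token in ["The", "tokenizer", "of", "e.g."].iter() {
        println!(
            "{:?}: stopword={}, abbreviation={}",
            token,
            tokenizer.is_stopword(token),
            tokenizer.is_abbreviation(token)
        );
    }
}
```

Holding the resources in a struct means callers pay the construction cost once and can share the instance by reference afterwards.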
Implementations
impl Tokenizer
pub fn sentences<'a>(&self, dnm: &'a DNM) -> Vec<DNMRange<'a>>
Gets the sentences from a DNM. Each sentence is returned as a DNMRange that borrows from the given DNM.
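The signature ties each returned range to the lifetime `'a` of the input, so the ranges cannot outlive the data they point into. The sketch below illustrates that shape on a plain `&str`. It is not the library's algorithm: `TextRange`, the free function `sentences`, `push_trimmed`, and the naive splitting rule (split after `.`, `!` or `?` when followed by whitespace, unless the last word is a listed abbreviation) are all assumptions made for illustration.

```rust
/// Hypothetical range type: a half-open byte span plus a borrow of the source text.
#[derive(Debug)]
struct TextRange<'a> {
    start: usize,
    end: usize,
    source: &'a str,
}

impl<'a> TextRange<'a> {
    /// Returns the text covered by this range.
    fn as_str(&self) -> &'a str {
        &self.source[self.start..self.end]
    }
}

/// Pushes `text[start..end]` with surrounding whitespace trimmed, skipping empty spans.
fn push_trimmed<'a>(text: &'a str, start: usize, end: usize, out: &mut Vec<TextRange<'a>>) {
    let slice = &text[start..end];
    let trimmed = slice.trim_start();
    let new_start = start + (slice.len() - trimmed.len());
    let new_end = new_start + trimmed.trim_end().len();
    if new_start < new_end {
        out.push(TextRange { start: new_start, end: new_end, source: text });
    }
}

/// Naive splitter: ends a sentence at '.', '!' or '?' when the next character is
/// whitespace (or the text ends), unless the last word is a known abbreviation.
fn sentences<'a>(text: &'a str, abbreviations: &[&str]) -> Vec<TextRange<'a>> {
    let mut ranges = Vec::new();
    let mut start = 0;
    let mut chars = text.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        if !matches!(c, '.' | '!' | '?') {
            continue;
        }
        let end = i + c.len_utf8();
        let at_boundary = chars.peek().map_or(true, |&(_, next)| next.is_whitespace());
        let last_word = text[start..end].split_whitespace().last().unwrap_or("");
        if at_boundary && !abbreviations.contains(&last_word) {
            push_trimmed(text, start, end, &mut ranges);
            start = end;
        }
    }
    // Whatever remains after the last terminator forms a final range.
    push_trimmed(text, start, text.len(), &mut ranges);
    ranges
}

fn main() {
    let text = "We use stopwords, e.g. common words. Abbreviations are kept! Done";
    for range in sentences(text, &["e.g."]) {
        println!("{}..{}: {}", range.start, range.end, range.as_str());
    }
}
```

Returning ranges instead of owned strings keeps the offsets into the original text available and avoids copying the sentence contents.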
Trait Implementations
Auto Trait Implementations
impl RefUnwindSafe for Tokenizer
impl Send for Tokenizer
impl Sync for Tokenizer
impl Unpin for Tokenizer
impl UnwindSafe for Tokenizer
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.