public final class OpenNLPTokenizer extends SegmentingTokenizerBase
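Run OpenNLP SentenceDetector and Tokenizer. The last token in each sentence is marked by setting the EOS_FLAG_BIT in the FlagsAttribute;
following filters can use this information to apply operations to tokens one sentence at a time.

Nested classes inherited from class AttributeSource: AttributeSource.State

| Modifier and Type | Field and Description |
|---|---|
| static int | EOS_FLAG_BIT |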
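
The end-of-sentence flag convention described above can be illustrated without any Lucene or OpenNLP dependency. The following is a minimal sketch, not the tokenizer's actual implementation: `EosFlagSketch`, `Token`, `tokenize`, and `groupBySentence` are hypothetical names, the sentence and word splitting is a naive regex stand-in for the OpenNLP models, and the value `1` for the flag bit is an assumption made for illustration. It only shows how a bit set on the last token of each sentence lets a later stage process tokens one sentence at a time.

```java
import java.util.ArrayList;
import java.util.List;

public class EosFlagSketch {
    // Assumed value, for illustration only; real code would reference
    // OpenNLPTokenizer.EOS_FLAG_BIT instead of redefining it.
    static final int EOS_FLAG_BIT = 1;

    /** Hypothetical stand-in for a token that carries a term and an int flags field. */
    static final class Token {
        final String term;
        final int flags;

        Token(String term, int flags) {
            this.term = term;
            this.flags = flags;
        }

        boolean endsSentence() {
            return (flags & EOS_FLAG_BIT) != 0;
        }
    }

    /** Naive splitter: sets the flag bit on the last token of each sentence. */
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        // Split after '.', '!' or '?' when followed by whitespace.
        for (String sentence : text.trim().split("(?<=[.!?])\\s+")) {
            if (sentence.isEmpty()) {
                continue; // empty input yields one empty piece; skip it
            }
            String[] words = sentence.split("\\s+");
            for (int i = 0; i < words.length; i++) {
                int flags = (i == words.length - 1) ? EOS_FLAG_BIT : 0;
                tokens.add(new Token(words[i], flags));
            }
        }
        return tokens;
    }

    /** Downstream consumer: uses the flag bit to group terms by sentence. */
    static List<List<String>> groupBySentence(List<Token> tokens) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (Token t : tokens) {
            current.add(t.term);
            if (t.endsSentence()) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            sentences.add(current); // trailing tokens without a flagged end
        }
        return sentences;
    }

    public static void main(String[] args) {
        // prints [[Hello, world.], [Goodbye, now.]]
        System.out.println(groupBySentence(tokenize("Hello world. Goodbye now.")));
    }
}
```

In a real analysis chain the flag would be read from the FlagsAttribute of each token rather than from a custom `Token` class; the bit test `(flags & EOS_FLAG_BIT) != 0` is the part that carries over.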

Fields inherited from class SegmentingTokenizerBase: buffer, BUFFERMAX, offset

Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Constructor and Description |
|---|
| OpenNLPTokenizer(AttributeFactory factory, NLPSentenceDetectorOp sentenceOp, NLPTokenizerOp tokenizerOp) |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| protected boolean | incrementWord() |
| void | reset() |
| protected void | setNextSentence(int sentenceStart, int sentenceEnd) |

Methods inherited from class SegmentingTokenizerBase: end, incrementToken, isSafeEnd

Methods inherited from class Tokenizer: correctOffset, setReader

Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

public OpenNLPTokenizer(AttributeFactory factory, NLPSentenceDetectorOp sentenceOp, NLPTokenizerOp tokenizerOp) throws IOException
Throws: IOException

public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Overrides: close in class Tokenizer
Throws: IOException

protected void setNextSentence(int sentenceStart, int sentenceEnd)
Specified by: setNextSentence in class SegmentingTokenizerBase

protected boolean incrementWord()
Specified by: incrementWord in class SegmentingTokenizerBase

public void reset() throws IOException
Overrides: reset in class SegmentingTokenizerBase
Throws: IOException

Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.