| Class | Description |
|---|---|
| DecompoundToken | A token that was generated from a compound. |
| DictionaryToken | A token stored in a Dictionary. |
| GraphvizFormatter | Outputs the dot (graphviz) string for the viterbi lattice. |
| KoreanAnalyzer | Analyzer for Korean that uses morphological analysis. |
| KoreanNumberFilter | A TokenFilter that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters. |
| KoreanNumberFilter.NumberBuffer | Buffer that holds a Korean number string and a position index used as a parsed-to marker. |
| KoreanNumberFilterFactory | Factory for KoreanNumberFilter. |
| KoreanPartOfSpeechStopFilter | Removes tokens that match a set of part-of-speech tags. |
| KoreanPartOfSpeechStopFilterFactory | Factory for KoreanPartOfSpeechStopFilter. |
| KoreanReadingFormFilter | Replaces term text with the ReadingAttribute, which is the Hangul transcription of Hanja characters. |
| KoreanReadingFormFilterFactory | Factory for KoreanReadingFormFilter. |
| KoreanTokenizer | Tokenizer for Korean that uses morphological analysis. |
| KoreanTokenizerFactory | Factory for KoreanTokenizer. |
| POS | Part of speech classification for Korean based on Sejong corpus classification. |
| Token | Analyzed token with morphological data. |

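
To make the KoreanNumberFilter description above more concrete, here is a minimal, standard-library-only sketch of the kind of normalization it describes: full-width digits become half-width, and a Hangul unit such as 천 (thousand) scales the value, so ３.２천 becomes 3200. This is not Lucene's implementation. The class `KoreanNumberSketch` and its `normalize` method are hypothetical names, and the sketch assumes a single trailing unit character, whereas real Korean number strings can combine several. Non-ASCII characters are written as Unicode escapes so the source stays ASCII-only.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical illustration only; not part of the Lucene API.
class KoreanNumberSketch {
  // Hangul unit characters mapped to the power of ten they stand for.
  private static final Map<Character, Integer> EXPONENTS = Map.of(
      '\uC2ED', 1,   // sip: ten
      '\uBC31', 2,   // baek: hundred
      '\uCC9C', 3,   // cheon: thousand
      '\uB9CC', 4,   // man: ten thousand
      '\uC5B5', 8,   // eok: hundred million
      '\uC870', 12); // jo: trillion

  /** Normalizes digits followed by at most one trailing Hangul unit. */
  static String normalize(String input) {
    StringBuilder digits = new StringBuilder();
    int exponent = 0;
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c >= '\uFF10' && c <= '\uFF19') {
        // Full-width digit (U+FF10..U+FF19) to its half-width form.
        digits.append((char) (c - '\uFF10' + '0'));
      } else if (c >= '0' && c <= '9') {
        digits.append(c);
      } else if (c == '.' || c == '\uFF0E') {
        // Half-width or full-width decimal point.
        digits.append('.');
      } else if (EXPONENTS.containsKey(c) && i == input.length() - 1) {
        exponent = EXPONENTS.get(c);
      } else {
        throw new IllegalArgumentException("Unsupported character at index " + i);
      }
    }
    // Throws NumberFormatException if no digits were found.
    BigDecimal value = new BigDecimal(digits.toString()).scaleByPowerOfTen(exponent);
    return value.stripTrailingZeros().toPlainString();
  }

  public static void main(String[] args) {
    // Full-width "3.2" followed by the unit for thousand.
    System.out.println(normalize("\uFF13.\uFF12\uCC9C"));
  }
}
```

Using `BigDecimal` rather than `double` keeps inputs like 3.2 exact, and `toPlainString` avoids scientific notation in the result.
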

| Enum | Description |
|---|---|
| KoreanTokenizer.DecompoundMode | Decompound mode: this determines how the tokenizer handles POS.Type.COMPOUND, POS.Type.INFLECT and POS.Type.PREANALYSIS tokens. |
| KoreanTokenizer.Type | Token type reflecting the original source of this token. |
| POS.Tag | Part of speech tag for Korean based on Sejong corpus classification. |
| POS.Type | The type of the token. |

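
As a rough usage sketch for KoreanTokenizer.DecompoundMode, the following runs the same text through a KoreanAnalyzer once per mode and prints the resulting terms. It assumes the Lucene core and Nori analysis modules are on the classpath. The class name `DecompoundModeSketch`, the field name `"body"`, and the sample text are illustrative choices, not part of the API. The Korean sample is written as Unicode escapes so the source stays ASCII-only.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.ko.KoreanAnalyzer;
import org.apache.lucene.analysis.ko.KoreanPartOfSpeechStopFilter;
import org.apache.lucene.analysis.ko.KoreanTokenizer.DecompoundMode;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class for illustration.
class DecompoundModeSketch {
  /** Collects the terms an analyzer configured with the given mode emits. */
  static List<String> analyze(String text, DecompoundMode mode) throws IOException {
    List<String> terms = new ArrayList<>();
    // No user dictionary (null), default stop tags, no unigrams for unknown words.
    try (Analyzer analyzer = new KoreanAnalyzer(
            null, mode, KoreanPartOfSpeechStopFilter.DEFAULT_STOP_TAGS, false);
        TokenStream stream = analyzer.tokenStream("body", text)) {
      CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
      stream.reset();
      while (stream.incrementToken()) {
        terms.add(term.toString());
      }
      stream.end();
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    // Sample input: a plant name, chosen on the assumption that the
    // bundled dictionary treats it as a compound noun.
    String text = "\uAC00\uB77D\uC9C0\uB098\uBB3C";
    for (DecompoundMode mode : DecompoundMode.values()) {
      System.out.println(mode + " -> " + analyze(text, mode));
    }
  }
}
```

Comparing the printed lists side by side is a quick way to see whether a given mode keeps the original compound, its parts, or both for your own text.
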
Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.