public class CachingNaiveBayesClassifier extends SimpleNaiveBayesClassifier
A Lucene-based Naive Bayes classifier with caching; see http://en.wikipedia.org/wiki/Naive_Bayes_classifier
This is NOT an online classifier.
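
As a rough illustration of the kind of computation a naive Bayes classifier performs, the sketch below scores a text against each class and normalizes the scores so they sum to 1. It is plain Java with no Lucene dependency and is not this class's implementation: the class name `NaiveBayesSketch`, its methods, the whitespace tokenizer, and the add-one smoothing are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, self-contained sketch; not part of Lucene.
final class NaiveBayesSketch {
    // class label -> (term -> occurrence count)
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
    // class label -> number of training documents
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts = termCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String term : tokenize(text)) {
            counts.merge(term, 1, Integer::sum);
            vocabulary.add(term);
        }
    }

    // Returns one probability per known class; the values sum to 1.
    Map<String, Double> classifyNormalized(String text) {
        Map<String, Double> logScores = new TreeMap<>();
        List<String> terms = tokenize(text);
        for (String label : docCounts.keySet()) {
            Map<String, Integer> counts = termCounts.get(label);
            int classTotal = counts.values().stream().mapToInt(Integer::intValue).sum();
            // log prior + sum of log likelihoods with add-one smoothing
            double score = Math.log((double) docCounts.get(label) / totalDocs);
            for (String term : terms) {
                int c = counts.getOrDefault(term, 0);
                score += Math.log((c + 1.0) / (classTotal + vocabulary.size()));
            }
            logScores.put(label, score);
        }
        Map<String, Double> result = new TreeMap<>();
        if (logScores.isEmpty()) {
            return result; // nothing trained yet
        }
        // Normalize in log space (subtract the max) to avoid underflow.
        double max = Collections.max(logScores.values());
        double sum = 0.0;
        for (double s : logScores.values()) {
            sum += Math.exp(s - max);
        }
        for (Map.Entry<String, Double> e : logScores.entrySet()) {
            result.put(e.getKey(), Math.exp(e.getValue() - max) / sum);
        }
        return result;
    }

    private static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train("sports", "goal match team");
        nb.train("tech", "code compiler bug");
        System.out.println(nb.classifyNormalized("team goal"));
    }
}
```

In this sketch the model is built from the training calls up front, which loosely mirrors the note above that the classifier is not an online one.
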
Fields inherited from class SimpleNaiveBayesClassifier: analyzer, classFieldName, indexReader, indexSearcher, query, textFieldNames

| Constructor and Description |
|---|
| CachingNaiveBayesClassifier(IndexReader indexReader, Analyzer analyzer, Query query, String classFieldName, String... textFieldNames) Creates a new NaiveBayes classifier with internal caching. |

| Modifier and Type | Method and Description |
|---|---|
| protected List<ClassificationResult<BytesRef>> | assignClassNormalizedList(String inputDocument) Calculate probabilities for all classes for a given input text |
| void | reInitCache(int minTermOccurrenceInCache, boolean justCachedTerms) Builds the frame of the cache. |
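
One possible reading of the two reInitCache parameters can be sketched in plain Java. This is an assumption-laden illustration, not Lucene's implementation: `TermCountCacheSketch`, its backing `Map` (standing in for the index), and the `count`, `cacheSize`, and `slowLookups` helpers are hypothetical. The sketch assumes that terms below the minimum occurrence are left out of the cache, and that `justCachedTerms` decides whether such terms are skipped entirely or looked up in the slower backing store.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained sketch; not part of Lucene.
final class TermCountCacheSketch {
    private final Map<String, Integer> index; // stands in for the (slow) index
    private final Map<String, Integer> cache = new HashMap<>();
    private boolean justCachedTerms;
    private int slowLookups = 0;

    TermCountCacheSketch(Map<String, Integer> index) {
        this.index = index;
        reInitCache(0, false); // default: cache everything
    }

    // Rebuilds the cache; a higher threshold keeps fewer terms.
    void reInitCache(int minTermOccurrenceInCache, boolean justCachedTerms) {
        this.justCachedTerms = justCachedTerms;
        cache.clear();
        for (Map.Entry<String, Integer> e : index.entrySet()) {
            if (e.getValue() >= minTermOccurrenceInCache) {
                cache.put(e.getKey(), e.getValue());
            }
        }
    }

    int count(String term) {
        Integer cached = cache.get(term);
        if (cached != null) {
            return cached; // fast path
        }
        if (justCachedTerms) {
            return 0; // uncached terms are excluded entirely
        }
        slowLookups++; // fall back to the backing store
        return index.getOrDefault(term, 0);
    }

    int cacheSize() {
        return cache.size();
    }

    int slowLookups() {
        return slowLookups;
    }

    public static void main(String[] args) {
        Map<String, Integer> idx = new HashMap<>();
        idx.put("lucene", 10);
        idx.put("rare", 1);
        TermCountCacheSketch c = new TermCountCacheSketch(idx);
        c.reInitCache(5, true);
        System.out.println(c.cacheSize() + " cached term(s), rare -> " + c.count("rare"));
    }
}
```

Under these assumptions the trade-off is memory against either accuracy (uncached terms ignored) or speed (uncached terms fetched the slow way).
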

Methods inherited from class SimpleNaiveBayesClassifier: assignClass, countDocsWithClass, getClasses, getClasses, normClassificationResults, tokenize

public CachingNaiveBayesClassifier(IndexReader indexReader, Analyzer analyzer, Query query, String classFieldName, String... textFieldNames)
Creates a new NaiveBayes classifier with internal caching. To reduce memory usage, call reInitCache().
Parameters:
indexReader - the reader on the index to be used for classification
analyzer - an Analyzer used to analyze unseen text
query - a Query used to optionally filter the docs used for training the classifier, or null if all the indexed docs should be used
classFieldName - the name of the field used as the output for the classifier
textFieldNames - the names of the fields used as the inputs for the classifier

protected List<ClassificationResult<BytesRef>> assignClassNormalizedList(String inputDocument) throws IOException
Description copied from class: SimpleNaiveBayesClassifier
Calculate probabilities for all classes for a given input text
Overrides: assignClassNormalizedList in class SimpleNaiveBayesClassifier
Parameters:
inputDocument - the input text as a String
Returns:
a List of ClassificationResult, one for each existing class
Throws:
IOException - if assigning probabilities fails

public void reInitCache(int minTermOccurrenceInCache, boolean justCachedTerms) throws IOException
Builds the frame of the cache.
Parameters:
minTermOccurrenceInCache - a higher value yields a smaller cache
justCachedTerms - the switch for fully excluding low-occurrence docs
Throws:
IOException - if there is a low-level I/O error

Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.