| Class | Description |
|---|---|
| ExtractReuters | Split the Reuters SGML documents into Simple Text files containing: Title, Date, Dateline, Body |
| ExtractWikipedia | Extract the downloaded Wikipedia dump into separate files for indexing. |

Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.