java.lang.Object
Analyzer
com.basistech.rlp.lucene.RLPAnalyzer
com.basistech.rlp.lucene.RLPEnAnalyzer
public class RLPEnAnalyzer
An Analyzer for English that uses RLP.
To use this analyzer, you must have a valid RLP license that enables the
Base Linguistics language processor for European languages (BL1 LP).
This Analyzer uses RLPTokenizer, LowerCaseFilter, and RLPPOSFilter (only if POS generation is turned on and the allowed POS tag list is provided).
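The order of the chain described above (tokenize, lower-case, then optionally filter by POS tag) can be illustrated with a small self-contained sketch. It does not use RLP or Lucene: `TaggedToken`, the method names, and the tags `DET`, `NOUN`, and `VERB` are hypothetical stand-ins, and the real RLP tag set and filter classes will differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Stand-in for the tokenizer -> lower-case -> POS-filter chain; not RLP or Lucene code. */
final class AnalyzerChainSketch {

    /** Hypothetical token: surface text plus a POS tag already assigned by a tokenizer. */
    static final class TaggedToken {
        final String text;
        final String pos;

        TaggedToken(String text, String pos) {
            this.text = text;
            this.pos = pos;
        }
    }

    /** Lower-cases every token and leaves its POS tag untouched. */
    static List<TaggedToken> lowerCase(List<TaggedToken> tokens) {
        List<TaggedToken> out = new ArrayList<>();
        for (TaggedToken t : tokens) {
            out.add(new TaggedToken(t.text.toLowerCase(Locale.ENGLISH), t.pos));
        }
        return out;
    }

    /** Keeps only tokens with an allowed tag; a null tag list means the filter is skipped. */
    static List<TaggedToken> filterByPos(List<TaggedToken> tokens, String[] allowedPosTags) {
        if (allowedPosTags == null) {
            return tokens;
        }
        Set<String> allowed = new HashSet<>(Arrays.asList(allowedPosTags));
        List<TaggedToken> out = new ArrayList<>();
        for (TaggedToken t : tokens) {
            if (allowed.contains(t.pos)) {
                out.add(t);
            }
        }
        return out;
    }

    /** Runs the chain on already-tagged tokens and returns the surviving terms. */
    static List<String> analyze(List<TaggedToken> tagged, String[] allowedPosTags) {
        List<String> terms = new ArrayList<>();
        for (TaggedToken t : filterByPos(lowerCase(tagged), allowedPosTags)) {
            terms.add(t.text);
        }
        return terms;
    }

    public static void main(String[] args) {
        List<TaggedToken> tagged = Arrays.asList(
                new TaggedToken("The", "DET"),
                new TaggedToken("Dogs", "NOUN"),
                new TaggedToken("Bark", "VERB"));
        System.out.println(analyze(tagged, new String[] {"NOUN", "VERB"})); // prints [dogs, bark]
        System.out.println(analyze(tagged, null));                          // prints [the, dogs, bark]
    }
}
```

Passing `null` for the tag list models the case in which no allowed POS tag list is provided, so the POS filter is left out of the chain.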
Note: Although this is currently implemented as a subclass of RLPAnalyzer, this is regarded as an implementation detail, and it may change in the future. The eventual contract is that it is a subclass of Lucene Analyzer.
Constructor Summary | |
---|---|
RLPEnAnalyzer() | This default constructor uses a default RLP context, which includes only the BL1 LP, along with the default set of post types and the default POS tags for English processing. |
RLPEnAnalyzer(String rlpContextDef) | This constructor uses the default set of post types: STEM (which is actually a lemma) and POS (the part of speech, stored in the Token's payload field). |
RLPEnAnalyzer(String rlpContextDef, EnumSet<RLPTokenizer.PostType> postTypes) | This constructor uses the part-of-speech filter with the default part-of-speech tag set. |
RLPEnAnalyzer(String rlpContextDef, EnumSet<RLPTokenizer.PostType> postTypes, String[] allowedPOSTags) | This constructor does not use default values. |
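The four constructors differ only in how many arguments fall back to defaults. The sketch below shows one way such defaulting could be arranged, using a telescoping-constructor pattern in a stand-in class. It is not the RLP implementation, which delegates to the RLPAnalyzer constructors: `ConstructorDefaultsSketch`, the nested `PostType` enum, `DEFAULT_CONTEXT`, and the tag values are all hypothetical placeholders.

```java
import java.util.EnumSet;

/** Stand-in showing how omitted constructor arguments could resolve to defaults. */
final class ConstructorDefaultsSketch {

    /** Stand-in for RLPTokenizer.PostType; the real enum may define other values. */
    enum PostType { STEM, POS }

    /** Placeholder only; the real default is an RLP context definition containing the BL1 LP. */
    static final String DEFAULT_CONTEXT = "default-context-placeholder";

    /** Hypothetical tag values; the real default English tag list is not shown here. */
    private static final String[] DEFAULT_POS_TAGS = {"NOUN", "VERB"};

    final String contextDef;
    final EnumSet<PostType> postTypes;
    final String[] allowedPosTags;

    /** No arguments: every setting is a default. */
    ConstructorDefaultsSketch() {
        this(DEFAULT_CONTEXT);
    }

    /** Context only: post types default to STEM and POS. */
    ConstructorDefaultsSketch(String contextDef) {
        this(contextDef, EnumSet.of(PostType.STEM, PostType.POS));
    }

    /** Context and post types: the allowed POS tags default. */
    ConstructorDefaultsSketch(String contextDef, EnumSet<PostType> postTypes) {
        this(contextDef, postTypes, DEFAULT_POS_TAGS);
    }

    /** Fully specified: no defaults are used. Arguments are copied defensively. */
    ConstructorDefaultsSketch(String contextDef, EnumSet<PostType> postTypes, String[] allowedPosTags) {
        this.contextDef = contextDef;
        this.postTypes = EnumSet.copyOf(postTypes);
        this.allowedPosTags = allowedPosTags.clone();
    }

    public static void main(String[] args) {
        ConstructorDefaultsSketch byDefault = new ConstructorDefaultsSketch();
        ConstructorDefaultsSketch stemOnly =
                new ConstructorDefaultsSketch("my-context.xml", EnumSet.of(PostType.STEM));
        System.out.println(byDefault.postTypes);
        System.out.println(stemOnly.postTypes);
    }
}
```

With the real class, the static getters getDefaultContextDefinition(), getDefaultPostTypes(), and getDefaultAllowedPOSTags() report the values that the shorter constructors assume.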
Method Summary | |
---|---|
static String[] | getDefaultAllowedPOSTags() - Gets the array of part-of-speech (POS) tags that is assumed when a constructor without such an argument is used. |
static String | getDefaultContextDefinition() - Gets the default context definition, which contains only the BL1 LP. |
static EnumSet<RLPTokenizer.PostType> | getDefaultPostTypes() - Gets the set of post types that is assumed when a constructor without such an argument is used. |
static void | main(String[] args) - (Internal use only) Tokenizes an English sentence and displays the results. |
Methods inherited from class com.basistech.rlp.lucene.RLPAnalyzer |
---|
getDetectedLanguage, tokenStream |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public RLPEnAnalyzer(String rlpContextDef, EnumSet<RLPTokenizer.PostType> postTypes, String[] allowedPOSTags)
Parameters:
rlpContextDef - Context definition that RLP uses to process text: an XML string or path to XML file.
postTypes - RLP Result types for which the tokenizer will generate tokens.
allowedPOSTags - POSTagFilter will accept tokens with these POS tags.
See Also:
RLPAnalyzer.RLPAnalyzer(LanguageCode, String, EnumSet, String[])
public RLPEnAnalyzer(String rlpContextDef, EnumSet<RLPTokenizer.PostType> postTypes)
Parameters:
rlpContextDef - Context definition that RLP uses to process text: an XML string or path to XML file.
postTypes - RLP Result types for which the tokenizer will generate tokens.
See Also:
RLPAnalyzer.RLPAnalyzer(LanguageCode, String, EnumSet)
public RLPEnAnalyzer(String rlpContextDef)
Parameters:
rlpContextDef - Context definition that RLP uses to process text: an XML string or path to XML file.
See Also:
RLPAnalyzer.RLPAnalyzer(LanguageCode, String)
public RLPEnAnalyzer()
See Also:
RLPAnalyzer.RLPAnalyzer(LanguageCode)
Method Detail |
---|
public static String getDefaultContextDefinition()
public static String[] getDefaultAllowedPOSTags()
public static EnumSet<RLPTokenizer.PostType> getDefaultPostTypes()
public static void main(String[] args)
Parameters:
args - The English sentence to process (args[0]). If you do not supply an argument, a default sentence is processed.