stemmer
Class SnowballStemmer
java.lang.Object
|
+--gate.util.AbstractFeatureBearer
|
+--gate.creole.AbstractResource
|
+--gate.creole.AbstractProcessingResource
|
+--gate.creole.AbstractLanguageAnalyser
|
+--stemmer.SnowballStemmer
- All Implemented Interfaces:
- gate.creole.ANNIEConstants, gate.Executable, gate.util.FeatureBearer, gate.LanguageAnalyser, gate.util.NameBearer, gate.ProcessingResource, gate.Resource, java.io.Serializable
- public class SnowballStemmer
- extends gate.creole.AbstractLanguageAnalyser
- implements gate.ProcessingResource
Title: SnowballStemmer.java
Description: This class is a wrapper for the
Snowball stemmers (see http://snowball.tartarus.org/) for
- Danish,
- Dutch,
- English,
- Finnish,
- French,
- German,
- Italian,
- Norwegian,
- Portuguese,
- Russian,
- Spanish,
- Swedish.
The stemmer processes existing annotations that carry a string feature,
such as those created by the Tokeniser.
It takes the string value of the feature named by the annotationFeature parameter,
on annotations of the type named by the annotationType parameter,
in the annotation set named by the annotationSetName parameter,
stems it, and adds a new feature "stem" to each annotation of that type.
By default it processes Token.string values
from the Default annotation set, as produced by the Tokeniser.
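The flow described above can be sketched without GATE on the classpath. In this minimal illustration, annotations are modelled as plain feature maps, `addStems` and `tokens` are hypothetical helpers, and `toyStem` is a crude suffix stripper standing in for a real Snowball stemmer. None of these names come from the GATE or Snowball APIs, and skipping annotations that lack a string value is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Sketch only: annotations are plain feature maps, not GATE Annotation objects.
class StemFeatureSketch {

    // Hypothetical stand-in for a Snowball stemmer: lower-cases the word
    // and strips one of a few English suffixes. Not the Snowball algorithm.
    static String toyStem(String word) {
        String w = word.toLowerCase();
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (w.length() > suffix.length() + 2 && w.endsWith(suffix)) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Reads the named feature from each annotation, stems it, and stores the
    // result under "stem" on the same annotation. Annotations without a
    // string value for that feature are left untouched (an assumption).
    static void addStems(List<Map<String, Object>> annotations,
                         String featureName,
                         UnaryOperator<String> stemmer) {
        for (Map<String, Object> features : annotations) {
            Object value = features.get(featureName);
            if (value instanceof String) {
                features.put("stem", stemmer.apply((String) value));
            }
        }
    }

    // Builds Token-like annotations, each carrying only a "string" feature.
    static List<Map<String, Object>> tokens(String... words) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (String word : words) {
            Map<String, Object> features = new HashMap<>();
            features.put("string", word);
            result.add(features);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> toks = tokens("Walking", "dogs", "barked");
        addStems(toks, "string", StemFeatureSketch::toyStem);
        for (Map<String, Object> t : toks) {
            System.out.println(t.get("string") + " -> " + t.get("stem"));
        }
    }
}
```

The original "string" feature is kept alongside the new "stem" feature, which matches the description of the stem being added to the annotation rather than replacing anything.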
- Version:
- 1.0
- Author:
- Milena Yankova
- See Also:
- Serialized Form
Inner classes inherited from class gate.creole.AbstractProcessingResource |
gate.creole.AbstractProcessingResource.InternalStatusListener, gate.creole.AbstractProcessingResource.IntervalProgressListener |
Fields inherited from class gate.creole.AbstractLanguageAnalyser |
corpus, document |
Fields inherited from class gate.creole.AbstractProcessingResource |
interrupted |
Fields inherited from class gate.creole.AbstractResource |
name, serialVersionUID |
Fields inherited from class gate.util.AbstractFeatureBearer |
features |
Fields inherited from interface gate.creole.ANNIEConstants |
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DOCUMENT_COREF_FEATURE_NAME, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PR_NAMES, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME |
Methods inherited from class gate.creole.AbstractLanguageAnalyser |
getCorpus, getDocument, setCorpus, setDocument |
Methods inherited from class gate.creole.AbstractProcessingResource |
addProgressListener, addStatusListener, cleanup, fireProcessFinished, fireProgressChanged, fireStatusChanged, interrupt, isInterrupted, reInit, removeProgressListener, removeStatusListener |
Methods inherited from class gate.creole.AbstractResource |
checkParameterValues, getName, getParameterValue, getParameterValue, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners |
Methods inherited from class gate.util.AbstractFeatureBearer |
getFeatures, setFeatures |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface gate.ProcessingResource |
reInit |
Methods inherited from interface gate.Resource |
cleanup, getParameterValue, setParameterValue, setParameterValues |
Methods inherited from interface gate.util.FeatureBearer |
getFeatures, setFeatures |
Methods inherited from interface gate.util.NameBearer |
getName, setName |
Methods inherited from interface gate.Executable |
interrupt, isInterrupted |
SNOW_STAM_DOCUMENT_PARAMETER_NAME
public static final java.lang.String SNOW_STAM_DOCUMENT_PARAMETER_NAME
- Document to be processed by the stemmer - Runtime
SNOW_STAM_ANNOT_SET_PARAMETER_NAME
public static final java.lang.String SNOW_STAM_ANNOT_SET_PARAMETER_NAME
- Name of the annotation set the stemmer will run over - Optional and Runtime
SNOW_STAM_ANNOT_TYPE_PARAMETER_NAME
public static final java.lang.String SNOW_STAM_ANNOT_TYPE_PARAMETER_NAME
- Name of the annotation type in the annotationSetName to be processed - Runtime
SNOW_STAM_ANNOT_FEATURE_PARAMETER_NAME
public static final java.lang.String SNOW_STAM_ANNOT_FEATURE_PARAMETER_NAME
- Name of the feature containing a string of the word to be stemmed - Runtime
SNOW_STAM_LANGUAGE_PARAMETER_NAME
public static final java.lang.String SNOW_STAM_LANGUAGE_PARAMETER_NAME
- Language that the stemmer will work on - Initialisation parameter
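As a rough illustration of how these parameters fit together, the sketch below collects them in a plain map using the defaults stated in the class description (Token.string from the Default annotation set). The key names are taken from the setter methods and are an assumption, since the values of the SNOW_STAM_* constants are not shown on this page. `StemmerParamsSketch` and `defaults` are hypothetical names, and the runtime document parameter is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a plain map standing in for a GATE parameter list.
class StemmerParamsSketch {

    // Keys mirror the bean properties implied by the setters (an assumption;
    // the actual SNOW_STAM_* constant values are not given on this page).
    static Map<String, String> defaults(String language) {
        Map<String, String> params = new LinkedHashMap<>();
        // Initialisation parameter; no default is stated, so the caller supplies it.
        params.put("language", language);
        // Optional runtime parameter; null stands for the Default annotation set here.
        params.put("annotationSetName", null);
        // Runtime parameters with the defaults named in the class description.
        params.put("annotationType", "Token");
        params.put("annotationFeature", "string");
        return params;
    }

    public static void main(String[] args) {
        System.out.println(defaults("english"));
    }
}
```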
SnowballStemmer
public SnowballStemmer()
- Default constructor
init
public gate.Resource init()
throws gate.creole.ResourceInstantiationException
- Initialises this resource and returns it.
Checks whether the language given as a parameter is supported.
- Specified by:
init
in interface gate.Resource
- Overrides:
init
in class gate.creole.AbstractProcessingResource
- Returns:
- Resource
- Throws:
gate.creole.ResourceInstantiationException
-
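The language check performed by init can be pictured as a lookup against the twelve languages listed in the class description. The sketch below is an assumption about the shape of that check, not the class's actual code: `LanguageCheckSketch` and `checkLanguage` are hypothetical names, the lower-case spelling and trimming are guesses, and `IllegalArgumentException` stands in for gate.creole.ResourceInstantiationException so the example runs without GATE.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Sketch only: illustrates a supported-language check, not the real init().
class LanguageCheckSketch {

    // The twelve languages named in the class description, lower-cased
    // (the exact spelling the real class expects is an assumption).
    static final Set<String> SUPPORTED = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList(
            "danish", "dutch", "english", "finnish", "french", "german",
            "italian", "norwegian", "portuguese", "russian", "spanish",
            "swedish")));

    // Returns the normalised language name, or throws if it is missing or
    // unsupported. The real init() throws ResourceInstantiationException.
    static String checkLanguage(String language) {
        if (language == null) {
            throw new IllegalArgumentException("language parameter is not set");
        }
        String normalised = language.trim().toLowerCase(Locale.ROOT);
        if (!SUPPORTED.contains(normalised)) {
            throw new IllegalArgumentException("Unsupported language: " + language);
        }
        return normalised;
    }

    public static void main(String[] args) {
        System.out.println(checkLanguage("English"));
    }
}
```

Failing at initialisation rather than in execute means a misconfigured language is reported once, before any document is processed.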
execute
public void execute()
throws gate.creole.ExecutionException
- Runs the appropriate Snowball stemmer for the given language
- Specified by:
execute
in interface gate.Executable
- Overrides:
execute
in class gate.creole.AbstractProcessingResource
- Throws:
gate.creole.ExecutionException
-
setLanguage
public void setLanguage(java.lang.String language)
getLanguage
public java.lang.String getLanguage()
setAnnotationSetName
public void setAnnotationSetName(java.lang.String annotationSetName)
getAnnotationSetName
public java.lang.String getAnnotationSetName()
setAnnotationType
public void setAnnotationType(java.lang.String annotationType)
getAnnotationType
public java.lang.String getAnnotationType()
setAnnotationFeature
public void setAnnotationFeature(java.lang.String annotationFeature)
getAnnotationFeature
public java.lang.String getAnnotationFeature()