java.lang.Object
  kr.ac.kaist.swrc.jhannanum.plugin.SupplementPlugin.PlainTextProcessor.InformalSentenceFilter.InformalSentenceFilter
public class InformalSentenceFilter
This plug-in filters informal sentences in which an eojeol (a space-delimited Korean word unit) is unusually long and some characters are repeated many times. Such informal patterns degrade the performance of morphological analysis, so this plug-in should be used in any HanNanum work flow that analyzes documents containing informal sentences. It is a Plain Text Processor plug-in, which is a supplement plug-in for phase 1 of the HanNanum work flow.
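The kind of filtering described above can be illustrated with a minimal, self-contained sketch. It is not the HanNanum implementation: this page does not state the value of REPEAT_CHAR_ALLOWED or what the plug-in does with an offending eojeol, so the sketch assumes a limit of 5 and simply truncates each run of a repeated character to that limit. The class name `RepeatFilterSketch` and the methods `capRuns` and `filter` are hypothetical.

```java
// Hypothetical sketch; not the HanNanum implementation.
final class RepeatFilterSketch {
    // Assumed limit; this page does not state the real value.
    private static final int REPEAT_CHAR_ALLOWED = 5;

    /** Truncates every run of one repeated character in a token to maxRun. */
    static String capRuns(String token, int maxRun) {
        StringBuilder out = new StringBuilder();
        int run = 0;
        int prev = -1; // no previous code point yet
        for (int i = 0; i < token.length(); ) {
            int cp = token.codePointAt(i);
            run = (cp == prev) ? run + 1 : 1;
            if (run <= maxRun) {
                out.appendCodePoint(cp);
            }
            prev = cp;
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    /** Applies capRuns to each whitespace-separated eojeol of a sentence. */
    static String filter(String sentence) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        for (String eojeol : trimmed.split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(capRuns(eojeol, REPEAT_CHAR_ALLOWED));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(filter("this is sooooooooo funny!!!!!!!!"));
    }
}
```

The same logic applies unchanged to Hangul input, for example a long run of a single jamo at the end of an eojeol, because runs are compared by code point rather than by byte.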
Field Summary | |
---|---|
private static int | REPEAT_CHAR_ALLOWED: the maximum number of times a character may be repeated |
Constructor Summary | |
---|---|
InformalSentenceFilter() | |
Method Summary | |
---|---|
PlainSentence | doProcess(PlainSentence ps): Recognizes informal sentences in which an eojeol is unusually long and some characters are repeated many times. |
PlainSentence | flush(): Returns the text that has been stored in the internal buffer. |
boolean | hasRemainingData(): Checks whether any text remains. |
void | initialize(java.lang.String baseDir, java.lang.String configFile): Called before the work flow starts in order to initialize the plug-in. |
void | shutdown(): Called before the work flow is closed. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private static final int REPEAT_CHAR_ALLOWED
Constructor Detail |
---|
public InformalSentenceFilter()
Method Detail |
---|
public PlainSentence doProcess(PlainSentence ps)
Specified by: doProcess in interface PlainTextProcessor
Parameters:
ps - the plain text
public void initialize(java.lang.String baseDir, java.lang.String configFile) throws java.io.FileNotFoundException, java.io.IOException
Specified by: initialize in interface Plugin
Parameters:
baseDir - the base directory of HanNanum files
configFile - the path for the configuration file
Throws:
java.io.FileNotFoundException
java.io.IOException
public PlainSentence flush()
Specified by: flush in interface PlainTextProcessor
public void shutdown()
Specified by: shutdown in interface Plugin
public boolean hasRemainingData()
Specified by: hasRemainingData in interface PlainTextProcessor