com.wcohen.secondstring
Class TFIDF
java.lang.Object
  |
  +--com.wcohen.secondstring.AbstractStringDistance
        |
        +--com.wcohen.secondstring.AbstractStatisticalTokenDistance
              |
              +--com.wcohen.secondstring.TFIDF
- All Implemented Interfaces: StringDistance
- Direct Known Subclasses: SoftTFIDF
public class TFIDF
extends AbstractStatisticalTokenDistance
TFIDF-based distance metric.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
TFIDF
public TFIDF(Tokenizer tokenizer)
TFIDF
public TFIDF()
score
public double score(StringWrapper s, StringWrapper t)
- Description copied from class: AbstractStringDistance
- This method needs to be implemented by subclasses.
- Specified by: score in interface StringDistance
- Specified by: score in class AbstractStringDistance
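The page does not state how score combines the two prepared strings. A common formulation of TFIDF similarity, assumed here, is the dot product of unit-length token weight vectors, which only the tokens shared by both strings contribute to. The sketch below is a standalone illustration of that idea and not this class's source; the class name TfidfScoreSketch and the use of plain maps in place of StringWrapper are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: plain maps stand in for prepared StringWrapper objects.
public class TfidfScoreSketch {

    /**
     * Dot product of two token-weight vectors. Tokens missing from either
     * map contribute nothing, so only shared tokens affect the result.
     * If both vectors have unit length, this is their cosine similarity.
     */
    public static double score(Map<String, Double> s, Map<String, Double> t) {
        double sim = 0.0;
        for (Map.Entry<String, Double> e : s.entrySet()) {
            Double other = t.get(e.getKey());
            if (other != null) {
                sim += e.getValue() * other;
            }
        }
        return sim;
    }

    public static void main(String[] args) {
        // Hand-picked unit-length weights (0.6^2 + 0.8^2 = 1), assumed for the demo.
        Map<String, Double> s = new HashMap<>();
        s.put("william", 0.6);
        s.put("cohen", 0.8);

        Map<String, Double> t = new HashMap<>();
        t.put("cohen", 1.0);

        System.out.println(score(s, t)); // prints 0.8
    }
}
```

Under this assumption, strings with no tokens in common score 0, and a string compared with itself scores 1 once its weights are normalized.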
prepare
public StringWrapper prepare(java.lang.String s)
- Preprocess a string by finding its tokens and assigning each a TFIDF weight.
- Specified by: prepare in interface StringDistance
- Overrides: prepare in class AbstractStringDistance
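To make the weighting step concrete, here is a minimal standalone sketch of one way tokens can be given TFIDF weights. It is not this class's implementation: the whitespace tokenizer, the formula log(tf + 1) * log(N / df), the unit-length normalization, and the names TfidfWeightSketch, weigh, docFreq, and numDocs are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain map stands in for the prepared StringWrapper.
public class TfidfWeightSketch {

    /**
     * Tokenizes s on whitespace (lower-cased) and returns unit-length weights.
     * docFreq maps a token to the number of documents containing it;
     * numDocs is the corpus size. Unseen tokens are treated as df = 1.
     */
    public static Map<String, Double> weigh(String s, Map<String, Integer> docFreq, int numDocs) {
        // Term frequency of each token within this string.
        Map<String, Integer> tf = new HashMap<>();
        for (String tok : s.toLowerCase().trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                tf.merge(tok, 1, Integer::sum);
            }
        }

        // Raw weight: log(tf + 1) * log(N / df).
        Map<String, Double> weights = new HashMap<>();
        double normSq = 0.0;
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int df = docFreq.getOrDefault(e.getKey(), 1);
            double raw = Math.log(e.getValue() + 1.0) * Math.log((double) numDocs / df);
            weights.put(e.getKey(), raw);
            normSq += raw * raw;
        }

        // Scale to unit length so that a dot product acts as a cosine.
        final double norm = Math.sqrt(normSq);
        if (norm > 0.0) {
            weights.replaceAll((k, v) -> v / norm);
        }
        return weights;
    }

    public static void main(String[] args) {
        // Hypothetical corpus statistics: "the" occurs in every document.
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("the", 10);
        docFreq.put("cohen", 1);

        System.out.println(weigh("The Cohen", docFreq, 10));
    }
}
```

In this sketch a token that occurs in every document gets weight 0, so rare tokens dominate the comparison.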
explainScore
public java.lang.String explainScore(StringWrapper s, StringWrapper t)
- Explain how the distance was computed. In the output, the tokens in S and T are listed, and the common tokens are marked with an asterisk.
- Specified by: explainScore in interface StringDistance
- Specified by: explainScore in class AbstractStringDistance
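The exact layout of the explanation text is not specified on this page. The sketch below only illustrates the idea described above, listing the tokens of each string and appending an asterisk to those that occur in both. The class name ExplainSketch, the "S:" and "T:" prefixes, and the use of token lists in place of StringWrapper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: the real output format of explainScore may differ.
public class ExplainSketch {

    /** Lists the tokens of both strings, marking shared tokens with '*'. */
    public static String explain(List<String> sTokens, List<String> tTokens) {
        StringBuilder sb = new StringBuilder();
        sb.append("S:");
        appendMarked(sb, sTokens, tTokens);
        sb.append("\nT:");
        appendMarked(sb, tTokens, sTokens);
        return sb.toString();
    }

    // Appends each token, followed by '*' when it also occurs in 'other'.
    private static void appendMarked(StringBuilder sb, List<String> tokens, List<String> other) {
        for (String tok : tokens) {
            sb.append(' ').append(tok);
            if (other.contains(tok)) {
                sb.append('*');
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(explain(
                Arrays.asList("william", "cohen"),
                Arrays.asList("w", "cohen")));
        // prints:
        // S: william cohen*
        // T: w cohen*
    }
}
```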
toString
public java.lang.String toString()
- Overrides: toString in class java.lang.Object
main
public static void main(java.lang.String[] argv)