com.wcohen.secondstring
Class Mixture
java.lang.Object
  |
  +--com.wcohen.secondstring.AbstractStringDistance
        |
        +--com.wcohen.secondstring.AbstractStatisticalTokenDistance
              |
              +--com.wcohen.secondstring.Mixture
- All Implemented Interfaces:
- StringDistance
- public class Mixture
- extends AbstractStatisticalTokenDistance
Mixture-based distance metric.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Mixture
public Mixture(Tokenizer tokenizer)
Mixture
public Mixture()
score
public double score(StringWrapper s,
StringWrapper t)
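- The score is argmax_lambda prod_{w in s} [lambda Pr(w|t) + (1-lambda) Pr(w|background)],
that is, the mixing weight lambda under which the tokens of s are most likely when each token is drawn either from the token distribution of t or from a background distribution.
Lambda is estimated with EM (expectation-maximization).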
- Specified by:
score
in interface StringDistance
- Specified by:
score
in class AbstractStringDistance
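The EM estimate of lambda described for score can be illustrated with a minimal, self-contained sketch. It does not use the SecondString classes: the class name MixtureEmSketch, the methods unigram and estimateLambda, the plain token lists, the explicit background map, and the fixed iteration count are all assumptions made for illustration. It also assumes the two terms combine as a weighted sum (a two-component mixture), which may differ in detail from what Mixture actually computes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of com.wcohen.secondstring.
final class MixtureEmSketch {

    /** Relative frequency of each token in the list (a unigram model). */
    static Map<String, Double> unigram(List<String> tokens) {
        Map<String, Double> p = new HashMap<>();
        for (String w : tokens) {
            p.merge(w, 1.0 / tokens.size(), Double::sum);
        }
        return p;
    }

    /**
     * Estimates the lambda maximizing
     * prod_{w in s} [lambda * Pr(w|t) + (1 - lambda) * Pr(w|background)].
     */
    static double estimateLambda(List<String> s, List<String> t,
                                 Map<String, Double> background, int iterations) {
        Map<String, Double> pT = unigram(t);
        double lambda = 0.5; // assumed starting point
        for (int iter = 0; iter < iterations; iter++) {
            double responsibilitySum = 0.0;
            int counted = 0;
            // E step: probability that each token of s was drawn from t.
            for (String w : s) {
                double fromT = lambda * pT.getOrDefault(w, 0.0);
                double fromBackground = (1.0 - lambda) * background.getOrDefault(w, 0.0);
                double total = fromT + fromBackground;
                if (total == 0.0) {
                    continue; // token explained by neither model; skip it
                }
                responsibilitySum += fromT / total;
                counted++;
            }
            if (counted == 0) {
                return 0.0;
            }
            // M step: lambda becomes the average responsibility of t.
            lambda = responsibilitySum / counted;
        }
        return lambda;
    }

    public static void main(String[] args) {
        Map<String, Double> background = new HashMap<>();
        for (String w : Arrays.asList("william", "cohen", "john", "smith")) {
            background.put(w, 0.1); // made-up background probabilities
        }
        List<String> s = Arrays.asList("william", "cohen");
        System.out.println(estimateLambda(s, Arrays.asList("william", "w", "cohen"), background, 50));
        System.out.println(estimateLambda(s, Arrays.asList("john", "smith"), background, 50));
    }
}
```

In this sketch, a string s whose tokens are better explained by t than by the background pushes lambda toward 1, while a string sharing no tokens with t drives it to 0.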
prepare
public StringWrapper prepare(java.lang.String s)
- Preprocess a string by tokenizing it and assigning Mixture weights to the tokens.
- Specified by:
prepare
in interface StringDistance
- Overrides:
prepare
in class AbstractStringDistance
explainScore
public java.lang.String explainScore(StringWrapper s,
StringWrapper t)
- Explain how the distance was computed.
In the output, the tokens in S and T are listed, and the
common tokens are marked with an asterisk.
- Specified by:
explainScore
in interface StringDistance
- Specified by:
explainScore
in class AbstractStringDistance
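As a rough picture of the kind of listing explainScore describes, the sketch below prints the tokens of two strings and marks the shared ones with an asterisk. The class CommonTokenMarkerSketch, its methods, the "S:" and "T:" labels, and the placement of the asterisk before the token are all hypothetical; the actual output format of explainScore is not specified here and may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of com.wcohen.secondstring.
final class CommonTokenMarkerSketch {

    /** Joins tokens with spaces, prefixing those also found in other with '*'. */
    static String mark(List<String> tokens, List<String> other) {
        Set<String> otherSet = new HashSet<>(other);
        StringBuilder sb = new StringBuilder();
        for (String w : tokens) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (otherSet.contains(w)) {
                sb.append('*');
            }
            sb.append(w);
        }
        return sb.toString();
    }

    /** Lists the tokens of S and T on separate lines, with common tokens marked. */
    static String explain(List<String> s, List<String> t) {
        return "S: " + mark(s, t) + "\nT: " + mark(t, s);
    }

    public static void main(String[] args) {
        System.out.println(explain(Arrays.asList("william", "cohen"),
                                   Arrays.asList("w", "cohen")));
    }
}
```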
toString
public java.lang.String toString()
- Overrides:
toString
in class java.lang.Object
main
public static void main(java.lang.String[] argv)