com.wcohen.secondstring
Class Mixture
java.lang.Object
  |
  +--com.wcohen.secondstring.AbstractStringDistance
        |
        +--com.wcohen.secondstring.AbstractStatisticalTokenDistance
              |
              +--com.wcohen.secondstring.Mixture
- All Implemented Interfaces:
  StringDistance
public class Mixture
extends AbstractStatisticalTokenDistance
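Mixture-based distance metric: the tokens of one string are modeled as draws from a mixture of the other string's token distribution and a background distribution.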
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Mixture
public Mixture(Tokenizer tokenizer)
Mixture
public Mixture()
score
public double score(StringWrapper s, StringWrapper t)
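- Distance is argmax_lambda prod_{w in s} [lambda Pr(w|t) + (1-lambda) Pr(w|background)],
that is, the mixing weight lambda under which the tokens of s are most likely when each token is drawn either from t or from the background distribution.
This is computed with EM (expectation-maximization).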
- Specified by:
score in interface StringDistance
- Specified by:
score in class AbstractStringDistance
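The EM computation mentioned for score can be illustrated with a small standalone sketch. It is not the library's implementation: the class name MixtureEmSketch, the method estimateLambda, the map-based probability tables, and the fixed iteration count are all assumptions made for illustration, and it treats the model as a standard two-component mixture in which each token of s is drawn from t with probability lambda and from the background otherwise.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from SecondString.
class MixtureEmSketch {

    /**
     * Estimates the mixing weight lambda that maximizes
     * prod_{w in s} [lambda * Pr(w|t) + (1 - lambda) * Pr(w|background)]
     * using a fixed number of EM iterations.
     *
     * sTokens        tokens of string s (repeats allowed)
     * probGivenT     assumed table of Pr(w|t); missing tokens count as 0
     * probBackground assumed table of Pr(w|background); missing tokens count as 0
     */
    static double estimateLambda(List<String> sTokens,
                                 Map<String, Double> probGivenT,
                                 Map<String, Double> probBackground,
                                 int iterations) {
        if (sTokens.isEmpty()) {
            return 0.0; // nothing to explain, so no weight is assigned to t
        }
        double lambda = 0.5; // neutral starting point
        for (int iter = 0; iter < iterations; iter++) {
            double responsibilitySum = 0.0;
            // E step: posterior probability that each token was drawn from t.
            for (String w : sTokens) {
                double fromT = lambda * probGivenT.getOrDefault(w, 0.0);
                double fromBackground = (1.0 - lambda) * probBackground.getOrDefault(w, 0.0);
                double total = fromT + fromBackground;
                if (total > 0.0) {
                    responsibilitySum += fromT / total;
                }
                // A token with zero probability under both components adds nothing.
            }
            // M step: the new lambda is the average responsibility of t.
            lambda = responsibilitySum / sTokens.size();
        }
        return lambda;
    }

    public static void main(String[] args) {
        List<String> s = List.of("william", "cohen");
        Map<String, Double> givenT = Map.of("william", 0.5, "cohen", 0.5);
        Map<String, Double> background = Map.of("william", 0.01, "cohen", 0.001);
        System.out.println("estimated lambda = " + estimateLambda(s, givenT, background, 20));
    }
}
```

In this sketch a lambda near 1 means the tokens of s are explained almost entirely by t, and a lambda near 0 means they look like background text. A fixed iteration count keeps the example short; a convergence threshold on the change in lambda would be an equally reasonable stopping rule.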
prepare
public StringWrapper prepare(java.lang.String s)
- Preprocess a string by finding tokens and giving them Mixture weights.
- Specified by:
prepare in interface StringDistance
- Overrides:
prepare in class AbstractStringDistance
explainScore
public java.lang.String explainScore(StringWrapper s, StringWrapper t)
- Explain how the distance was computed.
In the output, the tokens in S and T are listed, and the
common tokens are marked with an asterisk.
- Specified by:
explainScore in interface StringDistance
- Specified by:
explainScore in class AbstractStringDistance
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
main
public static void main(java.lang.String[] argv)