java.lang.Object
|
+--com.wcohen.secondstring.AbstractStringDistance
|
+--com.wcohen.secondstring.JensenShannonDistance
Abstract base class for distance metrics based on the Jensen-Shannon distance between two smoothed unigram language models.
| Constructor Summary |
| JensenShannonDistance() |
| JensenShannonDistance(Tokenizer tokenizer) |
| Method Summary | |
| void | accumulateStatistics(java.util.Iterator i): Accumulate statistics on how often each token occurs. |
| protected double | backgroundProb(Token tok): Probability of the token in the background language model. |
| java.lang.String | explainScore(StringWrapper s, StringWrapper t): This method needs to be implemented by subclasses. |
| StringWrapper | prepare(java.lang.String s): Preprocess a string by finding tokens and giving them weights W such that W is the smoothed probability of the token appearing in the document. |
| double | score(StringWrapper s, StringWrapper t): Jensen-Shannon distance between distributions. |
| protected abstract double | smoothedProbability(Token tok, double freq, double totalWeight): Smoothed probability of the token with frequency freq in a bag with the given totalWeight. |
| Methods inherited from class com.wcohen.secondstring.AbstractStringDistance |
doMain, explainScore, score |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
public JensenShannonDistance(Tokenizer tokenizer)
public JensenShannonDistance()
| Method Detail |
public final void accumulateStatistics(java.util.Iterator i)
accumulateStatistics in interface StringDistance
accumulateStatistics in class AbstractStringDistance
public final StringWrapper prepare(java.lang.String s)
prepare in interface StringDistance
prepare in class AbstractStringDistance
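The corpus-level bookkeeping that accumulateStatistics performs can be pictured roughly as below. This is a minimal standalone sketch, not the SecondString implementation: the class name BackgroundModelSketch, the whitespace tokenizer, and the relativeFrequency method are all assumptions made for illustration, and the real class works with Tokenizer, Token, and StringWrapper objects instead of plain strings.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical stand-in for corpus statistics. */
public class BackgroundModelSketch {
    // Corpus-wide token counts, filled by accumulate().
    private final Map<String, Integer> counts = new HashMap<>();
    private int totalTokens = 0;

    /** Hypothetical tokenizer: split on runs of whitespace. */
    static String[] tokenize(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Count how often each token occurs across all strings in the iterator. */
    public void accumulate(Iterator<String> strings) {
        while (strings.hasNext()) {
            for (String tok : tokenize(strings.next())) {
                counts.merge(tok, 1, Integer::sum);
                totalTokens++;
            }
        }
    }

    /** Relative frequency of a token in the accumulated corpus (0 if unseen). */
    public double relativeFrequency(String tok) {
        if (totalTokens == 0) {
            return 0.0;
        }
        return counts.getOrDefault(tok, 0) / (double) totalTokens;
    }

    public static void main(String[] args) {
        BackgroundModelSketch model = new BackgroundModelSketch();
        List<String> corpus = Arrays.asList("william cohen", "william w cohen", "will cohen");
        model.accumulate(corpus.iterator());
        System.out.println(model.relativeFrequency("cohen"));
    }
}
```

In this sketch the statistics must be accumulated before any string is scored, since unseen tokens otherwise get a background frequency of zero.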
protected abstract double smoothedProbability(Token tok,
double freq,
double totalWeight)
protected double backgroundProb(Token tok)
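Since smoothedProbability is abstract, the smoothing scheme is left to subclasses. One plausible scheme, shown here purely as an assumption, is linear interpolation between the token's relative frequency in the string and its background probability. The class name SmoothingSketch and the lambda parameter are hypothetical, and the method takes the background probability as a plain argument rather than calling backgroundProb(Token).

```java
/** Illustrative only: one possible smoothing scheme, not the library's own. */
public class SmoothingSketch {
    /**
     * Linear interpolation (Jelinek-Mercer style, an assumed scheme):
     * lambda * backgroundProb + (1 - lambda) * freq / totalWeight.
     *
     * @param freq           frequency of the token in the bag
     * @param totalWeight    total weight of all tokens in the bag
     * @param backgroundProb probability of the token in the background model
     * @param lambda         hypothetical mixing weight in [0, 1]
     */
    static double smoothedProbability(double freq, double totalWeight,
                                      double backgroundProb, double lambda) {
        double maxLikelihood = totalWeight > 0 ? freq / totalWeight : 0.0;
        return lambda * backgroundProb + (1.0 - lambda) * maxLikelihood;
    }

    public static void main(String[] args) {
        // Token seen once in a two-token string, background probability 0.1.
        System.out.println(smoothedProbability(1, 2, 0.1, 0.5));
        // A token absent from the string still receives mass from the background model.
        System.out.println(smoothedProbability(0, 2, 0.1, 0.5));
    }
}
```

The point of smoothing in this setting is that every token in the combined vocabulary gets a non-zero probability, which keeps the logarithms in the divergence well defined.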
public final double score(StringWrapper s,
StringWrapper t)
score in interface StringDistance
score in class AbstractStringDistance
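For orientation, the textbook Jensen-Shannon divergence between two token distributions p and q is 0.5 * KL(p || m) + 0.5 * KL(q || m), where m is the average of p and q. The sketch below computes that quantity over plain maps. It is an assumption-laden illustration: the class name JensenShannonSketch and the map representation are hypothetical, and this page does not state whether score returns the divergence itself, a transform of it, or which logarithm base is used.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: textbook Jensen-Shannon divergence with natural logs. */
public class JensenShannonSketch {
    /** KL(p || m) over p's support; terms with p(x) = 0 contribute nothing. */
    static double kl(Map<String, Double> p, Map<String, Double> m) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : p.entrySet()) {
            double px = e.getValue();
            if (px > 0.0) {
                // m(x) >= 0.5 * p(x) > 0 here, so the ratio is finite.
                sum += px * Math.log(px / m.get(e.getKey()));
            }
        }
        return sum;
    }

    /** 0.5 * KL(p || m) + 0.5 * KL(q || m), with m the average of p and q. */
    static double jsDivergence(Map<String, Double> p, Map<String, Double> q) {
        Set<String> vocab = new HashSet<>(p.keySet());
        vocab.addAll(q.keySet());
        Map<String, Double> m = new HashMap<>();
        for (String tok : vocab) {
            m.put(tok, 0.5 * (p.getOrDefault(tok, 0.0) + q.getOrDefault(tok, 0.0)));
        }
        return 0.5 * kl(p, m) + 0.5 * kl(q, m);
    }

    public static void main(String[] args) {
        Map<String, Double> p = new HashMap<>();
        p.put("william", 0.5);
        p.put("cohen", 0.5);
        Map<String, Double> q = new HashMap<>();
        q.put("will", 0.5);
        q.put("cohen", 0.5);
        System.out.println(jsDivergence(p, q));
    }
}
```

With natural logarithms this quantity is zero for identical distributions and reaches its maximum, ln 2, when the two distributions share no tokens.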
public final java.lang.String explainScore(StringWrapper s,
StringWrapper t)
Description copied from class AbstractStringDistance: This method needs to be implemented by subclasses.
explainScore in interface StringDistance
explainScore in class AbstractStringDistance