Compute the difference between pairs of strings.
For some types of distances, it is fine to simply create a
StringDistance object and then use it, e.g.,
new JaroWinkler().score("frederic", "fredrick").
Other string metrics benefit from caching information about a
string, especially when many comparisons are made involving the
same string. The prepare() method returns a StringWrapper
object, which can cache any appropriate information about the
String it 'wraps'. The most frequent use of caching here is saving
a tokenized version of a string (as a BagOfTokens, which is
a subclass of StringWrapper).
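The caching idea can be sketched as follows. This is a minimal, self-contained illustration and not the library's own code: TokenWrapper and TokenJaccard are hypothetical stand-ins for a StringWrapper and a StringDistance, and the token-overlap score is chosen only because it makes the benefit of tokenizing once easy to see.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Local stand-ins for illustration only; these are NOT the library's classes.
class PrepareDemo {

    /** Hypothetical wrapper that caches the token set of the string it wraps. */
    static final class TokenWrapper {
        final String raw;
        final Set<String> tokens; // computed once, reused by every comparison

        TokenWrapper(String s) {
            raw = s;
            tokens = new HashSet<>(Arrays.asList(s.toLowerCase().trim().split("\\s+")));
        }
    }

    /** Toy metric (an assumption of this sketch): Jaccard overlap of token sets. */
    static final class TokenJaccard {

        /** Preprocess a string so later comparisons can reuse its tokens. */
        TokenWrapper prepare(String s) {
            return new TokenWrapper(s);
        }

        /** Score two prepared strings without re-tokenizing either one. */
        double score(TokenWrapper s, TokenWrapper t) {
            Set<String> common = new HashSet<>(s.tokens);
            common.retainAll(t.tokens);
            int union = s.tokens.size() + t.tokens.size() - common.size();
            return union == 0 ? 0.0 : (double) common.size() / union;
        }

        /** Convenience overload: wraps both strings, then delegates. */
        double score(String s, String t) {
            return score(prepare(s), prepare(t));
        }
    }

    public static void main(String[] args) {
        TokenJaccard metric = new TokenJaccard();
        TokenWrapper query = metric.prepare("william w cohen"); // tokenized once
        String[] candidates = { "w cohen", "william cohen", "pradeep ravikumar" };
        for (String c : candidates) {
            System.out.println(c + " -> " + metric.score(query, metric.prepare(c)));
        }
    }
}
```

Here the query string is wrapped a single time and then compared against every candidate, which is the situation where prepare() pays off.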
Metrics like TFIDF discount matches on frequent tokens. These work best if given a set of strings over which statistics can be accumulated. The accumulateStatistics() method is how this is done.
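To illustrate why accumulated statistics matter, here is a minimal sketch of a metric that weights tokens by how rare they are in a collection. It is not the library's TFIDF implementation: WeightedOverlap, its smoothed log weight, and the use of raw strings (rather than StringWrapper objects) in the iterator are all simplifying assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Local stand-in for illustration only; this is NOT the library's TFIDF class.
class StatisticsDemo {

    /** Toy metric (an assumption of this sketch): IDF-weighted token overlap. */
    static final class WeightedOverlap {
        private final Map<String, Integer> docFreq = new HashMap<>();
        private int numDocs = 0;

        static Set<String> tokenize(String s) {
            return new HashSet<>(Arrays.asList(s.toLowerCase().trim().split("\\s+")));
        }

        /** Count, for each token, how many strings in the collection contain it. */
        void accumulateStatistics(Iterator<String> i) {
            while (i.hasNext()) {
                numDocs++;
                for (String tok : tokenize(i.next())) {
                    docFreq.merge(tok, 1, Integer::sum);
                }
            }
        }

        /** Smoothed inverse document frequency: rare tokens weigh more. */
        double weight(String tok) {
            int df = docFreq.getOrDefault(tok, 0);
            return Math.log((numDocs + 1.0) / (df + 1.0));
        }

        /** Weight of the shared tokens divided by the weight of all tokens. */
        double score(String s, String t) {
            Set<String> a = tokenize(s);
            Set<String> b = tokenize(t);
            Set<String> all = new HashSet<>(a);
            all.addAll(b);
            double shared = 0.0;
            double total = 0.0;
            for (String tok : all) {
                double w = weight(tok);
                total += w;
                if (a.contains(tok) && b.contains(tok)) {
                    shared += w;
                }
            }
            return total == 0.0 ? 0.0 : shared / total;
        }
    }

    public static void main(String[] args) {
        List<String> corpus = Arrays.asList("acme inc", "globex inc", "initech inc", "acme corp");
        WeightedOverlap metric = new WeightedOverlap();
        metric.accumulateStatistics(corpus.iterator());
        // "inc" is frequent in the corpus, so a match on it counts for less than a match on "acme".
        System.out.println(metric.score("acme inc", "globex inc"));
        System.out.println(metric.score("acme inc", "acme corp"));
    }
}
```

In this sketch, a metric that has seen no statistics gives every token zero weight and so scores every pair as 0, which is one way to see why such metrics work best after accumulateStatistics() has been called.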
Method Summary
void | accumulateStatistics(java.util.Iterator i) | Accumulate statistics over a set of StringWrapper objects, which will be produced by an iterator.
java.lang.String | explainScore(java.lang.String s, java.lang.String t) | Explain how the distance was computed.
java.lang.String | explainScore(StringWrapper s, StringWrapper t) | Explain how the distance was computed.
StringWrapper | prepare(java.lang.String s) | Preprocess a string for distance computation.
double | score(java.lang.String s, java.lang.String t) | Find the distance between s and t.
double | score(StringWrapper s, StringWrapper t) | Find the distance between s and t.
Method Detail
public double score(StringWrapper s, StringWrapper t)
public double score(java.lang.String s, java.lang.String t)
public StringWrapper prepare(java.lang.String s)
public java.lang.String explainScore(StringWrapper s, StringWrapper t)
public java.lang.String explainScore(java.lang.String s, java.lang.String t)
public void accumulateStatistics(java.util.Iterator i)