com.wcohen.secondstring
Class AbstractStringDistance

java.lang.Object
  |
  +--com.wcohen.secondstring.AbstractStringDistance
All Implemented Interfaces:
StringDistance
Direct Known Subclasses:
AbstractStatisticalTokenDistance, AffineGap, Jaccard, Jaro, JensenShannonDistance, Level2, NeedlemanWunsch, SLIM, SmithWaterman, WinklerRescorer

public abstract class AbstractStringDistance
extends java.lang.Object
implements StringDistance

Abstract base implementation of StringDistance that supplies a few useful defaults.


Constructor Summary
AbstractStringDistance()
           
 
Method Summary
 void accumulateStatistics(java.util.Iterator i)
          Default way to accumulate statistics for a set of related strings.
protected static void doMain(StringDistance d, java.lang.String[] argv)
          Default main routine for testing.
 java.lang.String explainScore(java.lang.String s, java.lang.String t)
          Scores are explained by converting Strings to StringWrappers with the prepare function.
abstract  java.lang.String explainScore(StringWrapper s, StringWrapper t)
          This method needs to be implemented by subclasses.
 StringWrapper prepare(java.lang.String s)
          Default way to preprocess a string for distance computation.
 double score(java.lang.String s, java.lang.String t)
          Strings are scored by converting them to StringWrappers with the prepare function.
abstract  double score(StringWrapper s, StringWrapper t)
          This method needs to be implemented by subclasses.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractStringDistance

public AbstractStringDistance()
Method Detail

score

public abstract double score(StringWrapper s,
                             StringWrapper t)
This method needs to be implemented by subclasses.

Specified by:
score in interface StringDistance

explainScore

public abstract java.lang.String explainScore(StringWrapper s,
                                              StringWrapper t)
This method needs to be implemented by subclasses.

Specified by:
explainScore in interface StringDistance

score

public final double score(java.lang.String s,
                          java.lang.String t)
Strings are scored by converting them to StringWrappers with the prepare function.

Specified by:
score in interface StringDistance

explainScore

public final java.lang.String explainScore(java.lang.String s,
                                           java.lang.String t)
Scores are explained by converting Strings to StringWrappers with the prepare function.

Specified by:
explainScore in interface StringDistance
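
The relationship between the String overloads and the StringWrapper overloads can be sketched as a template-method pattern. The block below is a minimal, self-contained illustration and not the SecondString source: Wrapper, SketchDistance, and ExactMatch are hypothetical stand-ins, and the real StringWrapper and StringDistance types may declare different members.

```java
// Hypothetical stand-ins; these are NOT the SecondString classes.
class DelegationSketch {

    /** Stand-in for StringWrapper: simply holds the original string. */
    static class Wrapper {
        private final String s;
        Wrapper(String s) { this.s = s; }
        String unwrap() { return s; }
    }

    /** Stand-in for AbstractStringDistance. */
    static abstract class SketchDistance {
        // Subclasses supply the wrapper-based logic.
        abstract double score(Wrapper s, Wrapper t);
        abstract String explainScore(Wrapper s, Wrapper t);

        // Default preprocessing: wrap the string unchanged.
        Wrapper prepare(String s) { return new Wrapper(s); }

        // The final String overloads always route through prepare().
        final double score(String s, String t) {
            return score(prepare(s), prepare(t));
        }
        final String explainScore(String s, String t) {
            return explainScore(prepare(s), prepare(t));
        }
    }

    /** Toy metric: 1.0 when the two strings are equal, otherwise 0.0. */
    static class ExactMatch extends SketchDistance {
        @Override double score(Wrapper s, Wrapper t) {
            return s.unwrap().equals(t.unwrap()) ? 1.0 : 0.0;
        }
        @Override String explainScore(Wrapper s, Wrapper t) {
            return "exact match of '" + s.unwrap() + "' and '"
                + t.unwrap() + "' = " + score(s, t);
        }
    }

    public static void main(String[] args) {
        SketchDistance d = new ExactMatch();
        System.out.println(d.score("cohen", "cohen"));
        System.out.println(d.explainScore("cohen", "cohn"));
    }
}
```

Because the String overloads are final, a subclass in this sketch can only change behavior through prepare and the two wrapper-based methods, which keeps scoring and explanation consistent with each other.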

prepare

public StringWrapper prepare(java.lang.String s)
Default way to preprocess a string for distance computation. If preprocessing is an expensive operation, override this method to return a StringWrapper implementation that caches the appropriate information about s.

Specified by:
prepare in interface StringDistance
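
One way to picture the caching advice is a wrapper that tokenizes its string once, so that repeated comparisons against the same string reuse the token set. This is a sketch under stated assumptions: Wrapper, SketchDistance, TokenSetWrapper, and TokenOverlap are hypothetical names, and the overlap score is a toy Jaccard-style ratio, not the library's Jaccard class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins; these are NOT the SecondString classes.
class PrepareSketch {

    /** Stand-in for StringWrapper. */
    static class Wrapper {
        private final String s;
        Wrapper(String s) { this.s = s; }
        String unwrap() { return s; }
    }

    /** Wrapper that tokenizes once and caches the resulting token set. */
    static class TokenSetWrapper extends Wrapper {
        final Set<String> tokens = new HashSet<>();
        TokenSetWrapper(String s) {
            super(s);
            String trimmed = s.trim().toLowerCase();
            if (!trimmed.isEmpty()) {
                tokens.addAll(Arrays.asList(trimmed.split("\\s+")));
            }
        }
    }

    /** Stand-in base class with the default prepare(). */
    static abstract class SketchDistance {
        Wrapper prepare(String s) { return new Wrapper(s); }
        abstract double score(Wrapper s, Wrapper t);
    }

    /** Toy metric: shared tokens divided by all distinct tokens. */
    static class TokenOverlap extends SketchDistance {
        // Override prepare() so the expensive step runs once per string.
        @Override Wrapper prepare(String s) { return new TokenSetWrapper(s); }

        @Override double score(Wrapper s, Wrapper t) {
            Set<String> a = ((TokenSetWrapper) s).tokens;
            Set<String> b = ((TokenSetWrapper) t).tokens;
            if (a.isEmpty() && b.isEmpty()) return 1.0;
            Set<String> common = new HashSet<>(a);
            common.retainAll(b);
            Set<String> union = new HashSet<>(a);
            union.addAll(b);
            return (double) common.size() / union.size();
        }
    }

    public static void main(String[] args) {
        TokenOverlap d = new TokenOverlap();
        Wrapper query = d.prepare("William W. Cohen"); // tokenized once
        String[] candidates = {"william cohen", "W. Cohen", "Jaro Winkler"};
        for (String c : candidates) {
            System.out.println(c + " -> " + d.score(query, d.prepare(c)));
        }
    }
}
```

Preparing the query a single time and passing the wrapper to score avoids re-tokenizing it for every candidate, which is the kind of saving the method description points to.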

accumulateStatistics

public void accumulateStatistics(java.util.Iterator i)
Default way to accumulate statistics for a set of related strings. This is for distance metrics, such as TFIDF, that use statistics gathered from unlabeled strings to adjust the metric. Override this method if the metric needs to accumulate statistics.

Specified by:
accumulateStatistics in interface StringDistance
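
As an illustration of the kind of statistics a TFIDF-style metric might gather, the sketch below counts in how many strings each token appears and derives an inverse document frequency from those counts. Several things here are assumptions rather than facts about the library: the iterator is taken to yield plain Strings, the default implementation is taken to do nothing, and the names SketchDistance, DocFrequencyDistance, and idf are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins; these are NOT the SecondString classes.
class StatisticsSketch {

    /** Stand-in base class. */
    static class SketchDistance {
        // Assumed default: metrics without corpus statistics do nothing.
        void accumulateStatistics(Iterator<String> i) { }
    }

    /** Collects per-token document frequencies over a set of strings. */
    static class DocFrequencyDistance extends SketchDistance {
        final Map<String, Integer> documentFrequency = new HashMap<>();
        int collectionSize = 0;

        @Override void accumulateStatistics(Iterator<String> i) {
            while (i.hasNext()) {
                String s = i.next();
                collectionSize++;
                // Use a set so each token counts once per string.
                Set<String> seen =
                    new HashSet<>(Arrays.asList(s.toLowerCase().split("\\s+")));
                for (String tok : seen) {
                    if (!tok.isEmpty()) {
                        documentFrequency.merge(tok, 1, Integer::sum);
                    }
                }
            }
        }

        /** Inverse document frequency; 0.0 for tokens never seen. */
        double idf(String token) {
            Integer df = documentFrequency.get(token.toLowerCase());
            return df == null ? 0.0 : Math.log((double) collectionSize / df);
        }
    }

    public static void main(String[] args) {
        DocFrequencyDistance d = new DocFrequencyDistance();
        d.accumulateStatistics(
            Arrays.asList("Acme Corp", "Widget Corp", "Acme Acme Ltd").iterator());
        System.out.println("idf(corp)   = " + d.idf("corp"));
        System.out.println("idf(widget) = " + d.idf("widget"));
    }
}
```

In this sketch a token that appears in many strings, such as "corp", receives a lower weight than a rarer token such as "widget", which is how unlabeled strings can adjust a distance metric.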

doMain

protected static final void doMain(StringDistance d,
                                   java.lang.String[] argv)
Default main routine for testing.