com.wcohen.ss
Class AbstractStringDistance

java.lang.Object
  extended by com.wcohen.ss.AbstractStringDistance
All Implemented Interfaces:
StringDistance, StringDistanceLearner
Direct Known Subclasses:
AbstractTokenizedStringDistance, AffineGap, ApproxNeedlemanWunsch, Jaccard, Jaro, NeedlemanWunsch, SmithWaterman, TagLinkToken, WinklerRescorer

public abstract class AbstractStringDistance
extends java.lang.Object
implements StringDistance, StringDistanceLearner

Abstract class which implements StringDistanceLearner as well as StringDistance. The abstract class provides default implementations of most of the StringDistanceLearner functions, making it easy to implement StringDistances which do little or no learning.


Constructor Summary
AbstractStringDistance()
           
 
Method Summary
 void addExample(DistanceInstance answeredQuery)
          Implements StringDistanceLearner api by accepting new DistanceInstance labels.
protected static void doMain(StringDistance d, java.lang.String[] argv)
          Default main routine for testing
 java.lang.String explainScore(java.lang.String s, java.lang.String t)
          Scores are explained by converting Strings to StringWrappers with the prepare function.
abstract  java.lang.String explainScore(StringWrapper s, StringWrapper t)
          This method needs to be implemented by subclasses.
 StringDistance getDistance()
          Implements the StringDistanceLearner api by returning a StringDistance.
 boolean hasNextQuery()
          Implements StringDistanceLearner api by informing a teacher if the learner has DistanceInstance queries.
 DistanceInstance nextQuery()
          Implements StringDistanceLearner api by querying for DistanceInstance labels.
 DistanceInstanceIterator prepare(DistanceInstanceIterator i)
          Implements StringDistanceLearner api by providing a way to prep a DistanceInstanceIterator for training.
 StringWrapper prepare(java.lang.String s)
          Default way to preprocess a string for distance computation.
 StringWrapperIterator prepare(StringWrapperIterator i)
          Implements StringDistanceLearner api by providing a way to prep a StringWrapperIterator for training.
 double score(java.lang.String s, java.lang.String t)
          Strings are scored by converting them to StringWrappers with the prepare function.
abstract  double score(StringWrapper s, StringWrapper t)
          This method needs to be implemented by subclasses.
 void setDistanceInstancePool(DistanceInstanceIterator i)
          Implements StringDistanceLearner api by providing a way to accept a pool of unlabeled DistanceInstances.
 void setStringWrapperPool(StringWrapperIterator i)
          Implements the StringDistanceLearner api by providing a way to accumulate statistics for a set of related strings.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractStringDistance

public AbstractStringDistance()
Method Detail

score

public abstract double score(StringWrapper s,
                             StringWrapper t)
This method needs to be implemented by subclasses.

Specified by:
score in interface StringDistance

explainScore

public abstract java.lang.String explainScore(StringWrapper s,
                                              StringWrapper t)
This method needs to be implemented by subclasses.

Specified by:
explainScore in interface StringDistance

score

public final double score(java.lang.String s,
                          java.lang.String t)
Strings are scored by converting them to StringWrappers with the prepare function.

Specified by:
score in interface StringDistance

explainScore

public final java.lang.String explainScore(java.lang.String s,
                                           java.lang.String t)
Scores are explained by converting Strings to StringWrappers with the prepare function.

Specified by:
explainScore in interface StringDistance
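
The relationship between the String overloads and the abstract StringWrapper overloads can be illustrated with a minimal sketch. The types below (SketchWrapper, SketchDistance, ExactMatchSketch) are simplified, hypothetical stand-ins written for this illustration; they are not the library's classes, and the real implementation may differ in detail.

```java
// Simplified stand-ins for illustration only; not the library's actual classes.
interface SketchWrapper {
    String unwrap();
}

abstract class SketchDistance {
    // Subclasses supply the comparison on prepared inputs.
    public abstract double score(SketchWrapper s, SketchWrapper t);

    // Subclasses supply a human-readable account of the score.
    public abstract String explainScore(SketchWrapper s, SketchWrapper t);

    // Default preparation: wrap the raw string without extra work.
    public SketchWrapper prepare(String s) {
        return () -> s;
    }

    // Convenience overload: prepare both strings, then delegate.
    public final double score(String s, String t) {
        return score(prepare(s), prepare(t));
    }

    // Convenience overload: prepare both strings, then delegate.
    public final String explainScore(String s, String t) {
        return explainScore(prepare(s), prepare(t));
    }
}

// Toy metric: 1.0 when the strings are equal, 0.0 otherwise.
class ExactMatchSketch extends SketchDistance {
    @Override
    public double score(SketchWrapper s, SketchWrapper t) {
        return s.unwrap().equals(t.unwrap()) ? 1.0 : 0.0;
    }

    @Override
    public String explainScore(SketchWrapper s, SketchWrapper t) {
        return "'" + s.unwrap() + "' vs '" + t.unwrap() + "': " + score(s, t);
    }
}
```

Because the String overloads are final in this sketch, a subclass only has to implement the wrapper-based methods, and callers holding plain strings get consistent behavior for free.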

prepare

public StringWrapper prepare(java.lang.String s)
Default way to preprocess a string for distance computation. If this is an expensive operation, then override this method to return a StringWrapper implementation that caches appropriate information about s.

Specified by:
prepare in interface StringDistance
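
As a rough illustration of why a caching wrapper helps, the sketch below tokenizes each string once at preparation time and reuses the token set for every later comparison. TokenCachingWrapper and TokenOverlapSketch are hypothetical names invented for this example, and the whitespace tokenization and Jaccard-style overlap are assumptions chosen for brevity rather than anything this class prescribes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical wrapper that tokenizes once and keeps the result.
final class TokenCachingWrapper {
    private final String original;
    private final Set<String> tokens;

    TokenCachingWrapper(String s) {
        this.original = s;
        // The costly step happens once, at preparation time.
        this.tokens = new HashSet<>(Arrays.asList(s.trim().toLowerCase().split("\\s+")));
    }

    String unwrap() {
        return original;
    }

    Set<String> tokens() {
        return tokens;
    }
}

// Hypothetical metric that scores prepared wrappers by token overlap.
final class TokenOverlapSketch {
    TokenCachingWrapper prepare(String s) {
        return new TokenCachingWrapper(s);
    }

    // Size of the intersection divided by size of the union of the cached token sets.
    double score(TokenCachingWrapper s, TokenCachingWrapper t) {
        Set<String> common = new HashSet<>(s.tokens());
        common.retainAll(t.tokens());
        Set<String> all = new HashSet<>(s.tokens());
        all.addAll(t.tokens());
        return all.isEmpty() ? 0.0 : (double) common.size() / all.size();
    }
}
```

A caller comparing one query against many candidates would prepare the query once and pass the same wrapper to each score call, so the tokenization cost is not paid again per comparison.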

setStringWrapperPool

public void setStringWrapperPool(StringWrapperIterator i)
Implements the StringDistanceLearner api by providing a way to accumulate statistics for a set of related strings. This is for distance metrics like TFIDF that use statistics on unlabeled strings to adjust a distance metric. The default is to do nothing; override this method if it's necessary to accumulate statistics.

Specified by:
setStringWrapperPool in interface StringDistanceLearner

setDistanceInstancePool

public void setDistanceInstancePool(DistanceInstanceIterator i)
Implements StringDistanceLearner api by providing a way to accept a pool of unlabeled DistanceInstances. Default is to not use this information.

Specified by:
setDistanceInstancePool in interface StringDistanceLearner

hasNextQuery

public boolean hasNextQuery()
Implements StringDistanceLearner api by informing a teacher if the learner has DistanceInstance queries. Default is to make no queries.

Specified by:
hasNextQuery in interface StringDistanceLearner

nextQuery

public DistanceInstance nextQuery()
Implements StringDistanceLearner api by querying for DistanceInstance labels.

Specified by:
nextQuery in interface StringDistanceLearner

addExample

public void addExample(DistanceInstance answeredQuery)
Implements StringDistanceLearner api by accepting new DistanceInstance labels.

Specified by:
addExample in interface StringDistanceLearner

prepare

public StringWrapperIterator prepare(StringWrapperIterator i)
Implements StringDistanceLearner api by providing a way to prep a StringWrapperIterator for training. By default this makes no changes to the iterator.

Specified by:
prepare in interface StringDistanceLearner

prepare

public DistanceInstanceIterator prepare(DistanceInstanceIterator i)
Implements StringDistanceLearner api by providing a way to prep a DistanceInstanceIterator for training. By default this makes no changes to the iterator.

Specified by:
prepare in interface StringDistanceLearner

getDistance

public StringDistance getDistance()
Implements the StringDistanceLearner api by returning a StringDistance. By default, returns this object, which also implements StringDistance.

Specified by:
getDistance in interface StringDistanceLearner
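
The effect of these learner defaults can be seen in a minimal sketch of a teacher driving a learner that does no learning. Everything here is a hypothetical stand-in: SketchScorer, NonLearningSketch, and SketchTeacher are invented for this example, plain strings replace DistanceInstance, and returning null from nextQuery is an assumption, since this page does not state that method's default.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for a trained distance.
interface SketchScorer {
    double score(String s, String t);
}

// Hypothetical learner whose defaults mirror the "do nothing" behavior described above.
abstract class NonLearningSketch implements SketchScorer {
    // Unlabeled pool is ignored by default.
    public void setPool(Iterator<String> pool) {
    }

    // By default the learner never asks the teacher anything.
    public boolean hasNextQuery() {
        return false;
    }

    // Assumption for this sketch: nothing to ask, so return null.
    public String nextQuery() {
        return null;
    }

    // Labeled answers are ignored by default.
    public void addExample(String answeredQuery) {
    }

    // The learner is already a usable distance, so it returns itself.
    public SketchScorer getDistance() {
        return this;
    }
}

// Hypothetical teacher that runs the query/answer protocol.
final class SketchTeacher {
    static SketchScorer train(NonLearningSketch learner, Iterator<String> pool) {
        learner.setPool(pool);
        while (learner.hasNextQuery()) {
            learner.addExample(learner.nextQuery());
        }
        return learner.getDistance();
    }
}

// Toy learner: scores by absolute length difference.
final class LengthGapSketch extends NonLearningSketch {
    @Override
    public double score(String s, String t) {
        return Math.abs(s.length() - t.length());
    }
}
```

With these defaults the training loop exits immediately and the teacher gets back the same object it passed in, which is why a metric with no trainable parameters can still be used wherever a learner is expected.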

doMain

protected static final void doMain(StringDistance d,
                                   java.lang.String[] argv)
Default main routine for testing