com.wcohen.secondstring
Class TokenFelligiSunter

java.lang.Object
  |
  +--com.wcohen.secondstring.AbstractStringDistance
        |
        +--com.wcohen.secondstring.AbstractStatisticalTokenDistance
              |
              +--com.wcohen.secondstring.TokenFelligiSunter
All Implemented Interfaces:
StringDistance

public class TokenFelligiSunter
extends AbstractStatisticalTokenDistance

Highly simplified model of Fellegi-Sunter's method 1, applied to tokens.
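
As a rough illustration of the idea (not the library's actual implementation), a token-level score in this style can reward tokens that occur in both strings and penalize tokens of the first string that are missing from the second, with the penalty scaled by a mismatch factor. The sketch below assumes whitespace tokenization and a per-token weight of -log(df/N); the class FelligiSunterSketch, its methods, and the weighting formula are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, self-contained sketch; not the SecondString implementation.
class FelligiSunterSketch {

    // Assumed weight: tokens with a low document frequency count for more.
    static double weight(String token, Map<String, Integer> docFreq, int collectionSize) {
        int df = docFreq.getOrDefault(token, 1); // unseen tokens are treated as df = 1
        return -Math.log((double) df / collectionSize);
    }

    // Add the weight of each token of s that also occurs in t;
    // subtract mismatchFactor * weight for each token of s missing from t.
    static double score(String s, String t, Map<String, Integer> docFreq,
                        int collectionSize, double mismatchFactor) {
        Set<String> sTokens = tokens(s);
        Set<String> tTokens = tokens(t);
        double sim = 0.0;
        for (String tok : sTokens) {
            double w = weight(tok, docFreq, collectionSize);
            sim += tTokens.contains(tok) ? w : -mismatchFactor * w;
        }
        return sim;
    }

    // Assumed tokenizer: lower-case the input and split on whitespace.
    static Set<String> tokens(String s) {
        return new HashSet<>(Arrays.asList(s.trim().toLowerCase().split("\\s+")));
    }
}
```

Under these assumptions, a larger mismatch factor lowers the score of pairs whose first string contains rare tokens absent from the second, while identical strings are unaffected by it.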


Field Summary
 
Fields inherited from class com.wcohen.secondstring.AbstractStatisticalTokenDistance
collectionSize, documentFrequency, tokenizer, totalTokenCount
 
Constructor Summary
TokenFelligiSunter()
           
TokenFelligiSunter(Tokenizer tokenizer, double mismatchFactor)
           
 
Method Summary
 java.lang.String explainScore(StringWrapper s, StringWrapper t)
          Explain how the distance was computed.
static void main(java.lang.String[] argv)
           
 StringWrapper prepare(java.lang.String s)
          Preprocess a string by finding tokens and giving them appropriate weights.
 double score(StringWrapper s, StringWrapper t)
          Returns the score for two prepared strings, implementing the abstract method declared in AbstractStringDistance.
 void setMismatchFactor(double d)
           
 void setMismatchFactor(java.lang.Double d)
           
 java.lang.String toString()
           
 
Methods inherited from class com.wcohen.secondstring.AbstractStatisticalTokenDistance
accumulateStatistics, getDocumentFrequency
 
Methods inherited from class com.wcohen.secondstring.AbstractStringDistance
doMain, explainScore, score
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TokenFelligiSunter

public TokenFelligiSunter(Tokenizer tokenizer,
                          double mismatchFactor)

TokenFelligiSunter

public TokenFelligiSunter()
Method Detail

setMismatchFactor

public void setMismatchFactor(double d)

setMismatchFactor

public void setMismatchFactor(java.lang.Double d)

score

public double score(StringWrapper s,
                    StringWrapper t)
Description copied from class: AbstractStringDistance
This method needs to be implemented by subclasses.

Specified by:
score in interface StringDistance
Specified by:
score in class AbstractStringDistance

prepare

public StringWrapper prepare(java.lang.String s)
Preprocess a string by finding tokens and giving them appropriate weights.

Specified by:
prepare in interface StringDistance
Overrides:
prepare in class AbstractStringDistance
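
To make "finding tokens and giving them appropriate weights" concrete, the following sketch gathers document frequencies from a small collection and then assigns each token of a new string a weight. It is only a guess at the general shape: the class TokenWeightSketch, the whitespace tokenizer, the -log(df/N) weight, and the constant fallback used when no statistics are available are assumptions, not the documented behavior of prepare. The field names mirror the inherited documentFrequency and collectionSize fields purely for readability.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of statistics gathering followed by token weighting.
class TokenWeightSketch {
    private final Map<String, Integer> documentFrequency = new HashMap<>();
    private int collectionSize = 0;

    // Count, for each token, the number of strings in the corpus that contain it.
    void accumulate(List<String> corpus) {
        for (String doc : corpus) {
            collectionSize++;
            for (String tok : tokenize(doc)) {
                documentFrequency.merge(tok, 1, Integer::sum);
            }
        }
    }

    // Map each distinct token of s to an assumed weight of -log(df / N).
    Map<String, Double> prepare(String s) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (String tok : tokenize(s)) {
            int df = documentFrequency.getOrDefault(tok, 1); // unseen token: df = 1
            double w = collectionSize > 0
                    ? -Math.log((double) df / collectionSize)
                    : 1.0; // assumed constant when no statistics were accumulated
            weights.put(tok, w);
        }
        return weights;
    }

    // Distinct lower-cased tokens in order of first appearance.
    private static Set<String> tokenize(String s) {
        return new LinkedHashSet<>(Arrays.asList(s.trim().toLowerCase().split("\\s+")));
    }
}
```

With this weighting, a token that appears in every string of the collection gets weight 0, and a token never seen before gets the largest weight, log N.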

explainScore

public java.lang.String explainScore(StringWrapper s,
                                     StringWrapper t)
Explain how the distance was computed. In the output, the tokens in S and T are listed, and the common tokens are marked with an asterisk.

Specified by:
explainScore in interface StringDistance
Specified by:
explainScore in class AbstractStringDistance
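
The kind of listing described above, with common tokens marked by an asterisk, might be produced along the following lines. The exact output layout of explainScore is not specified here, so the "S:" and "T:" labels, the spacing, and the class ExplainSketch are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of an explanation string; the real format may differ.
class ExplainSketch {

    // List the tokens of s and t, appending '*' to tokens found in both.
    static String explain(String s, String t) {
        Set<String> sTokens = tokenize(s);
        Set<String> tTokens = tokenize(t);
        return "S: " + mark(sTokens, tTokens) + "\nT: " + mark(tTokens, sTokens);
    }

    // Join the tokens of 'own' with spaces, starring those also present in 'other'.
    private static String mark(Set<String> own, Set<String> other) {
        StringBuilder sb = new StringBuilder();
        for (String tok : own) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(tok);
            if (other.contains(tok)) {
                sb.append('*');
            }
        }
        return sb.toString();
    }

    // Distinct lower-cased tokens in order of first appearance.
    private static Set<String> tokenize(String s) {
        return new LinkedHashSet<>(Arrays.asList(s.trim().toLowerCase().split("\\s+")));
    }
}
```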

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

main

public static void main(java.lang.String[] argv)