com.wcohen.ss.expt
Class TokenBlocker

java.lang.Object
  extended by com.wcohen.ss.expt.Blocker
      extended by com.wcohen.ss.expt.TokenBlocker
Direct Known Subclasses:
ClusterTokenBlocker, NGramBlocker

public class TokenBlocker
extends Blocker

Finds all candidate pairs that share at least one token that is not too common.
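The idea of pairing items through a shared, not-too-common token can be sketched with an inverted index. The example below is a minimal, self-contained illustration and not the TokenBlocker implementation. The class name TokenBlockingSketch, the whitespace tokenization, and the reading of maxFraction as "skip any token that occurs in more than this fraction of the strings" are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of token blocking; not the SecondString code. */
final class TokenBlockingSketch {

    /**
     * Returns index pairs {i, j} with i < j for strings that share at least
     * one token occurring in at most maxFraction of all strings (assumed rule).
     */
    static List<int[]> candidatePairs(List<String> strings, double maxFraction) {
        // Inverted index: token -> indices of the strings that contain it.
        Map<String, Set<Integer>> index = new LinkedHashMap<>();
        for (int i = 0; i < strings.size(); i++) {
            // Assumed tokenization: lower-case, then split on whitespace.
            for (String token : strings.get(i).toLowerCase().trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, t -> new LinkedHashSet<>()).add(i);
                }
            }
        }

        double maxCount = maxFraction * strings.size();
        Set<Long> seen = new LinkedHashSet<>();   // guards against duplicate pairs
        List<int[]> pairs = new ArrayList<>();
        for (Set<Integer> postings : index.values()) {
            if (postings.size() > maxCount) {
                continue;                         // token is too common, skip it
            }
            List<Integer> ids = new ArrayList<>(postings); // ascending by construction
            for (int a = 0; a < ids.size(); a++) {
                for (int b = a + 1; b < ids.size(); b++) {
                    int i = ids.get(a);
                    int j = ids.get(b);
                    long key = (long) i * strings.size() + j;
                    if (seen.add(key)) {
                        pairs.add(new int[] {i, j});
                    }
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("acme corp", "acme inc", "widget corp", "zeta corp");
        for (int[] p : candidatePairs(names, 0.5)) {
            System.out.println(names.get(p[0]) + " <-> " + names.get(p[1]));
        }
    }
}
```

In this sketch, lowering maxFraction discards frequent tokens such as "corp" and so shrinks the candidate set, while a value of 1.0 keeps every token.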


Nested Class Summary
 
Nested classes/interfaces inherited from class com.wcohen.ss.expt.Blocker
Blocker.Pair
 
Field Summary
protected  Tokenizer tokenizer
           
 
Fields inherited from class com.wcohen.ss.expt.Blocker
clusterMode
 
Constructor Summary
TokenBlocker()
           
TokenBlocker(Tokenizer tokenizer, double maxFraction)
           
 
Method Summary
 void block(MatchData data)
          Load the match data and prepare it for producing candidate pairs.
 double getMaxFraction()
           
 Blocker.Pair getPair(int i)
          Get the i-th candidate pair produced by the most recent call to block().
 int numCorrectPairs()
          Return total number of correct pairs in the dataset.
 void setMaxFraction(double maxFraction)
           
 int size()
          Return the number of candidate pairs produced by the most recent call to block().
 java.lang.String toString()
           
 
Methods inherited from class com.wcohen.ss.expt.Blocker
countCorrectPairs, setClusterMode, setClusterMode
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

tokenizer

protected Tokenizer tokenizer
Constructor Detail

TokenBlocker

public TokenBlocker(Tokenizer tokenizer,
                    double maxFraction)

TokenBlocker

public TokenBlocker()
Method Detail

getMaxFraction

public double getMaxFraction()

setMaxFraction

public void setMaxFraction(double maxFraction)

block

public void block(MatchData data)
Description copied from class: Blocker
Load the match data and prepare it for producing candidate pairs.

Specified by:
block in class Blocker

size

public int size()
Description copied from class: Blocker
Return the number of candidate pairs produced by the most recent call to block().

Specified by:
size in class Blocker

getPair

public Blocker.Pair getPair(int i)
Description copied from class: Blocker
Get the i-th candidate pair produced by the most recent call to block().

Specified by:
getPair in class Blocker
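
The descriptions of block(), size() and getPair(int) suggest an index-based access pattern: call block() once, then read pairs 0 through size() - 1, with each new block() call replacing the earlier candidates. The sketch below shows that pattern with a hypothetical stand-in class, IndexedPairsSketch, which only mirrors the method names. It takes a list of strings instead of MatchData, returns String[] instead of Blocker.Pair, and pairs every two items with no real blocking, so it is an assumption-laden illustration and not the library API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in that mirrors the block()/size()/getPair(int) names. */
final class IndexedPairsSketch {
    private final List<String[]> pairs = new ArrayList<>();

    /** Discards earlier candidates and pairs every two distinct items. */
    void block(List<String> data) {
        pairs.clear();
        for (int i = 0; i < data.size(); i++) {
            for (int j = i + 1; j < data.size(); j++) {
                pairs.add(new String[] {data.get(i), data.get(j)});
            }
        }
    }

    /** Number of candidate pairs from the most recent block() call. */
    int size() {
        return pairs.size();
    }

    /** The i-th candidate pair from the most recent block() call. */
    String[] getPair(int i) {
        return pairs.get(i);
    }

    public static void main(String[] args) {
        IndexedPairsSketch blocker = new IndexedPairsSketch();
        blocker.block(Arrays.asList("a", "b", "c"));
        // Walk the candidates by index, as size() and getPair(int) imply.
        for (int i = 0; i < blocker.size(); i++) {
            String[] p = blocker.getPair(i);
            System.out.println(p[0] + " , " + p[1]);
        }
    }
}
```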

numCorrectPairs

public int numCorrectPairs()
Description copied from class: Blocker
Return total number of correct pairs in the dataset.

Specified by:
numCorrectPairs in class Blocker

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object