com.wcohen.ss.expt
Class TokenBlocker
java.lang.Object
com.wcohen.ss.expt.Blocker
com.wcohen.ss.expt.TokenBlocker
- Direct Known Subclasses: ClusterTokenBlocker, NGramBlocker
public class TokenBlocker extends Blocker
Finds all pairs that share a not-too-common token.
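The description above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the class name `TokenBlockingSketch`, the method `candidatePairs`, the whitespace tokenization, and the use of a single list of records are all assumptions made for illustration. It also assumes that "not-too-common" means a token is ignored once it occurs in more than `maxFraction` of all records, which is one plausible reading of the `maxFraction` property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of token-based blocking; not part of com.wcohen.ss.
final class TokenBlockingSketch {

    /** Returns pairs "i,j" (i < j) of record indices that share at least one sufficiently rare token. */
    static Set<String> candidatePairs(List<String> records, double maxFraction) {
        // Inverted index: token -> indices of the records that contain it.
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int i = 0; i < records.size(); i++) {
            // Assumption: lowercase whitespace tokenization stands in for a Tokenizer.
            for (String token : records.get(i).toLowerCase().split("\\s+")) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, k -> new TreeSet<>()).add(i);
                }
            }
        }

        // Assumption: a token is "too common" when it occurs in more than
        // maxFraction of all records; such tokens generate no pairs.
        double maxPostings = records.size() * maxFraction;

        Set<String> pairs = new TreeSet<>();
        for (Set<Integer> postings : index.values()) {
            if (postings.size() > maxPostings) {
                continue;
            }
            List<Integer> ids = new ArrayList<>(postings);
            for (int a = 0; a < ids.size(); a++) {
                for (int b = a + 1; b < ids.size(); b++) {
                    pairs.add(ids.get(a) + "," + ids.get(b));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
            "acme corp new york", "acme corporation", "widget corp boston", "zenith corp");
        System.out.println(candidatePairs(records, 0.5));
        System.out.println(candidatePairs(records, 1.0));
    }
}
```

In this example, "corp" occurs in three of the four records, so with a threshold of 0.5 it is skipped and only records 0 and 1 (which share "acme") are paired. With a threshold of 1.0 no token is excluded and the three "corp" records are paired with each other as well.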
Method Summary
Return type | Method | Description
void | block(MatchData data) | Load matchdata and prepare it for production of candidate pairs.
double | getMaxFraction() |
Blocker.Pair | getPair(int i) | Get the i-th candidate pair, as produced from most recently block()-ed data
int | numCorrectPairs() | Return total number of correct pairs in the dataset.
void | setMaxFraction(double maxFraction) |
int | size() | Return number of candidate pairs, as produced from most recently block()-ed data
java.lang.String | toString() |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
tokenizer
protected Tokenizer tokenizer
Constructor Detail
TokenBlocker
public TokenBlocker(Tokenizer tokenizer, double maxFraction)
TokenBlocker
public TokenBlocker()
Method Detail
getMaxFraction
public double getMaxFraction()
setMaxFraction
public void setMaxFraction(double maxFraction)
block
public void block(MatchData data)
- Description copied from class: Blocker
- Load matchdata and prepare it for production of candidate pairs.
- Specified by: block in class Blocker
size
public int size()
- Description copied from class: Blocker
- Return number of candidate pairs, as produced from most recently block()-ed data
- Specified by: size in class Blocker
getPair
public Blocker.Pair getPair(int i)
- Description copied from class: Blocker
- Get the i-th candidate pair, as produced from most recently block()-ed data
- Specified by: getPair in class Blocker
numCorrectPairs
public int numCorrectPairs()
- Description copied from class: Blocker
- Return total number of correct pairs in the dataset.
- Specified by: numCorrectPairs in class Blocker
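A count of all correct pairs is typically what one needs to judge how many true matches survive blocking. The sketch below shows that calculation in isolation. It is only an assumption about how such a count might be used, not code from the library: `BlockingRecallSketch`, `pairRecall`, and the string encoding of pairs are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of blocking recall; not part of com.wcohen.ss.
final class BlockingRecallSketch {

    /** Fraction of the known-correct pairs that also appear among the candidate pairs. */
    static double pairRecall(Set<String> candidates, Set<String> correct) {
        if (correct.isEmpty()) {
            return 1.0; // Convention chosen here: nothing to find counts as full recall.
        }
        Set<String> found = new HashSet<>(correct);
        found.retainAll(candidates); // keep only correct pairs that were proposed
        return (double) found.size() / correct.size();
    }

    public static void main(String[] args) {
        Set<String> candidates = new HashSet<>(Arrays.asList("0,1", "0,2", "2,3"));
        Set<String> correct = new HashSet<>(Arrays.asList("0,1", "1,3"));
        System.out.println(pairRecall(candidates, correct));
    }
}
```

Here one of the two correct pairs is among the candidates, so the recall is one half. A stricter threshold on common tokens usually shrinks the candidate set at the cost of lower recall.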
toString
public java.lang.String toString()
- Overrides: toString in class java.lang.Object