java - Custom word count using Hadoop
I am a beginner in Hadoop and I have understood the WordCount program. Now I have a problem: I do not want to output counts for all the words, only for the ones I list.
- word_i_want.txt -
hello
echo
raj
- Text.txt -
hello everyone I want hello and echo count
The output should be:
hello 2
echo 1
raj 0
That was just an example; my actual data is much larger.
In the WordCount example, the mapper outputs each token from the input value along with the number 1:
    while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
    }
If you only want to count certain words, wouldn't you want your Mapper to output only the words that match your list?
    while (tokenizer.hasMoreTokens()) {
        String token = tokenizer.nextToken();
        // Emit a count only for tokens that are on the wanted-word list.
        if (wordsThatICareAbout.contains(token)) {
            word.set(token);
            output.collect(word, one);
        }
    }
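To try the filtering idea without a cluster, here is a minimal plain-Java sketch of the same logic. It is not Hadoop code: the class name WantedWordCount, the method count, and the sample text are made up for illustration, and the map and reduce steps are folded into one loop. It also seeds every wanted word with 0, because in a real job a word the mapper never emits will not reach the reducer, so a line like "raj 0" would have to be added separately.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-alone illustration; not part of any Hadoop API.
class WantedWordCount {

    // Counts only the words in `wanted`; every other token is ignored.
    static Map<String, Integer> count(List<String> wanted, String text) {
        // Seed each wanted word with 0 so absent words still show up in the result.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : wanted) {
            counts.put(w, 0);
        }

        // "Map" step: split on whitespace, as the WordCount mapper does.
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            // Filter: only listed tokens contribute (matching is case-sensitive).
            if (counts.containsKey(token)) {
                // "Reduce" step folded in: add 1 to the running total for this word.
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> wanted = Arrays.asList("hello", "echo", "raj");
        String text = "hello everyone I want hello and echo count"; // assumed sample input
        count(wanted, text).forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

Because the lookup is case-sensitive, "Hello" and "hello" count as different words here; lower-casing each token before the check would merge them if that is what you want.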