FastNetMon

Tuesday, 20 October 2009

MapReduce, easy and simple?

Here is the link to this gem: http://discoproject.org/index.html

And here is a code example that uses it:

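    # Python 2 syntax (print statement, dict.iteritems).
    from disco.core import Disco, result_iterator

    # Map: emit a (word, 1) pair for every word in the input entry.
    def fun_map(e, params):
        return [(w, 1) for w in e.split()]

    # Reduce: sum the counts per word and write the totals to the output.
    def fun_reduce(iter, out, params):
        s = {}
        for w, f in iter:
            s[w] = s.get(w, 0) + int(f)
        for w, f in s.iteritems():
            out.add(w, f)

    # Submit the job to the Disco master and block until it finishes.
    results = Disco("disco://localhost").new_job(
        name = "wordcount",
        input = ["http://discoproject.org/chekhov.txt"],
        map = fun_map,
        reduce = fun_reduce).wait()

    for word, frequency in result_iterator(results):
        print word, frequency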

This is a fully working Disco script that computes word frequencies in a text corpus. Disco automatically distributes the script across a cluster, so it can use all available CPUs in parallel. For details, see the Disco tutorial.
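To see what the map and reduce steps compute without setting up a cluster, the same word count can be sketched in plain Python on a single machine. This is only an illustration and does not use Disco at all: the helper `run_local` and the two sample lines are made up for this example, and the function signatures are simplified compared to the ones Disco expects.

```python
# A single-process sketch of the map/reduce word count.
# Assumption: run_local and the sample input are hypothetical,
# they are not part of Disco.

def fun_map(line):
    # Map: emit a (word, 1) pair for every word in the line.
    return [(w, 1) for w in line.split()]

def fun_reduce(pairs):
    # Reduce: sum the counts for each word.
    counts = {}
    for w, f in pairs:
        counts[w] = counts.get(w, 0) + int(f)
    return counts

def run_local(lines):
    # Apply the map step to every line, then reduce all pairs at once.
    pairs = []
    for line in lines:
        pairs.extend(fun_map(line))
    return fun_reduce(pairs)

counts = run_local(["the cat sat", "the dog sat"])
for word in sorted(counts):
    print(word, counts[word])
```

In a real cluster, the map calls would run on different nodes over different chunks of the input, and the framework would route the intermediate pairs to the reduce step. Here everything runs in one loop.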

