rohitjoshi edited this page Sep 9, 2012 · 16 revisions

lua-mapreduce is a fast and easy MapReduce implementation for Lua, inspired by other MapReduce implementations, particularly octopy in Python.

It doesn't aim to meet all your distributed computing needs, but its simple approach is amenable to a large proportion of parallelizable tasks. If your code has a for-loop, there's a good chance that you can make it distributed with just a few small changes.
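As a rough illustration of the for-loop claim, the sketch below takes a plain loop that sums squares and splits it into a map step and a reduce step. It runs locally in a single process. The names `map_fn`, `reduce_fn`, and `run_local` are hypothetical and are not part of the lua-mapreduce API.

```lua
-- Original sequential loop: sum of the squares of 1..n.
local function sum_squares_loop(n)
  local total = 0
  for i = 1, n do
    total = total + i * i
  end
  return total
end

-- Hypothetical map step: the loop body, applied to one independent item.
local function map_fn(key, value)
  return key, value * value
end

-- Hypothetical reduce step: combines the mapped results.
local function reduce_fn(values)
  local total = 0
  for _, v in ipairs(values) do
    total = total + v
  end
  return total
end

-- Local stand-in for a distributed run: each map_fn call is independent,
-- so in principle each could be handed to a different client.
local function run_local(tasks)
  local mapped = {}
  for key, value in pairs(tasks) do
    local _, result = map_fn(key, value)
    mapped[#mapped + 1] = result
  end
  return reduce_fn(mapped)
end

local tasks = {}
for i = 1, 10 do
  tasks[i] = i
end

print(sum_squares_loop(10), run_local(tasks)) -- both values are 385
```

The point of the split is that the map step no longer depends on a shared accumulator, which is what makes the work easy to distribute.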

It uses the following Lua modules:

  1. luasocket: TCP client-server connectivity
  2. copas: Coroutine Oriented Portable Asynchronous Services for Lua
  3. lualogging
  4. serialize (included in this project)
  5. luafilesystem: used only by the task-file example to read files from disk. The lua-mapreduce client and server do not depend on this module.
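To give a sense of what a table serialization module does when tasks have to travel over TCP, here is a minimal sketch. It is not the project's `serialize` implementation; the function names and the chosen format (a Lua table literal) are assumptions made for illustration.

```lua
-- Hypothetical serializer: turns numbers, booleans, strings, and nested
-- tables into a Lua literal. It does not handle cycles, functions, or
-- special float values such as inf and nan.
local function serialize(value)
  local t = type(value)
  if t == "number" or t == "boolean" then
    return tostring(value)
  elseif t == "string" then
    return string.format("%q", value)
  elseif t == "table" then
    local parts = {}
    for k, v in pairs(value) do
      parts[#parts + 1] = "[" .. serialize(k) .. "]=" .. serialize(v)
    end
    return "{" .. table.concat(parts, ",") .. "}"
  end
  error("cannot serialize a " .. t)
end

-- Hypothetical deserializer: evaluates the literal. This executes the text
-- as Lua code, so it is only suitable for trusted peers.
local function deserialize(text)
  local loader = loadstring or load -- loadstring in Lua 5.1, load in 5.2+
  local chunk = assert(loader("return " .. text))
  return chunk()
end

local task = { id = 7, file = "utils.lua" }
local copy = deserialize(serialize(task))
print(copy.id, copy.file)
```

A literal-based format keeps the sketch short; a real implementation may well use a different wire format.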

Directory structure:

  1. lua-mapreduce-server.lua : The MapReduce server. It accepts connections from clients, sends them the task-file, and then sends them tasks on which to run the map/reduce functions.

  2. lua-mapreduce-client.lua : The MapReduce client. It connects to the server, receives tasks, and executes the map/reduce functions defined in the task-file.

  3. utils/utils.lua : Provides utility functions

  4. utils/serialize.lua : Provides table serialization functionality

  5. example/task-file.lua : Example task-file that counts words in all .lua files in the current directory. More details on how to create a task-file are given on the word-count example page of the wiki.
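The word-count idea behind the example task-file can be sketched as follows. This is not the contents of example/task-file.lua: the function names are hypothetical, and in-memory strings stand in for the .lua files that the real example reads from disk. The actual task-file interface is described on the word-count example page of the wiki.

```lua
-- Hypothetical map step: count the words in one chunk of text.
local function map_words(text)
  local counts = {}
  for word in text:gmatch("%a+") do
    word = word:lower()
    counts[word] = (counts[word] or 0) + 1
  end
  return counts
end

-- Hypothetical reduce step: merge the per-chunk counts into totals.
local function reduce_counts(partials)
  local totals = {}
  for _, counts in ipairs(partials) do
    for word, n in pairs(counts) do
      totals[word] = (totals[word] or 0) + n
    end
  end
  return totals
end

-- In-memory stand-ins for file contents.
local sources = {
  "local x = 1",
  "local y = x",
}

-- Each map_words call is independent, so each source could be a separate task.
local partials = {}
for i, text in ipairs(sources) do
  partials[i] = map_words(text)
end

local totals = reduce_counts(partials)
print(totals["local"], totals.x, totals.y) -- 2, 2, 1
```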

Example:

  1. Start Server: lua-mapreduce-server.lua -t task-file.lua [-s server-ip -p port -l loglevel]

  2. Start Client: lua-mapreduce-client.lua [-s server-ip -p port -l loglevel]

Todo

  1. Add support for handling failed tasks. Currently, if a client disconnects, the task it was handling is lost.
  2. Add more example task-files
  3. Possibly integrate with Apache Mesos