java - Running a standalone Hadoop application on multiple CPU cores
My team built a Java application using the Hadoop libraries to transform a bunch of input files into useful output. Given the current load, a single multicore server will do fine for the next year or so. We do not (yet) need a multi-node Hadoop cluster, but we chose to start this project "being prepared".
When I run this application from the command line (or in Eclipse or NetBeans), I have not been able to convince it to run more than one map and/or reduce thread at a time. Given that the tool is very CPU intensive, this "single-threadedness" is my current bottleneck.
When I run it in the NetBeans profiler, I see that the app starts several threads for various purposes, but only a single map/reduce is running at any one moment.
The input data consists of several input files, so Hadoop should at least be able to run one thread per input file at the same time during the map phase.
What do I do to get at least 2 or even 4 active threads running (which should be possible for most of this application's processing time)?
I expect this to be something very silly that I've overlooked.
I've found it: the feature I was looking for was implemented in Hadoop 0.21, which introduces the flag mapreduce.local.map.tasks.maximum to control it.
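In case it helps anyone else, here is a minimal sketch of how that flag could be set per job, assuming Hadoop 0.21 or later with the new mapreduce API (the class name and the value 4 are my own illustration, not from the original post):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ParallelLocalJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumption: the 0.21+ LocalJobRunner honors this key and will
        // run up to 4 map tasks concurrently instead of one at a time.
        conf.setInt("mapreduce.local.map.tasks.maximum", 4);
        Job job = new Job(conf, "parallel local maps");
        // ... configure the Mapper, Reducer, input and output paths as usual ...
        // System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```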
I also found a description of the solution there.
I'm not sure if I'm correct, but when you are running tasks in local mode, you can't have multiple mappers/reducers.
Anyway, to set the maximum number of mappers and reducers that may run at once, use the configuration options mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum. By default, those options are set to 2, so I may be right.
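For illustration, a minimal sketch of those two keys and their defaults (my own untested example; in a real setup they are read by the TaskTracker daemon at startup, so they normally belong in conf/mapred-site.xml rather than in per-job code):

```java
import org.apache.hadoop.conf.Configuration;

public class TaskSlotSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Maximum number of map tasks one TaskTracker runs concurrently (default: 2).
        conf.setInt("mapred.tasktracker.map.tasks.maximum", 4);
        // Maximum number of reduce tasks one TaskTracker runs concurrently (default: 2).
        conf.setInt("mapred.tasktracker.reduce.tasks.maximum", 4);
        System.out.println("map slots: " + conf.get("mapred.tasktracker.map.tasks.maximum"));
    }
}
```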
Finally, if you want to be prepared for a multi-node cluster, go straight to running this in fully distributed mode, but have all the servers (NameNode, DataNode, TaskTracker, JobTracker, ...) run on the same machine.
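A sketch of what the client side of such a single-machine, fully distributed setup might look like, assuming the NameNode listens on localhost:9000 and the JobTracker on localhost:9001 (common pseudo-distributed defaults; the ports and the class name here are my assumptions):

```java
import org.apache.hadoop.conf.Configuration;

public class PseudoDistributedClient {
    public static Configuration createConf() {
        Configuration conf = new Configuration();
        // Assumed addresses: all daemons run on this one machine.
        conf.set("fs.default.name", "hdfs://localhost:9000"); // NameNode
        conf.set("mapred.job.tracker", "localhost:9001");     // JobTracker
        return conf;
    }
}
```

This way the job goes through the real JobTracker/TaskTracker path, where the *.tasks.maximum slot settings apply, instead of the single-threaded LocalJobRunner.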