Description
I want to compress some files that are already in HDFS using different compression levels.
To do so, I wrote the following program:
Compress.java
import ...
import com.hadoop.compression.lzo.LzoCodec;

public class Compress {
    // Reducer that writes each input line back out unchanged, with an empty value.
    public static class VoidReducer extends Reducer<LongWritable, Text, Text, Text> {
        @Override
        public void reduce(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            for (Text value : values) {
                context.write(value, new Text(""));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // args[0] = input path, args[1] = output path, args[2] = compression level
        int level = Integer.parseInt(args[2]);
        conf.setInt("io.compression.codec.lzo.compression.level", level);
        Job job = Job.getInstance(conf);
        job.setJobName("Compresser Job");
        job.setJarByClass(Compress.class);
        job.setMapperClass(Mapper.class); // identity mapper
        job.setReducerClass(VoidReducer.class);
        job.setNumReduceTasks(1);
        TextInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // compress the job output with the LZO codec
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, LzoCodec.class);
        // submit and wait for completion
        job.waitForCompletion(true);
    }
}
Then I run the following commands:
$ javac -classpath $(hadoop classpath) *.java
$ jar -cvf Compress.jar Compress.class
$ hadoop jar Compress.jar Compress file.txt test1 1
$ hadoop jar Compress.jar Compress file.txt test7 7
The file file.txt is 1 GB in size. When I then check the sizes of test1 and test7 with
hdfs dfs -du -s -h, I get 594.6 M for each.
Since levels 1 and 7 produce output of identical size, the compression level setting appears to be ignored.
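For comparison, here is a minimal sketch of the behaviour I would expect from a codec that honours its level setting: the same input compressed at two levels should normally give two different sizes. This is not LZO. It uses the JDK's java.util.zip.Deflater as a stand-in, because the LZO codec is not part of the JDK. The class name LevelCheck, its helper methods, and the generated sample data are all hypothetical and only serve the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

// Stand-in experiment: DEFLATE (not LZO) at two compression levels.
class LevelCheck {
    // Build deterministic, text-like sample data from a small vocabulary.
    static byte[] sampleData(int words) {
        String[] vocab = {"hadoop", "hdfs", "codec", "level", "block", "mapper", "reducer", "text"};
        Random rng = new Random(42);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            sb.append(vocab[rng.nextInt(vocab.length)]).append(rng.nextInt(1000)).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compress the input at the given level and return the number of output bytes.
    static long compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        long total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] data = sampleData(200_000);
        // Same levels as in the two job runs above.
        for (int level : new int[] {1, 7}) {
            System.out.println("level " + level + ": " + compressedSize(data, level) + " bytes");
        }
    }
}
```

With a codec that applies the level, I would expect the two printed sizes to differ, which is the difference I was expecting to see between test1 and test7.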