MapReduce-Execution-on-GCP

Create a Hadoop MapReduce application to find the maximum temperature for each day of the years 1901 and 1902. Your application should read the input from HDFS and store the output to HDFS. When your application completes, merge all the results into one file and store it on the local cluster.
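The final merge step is normally done with `hadoop fs -getmerge`. As a minimal, Hadoop-free sketch of what that merge does, the snippet below concatenates local part files (the file names and directory are made up for illustration) into one output file:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class MergeParts {
    // Concatenate MapReduce part files (e.g. part-r-00000, part-r-00001)
    // into a single file, mimicking what `hadoop fs -getmerge` produces.
    public static void merge(List<Path> parts, Path merged) throws IOException {
        try (var out = Files.newOutputStream(merged,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            for (Path part : parts) {
                Files.copy(part, out); // append each part in order
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Synthetic part files standing in for real reducer output.
        Path dir = Files.createTempDirectory("parts");
        Path p0 = Files.writeString(dir.resolve("part-r-00000"), "19010101\t-33\n");
        Path p1 = Files.writeString(dir.resolve("part-r-00001"), "19020101\t-50\n");
        Path merged = dir.resolve("merged.txt");
        merge(List.of(p0, p1), merged);
        System.out.println(Files.readString(merged));
    }
}
```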

1. Copy of your mapper.py code (or equivalent in another programming language) (25% of total grade)

```java
public static class MaxTempMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999; // NCDC sentinel for a missing reading

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();
        String date = line.substring(15, 23); // yyyyMMdd field of the record

        // Temperature is a signed, fixed-width field in tenths of a degree Celsius;
        // skip an explicit '+' sign before parsing.
        int temp;
        if (line.charAt(87) == '+') {
            temp = Integer.parseInt(line.substring(88, 92));
        } else {
            temp = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);

        // Emit only valid, quality-checked readings.
        if (temp != MISSING && quality.matches("[01459]")) {
            context.write(new Text(date), new IntWritable(temp));
        }
    }
}
```
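The mapper's fixed-width parsing can be exercised outside Hadoop. This is a sketch using the same offsets on a synthetic record (the `ParseDemo` class name and the zero-padded test line are made up; only the field offsets come from the mapper above):

```java
public class ParseDemo {
    static final int MISSING = 9999;

    // Extract "date<TAB>temp" from one fixed-width NCDC-style record,
    // using the same offsets and filters as MaxTempMapper.
    static String parse(String line) {
        String date = line.substring(15, 23); // yyyyMMdd
        int temp = (line.charAt(87) == '+')
                ? Integer.parseInt(line.substring(88, 92))
                : Integer.parseInt(line.substring(87, 92));
        String quality = line.substring(92, 93);
        if (temp == MISSING || !quality.matches("[01459]")) {
            return null; // filtered out, like the mapper's guard
        }
        return date + "\t" + temp;
    }

    public static void main(String[] args) {
        // Synthetic record padded with zeros; only the offsets above matter here.
        StringBuilder sb = new StringBuilder("0".repeat(93));
        sb.replace(15, 23, "19010101"); // date field
        sb.replace(87, 92, "-0033");    // signed temperature, tenths of a degree C
        sb.replace(92, 93, "1");        // quality code
        System.out.println(parse(sb.toString())); // → "19010101\t-33"
    }
}
```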

2. Copy of your reducer.py code (or equivalent in another programming language) (25% of total grade)

```java
public static class MaxTempReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {

        // Fold over all temperatures for this date, keeping the maximum.
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}
```
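The same max-per-key fold can be checked without a cluster. A minimal sketch (the `MaxPerKeyDemo` class and the sample pairs are made up for illustration) that mirrors what the reducer computes over each key's values:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MaxPerKeyDemo {
    // Group (date, temp) pairs and keep the maximum per date —
    // the same fold MaxTempReducer performs over each key's values.
    static Map<String, Integer> maxPerKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> max = new TreeMap<>();
        for (var e : pairs) {
            max.merge(e.getKey(), e.getValue(), Math::max);
        }
        return max;
    }

    public static void main(String[] args) {
        var pairs = List.of(
                Map.entry("19010101", -33),
                Map.entry("19010101", -28),
                Map.entry("19010102", -41));
        System.out.println(maxPerKey(pairs)); // → {19010101=-28, 19010102=-41}
    }
}
```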

3. Screenshot of the execution of Hadoop MapReduce Job in the terminal (25% of total grade)

steps

(screenshots: cluster and job setup steps)

execution

(screenshots: MapReduce job execution in the terminal)

4. Copy of your output file (after merging) containing the results (25% of total grade)

result file

Extra Credit: (+20%) Build a GUI to upload the two data files from your local machine to a GCP bucket automatically without having

For my code, I removed my config JSON file and hid my bucket name.

demo.mp4
