The recommended mechanism for submitting and managing MapReduce (Hadoop) jobs is an Oozie workflow. Oozie is integrated with the rest of the Hadoop stack and supports Hadoop MapReduce jobs natively. We can use the Oozie REST endpoint to submit (execute) a MapReduce job. Of course, we can also use the YARN ResourceManager APIs to manage the submitted application from any other channel, for example to query its status or kill it.
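As a sketch of that last point, the YARN ResourceManager exposes a REST API for application management; a kill, for example, is a PUT to the application's state resource. The host and port below are the HDP sandbox defaults used throughout this post, and the helper names are my own:

```python
import json

# The YARN ResourceManager REST API can report on or kill a submitted
# application; a kill is a PUT to /ws/v1/cluster/apps/{app-id}/state.
RM = "http://sandbox-hdp.hortonworks.com:8088"  # RM REST port on the HDP sandbox

def rm_app_url(app_id):
    # GET on this URL returns the application report, including its state
    return "%s/ws/v1/cluster/apps/%s" % (RM, app_id)

def rm_kill_request(app_id):
    # Returns the (url, json_body) pair for the kill call
    return ("%s/ws/v1/cluster/apps/%s/state" % (RM, app_id),
            json.dumps({"state": "KILLED"}))

url, body = rm_kill_request("application_1547881846177_0008")
print(url)
print(body)
```

The application ID above is illustrative; on a live cluster you would take it from the RM UI or from `yarn application -list`.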
Step 1: Create the directory structure in HDFS:
- hdfs dfs -mkdir -p /user/root/examples/apps/mapreduce/lib
- hdfs dfs -mkdir -p /user/root/examples/input-data/mapreduce (specified by mapred.input.dir)
- hdfs dfs -mkdir -p /user/root/examples/output-data/mapreduce (specified by mapred.output.dir)
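If you prefer REST over the `hdfs` CLI, the same directories can be created through WebHDFS (`op=MKDIRS`). A minimal sketch that only builds the request URLs, assuming WebHDFS is enabled on the sandbox NameNode's default HTTP port 50070:

```python
# Build WebHDFS MKDIRS calls for the directories above.
# Each is an HTTP PUT; actually issuing it needs network access to the cluster.
NN = "http://sandbox-hdp.hortonworks.com:50070"

def mkdirs_url(path, user="root"):
    # WebHDFS paths are rooted under /webhdfs/v1
    return "%s/webhdfs/v1%s?op=MKDIRS&user.name=%s" % (NN, path, user)

for d in ["/user/root/examples/apps/mapreduce/lib",
          "/user/root/examples/input-data/mapreduce",
          "/user/root/examples/output-data/mapreduce"]:
    print("PUT", mkdirs_url(d))
```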
Step 2: Create oozieconfig.xml with the job properties to submit to the Oozie REST endpoint:

<configuration>
    <property>
        <name>user.name</name>
        <value>root</value>
    </property>
    <property>
        <name>jobTracker</name>
        <value>sandbox-hdp.hortonworks.com:8032</value>
    </property>
    <property>
        <name>oozie.wf.application.path</name>
        <value>/user/root/examples/apps/mapreduce</value>
    </property>
    <property>
        <name>queueName</name>
        <value>default</value>
    </property>
    <property>
        <name>nameNode</name>
        <value>hdfs://sandbox-hdp.hortonworks.com:8020</value>
    </property>
    <property>
        <name>applicationName</name>
        <value>testoozie</value>
    </property>
</configuration>
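This configuration document can also be generated programmatically, which helps when the parameters vary per submission. A sketch using only the Python standard library; the property values are the sandbox ones from this post:

```python
import xml.etree.ElementTree as ET

def build_oozie_config(props):
    # Oozie job submission takes a Hadoop-style <configuration> document
    conf = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(conf, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(conf, encoding="unicode")

xml_doc = build_oozie_config({
    "user.name": "root",
    "jobTracker": "sandbox-hdp.hortonworks.com:8032",
    "nameNode": "hdfs://sandbox-hdp.hortonworks.com:8020",
    "queueName": "default",
    "oozie.wf.application.path": "/user/root/examples/apps/mapreduce",
})
print(xml_doc)
```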
NOTE:
The <job-tracker> element in Oozie (and the jobTracker property here) is used to pass the ResourceManager address and does not really represent the old JobTracker. The spec still calls it jobTracker, though it can point to either the JobTracker or the ResourceManager depending on the Hadoop version you are using.
The jobTracker value provided here is the ResourceManager URL, using the host name from the HDP sandbox. The nameNode value is likewise the NameNode URL from the HDP sandbox, for illustration.
Step 3: Create workflow.xml with the map-reduce action:

<workflow-app xmlns="uri:oozie:workflow:0.5" name="map-reduce-wf">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.name</name>
                    <value>map-reduce-wf</value>
                </property>
                <property>
                    <name>mapred.mapper.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.reducer.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapreduce.map.class</name>
                    <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
                </property>
                <property>
                    <name>mapreduce.reduce.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapreduce.combine.class</name>
                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
                </property>
                <property>
                    <name>mapred.output.key.class</name>
                    <value>org.apache.hadoop.io.Text</value>
                </property>
                <property>
                    <name>mapred.output.value.class</name>
                    <value>org.apache.hadoop.io.IntWritable</value>
                </property>
                <property>
                    <name>mapred.input.dir</name>
                    <value>/user/root/examples/input-data/mapreduce</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>/user/root/examples/output-data/mapreduce</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
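A malformed workflow.xml only fails at submission time, so it can be worth sanity-checking the file locally before uploading it. A minimal sketch (the inline document is abbreviated to two properties; the check itself is my own, not part of Oozie):

```python
import xml.etree.ElementTree as ET

# Elements in workflow.xml live in the Oozie workflow default namespace
WF_NS = "{uri:oozie:workflow:0.5}"

def check_workflow(xml_text, required=("mapred.input.dir", "mapred.output.dir")):
    # Parse the workflow and confirm the configuration defines the
    # required property names; returns the list of missing ones.
    root = ET.fromstring(xml_text)
    names = {n.text for n in root.iter(WF_NS + "name")}
    return [r for r in required if r not in names]

sample = """<workflow-app xmlns="uri:oozie:workflow:0.5" name="map-reduce-wf">
  <start to="mr-node"/>
  <action name="mr-node">
    <map-reduce>
      <configuration>
        <property><name>mapred.input.dir</name><value>/in</value></property>
        <property><name>mapred.output.dir</name><value>/out</value></property>
      </configuration>
    </map-reduce>
    <ok to="end"/><error to="fail"/>
  </action>
  <kill name="fail"><message>failed</message></kill>
  <end name="end"/>
</workflow-app>"""

print(check_workflow(sample))  # [] when nothing is missing
```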
Step 4: Create sampledata.txt with some sample input:
test
here
insight
failure
nomore
enough
Step 5: Copy the required files into the designated directory structure in HDFS using the following commands. Adjust the structure as appropriate for your application.
Copy the application jar (in this case the Hadoop MapReduce examples jar):
hdfs dfs -put hadoop-mapreduce-examples.jar /user/root/examples/apps/mapreduce/lib
Copy the workflow.xml file created above:
hdfs dfs -put workflow.xml /user/root/examples/apps/mapreduce/
Copy the sample input data:
hdfs dfs -put sampledata.txt /user/root/examples/input-data/mapreduce
Step 6: Run the workflow
Here is how you can submit it via curl. Of course, you can issue the same call from within your program, whatever language you are using.
curl -i -s -X POST -H "Content-Type: application/xml" -T oozieconfig.xml http://sandbox-hdp.hortonworks.com:11000/oozie/v1/jobs?action=start
The command returns a JSON response similar to:
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
Server: Apache-Coyote/1.1
Content-Type: application/json;charset=UTF-8
Content-Length: 45
Date: Mon, 21 Jan 2019 07:19:03 GMT
{"id":"0000008-190119071046177-oozie-oozi-W"}
Be sure to record the job ID value, i.e. "0000008-190119071046177-oozie-oozi-W".
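The same submission can be done from Python. The sketch below builds the request with the standard library and extracts the id from a response body; the parsing is demonstrated on the sample response above rather than a live call:

```python
import json
from urllib.request import Request, urlopen

OOZIE = "http://sandbox-hdp.hortonworks.com:11000/oozie"

def submit_request(config_xml_bytes):
    # POST the configuration to /v1/jobs?action=start; on success Oozie
    # replies with HTTP 201 and a body of {"id": "<workflow-job-id>"}.
    return Request(OOZIE + "/v1/jobs?action=start",
                   data=config_xml_bytes,
                   headers={"Content-Type": "application/xml"})

def job_id_from_response(body):
    return json.loads(body)["id"]

# On a live cluster you would run:
#   resp = urlopen(submit_request(open("oozieconfig.xml", "rb").read()))
#   job_id = job_id_from_response(resp.read())
print(job_id_from_response('{"id":"0000008-190119071046177-oozie-oozi-W"}'))
```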
Step 7: Check the status
Here is an example of using curl to retrieve the status of the workflow:
curl -i -s -X GET http://sandbox-hdp.hortonworks.com:11000/oozie/v1/job/0000008-190119071046177-oozie-oozi-W?show=info
HTTP/1.1 100 Continue
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: application/json;charset=UTF-8
Content-Length: 8114
Date: Mon, 21 Jan 2019 17:09:56 GMT
You can also check the job status from the Ambari console, select YARN and click Quick Links > Resource Manager UI. Select the job ID that matches the previous step result and view the job details.
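Status checks are similarly scriptable: `?show=info` returns a JSON document whose `status` field moves through values such as RUNNING, SUCCEEDED, and KILLED. A sketch, with the parsing demonstrated on a trimmed sample response:

```python
import json
from urllib.request import urlopen

OOZIE = "http://sandbox-hdp.hortonworks.com:11000/oozie"

def status_url(job_id):
    # ?show=info returns the workflow job's JSON report
    return "%s/v1/job/%s?show=info" % (OOZIE, job_id)

def workflow_status(body):
    # the full response also carries the action list, timestamps, and app path
    return json.loads(body)["status"]

# live call: workflow_status(urlopen(status_url(job_id)).read())
sample_body = '{"id":"0000008-190119071046177-oozie-oozi-W","status":"SUCCEEDED"}'
print(workflow_status(sample_body))  # SUCCEEDED
```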
For additional details, refer to the Oozie workflow specification and Web Services API documentation.
-Ritesh