Guide: 1. What can oozie do? 2. How do you use oozie? 3. How do you run oozie?
  
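oozie overview: oozie is a Hadoop-based scheduler. Scheduling flows are written in XML, and it can schedule MapReduce, Pig, Hive, shell, jar jobs and so on. Its main features are:
Workflow: runs the flow nodes in order; supports fork (split into several parallel nodes) and join (merge several nodes back into one)
Coordinator: triggers a workflow on a schedule
Bundle Job: groups several coordinators together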
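The fork and join nodes mentioned above can be sketched as follows. This is only a minimal illustration, not taken from the original examples: the workflow name, the node names (`fork-node`, `branch-a`, `branch-b`, `join-node`) and the `sketch/a`, `sketch/b` directories are all hypothetical, and a simple `fs` action stands in for real work.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.2" name="wf-fork-join-sketch">
    <start to="fork-node"/>
    <!-- fork: start both branches in parallel -->
    <fork name="fork-node">
        <path start="branch-a"/>
        <path start="branch-b"/>
    </fork>
    <!-- hypothetical branch: create a directory on HDFS -->
    <action name="branch-a">
        <fs>
            <mkdir path="${nameNode}/user/${wf:user()}/sketch/a"/>
        </fs>
        <ok to="join-node"/>
        <error to="fail"/>
    </action>
    <!-- hypothetical branch: create a second directory on HDFS -->
    <action name="branch-b">
        <fs>
            <mkdir path="${nameNode}/user/${wf:user()}/sketch/b"/>
        </fs>
        <ok to="join-node"/>
        <error to="fail"/>
    </action>
    <!-- join: wait for every forked branch, then continue -->
    <join name="join-node" to="end"/>
    <kill name="fail">
        <message>Branch failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Every path started by a fork has to end at the same join, which is why both branches use `join-node` as their `ok` transition.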
oozie format: Writing an oozie job requires two files: job.properties and workflow.xml (or coordinator.xml / bundle.xml).
I. job.properties defines the environment variables
nameNode | hdfs://xxx5:8020 | HDFS address
jobTracker | xxx5:8034 | JobTracker address
queueName | default | oozie queue
examplesRoot | examples | root directory
oozie.use.system.libpath | true | whether to load the oozie system share lib
oozie.libpath | share/lib/user | user lib directory
oozie.wf.application.path | ${nameNode}/user/${user.name}/... | HDFS path of the oozie flow
Note: the application path property depends on the job type:
workflow: oozie.wf.application.path
coordinator: oozie.coord.application.path
bundle: oozie.bundle.application.path
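Since the rest of this post is written in XML, here is a sketch of the same settings in Hadoop XML configuration format, which the oozie command line accepts for `-config` as an alternative to a .properties file. The file name job.xml is an assumption, the values are the placeholder values from the table above, and the application path is left elided exactly as in the table.

```xml
<!-- job.xml (hypothetical file name): XML form of the job.properties settings -->
<configuration>
    <property>
        <name>nameNode</name>
        <value>hdfs://xxx5:8020</value>
    </property>
    <property>
        <name>jobTracker</name>
        <value>xxx5:8034</value>
    </property>
    <property>
        <name>queueName</name>
        <value>default</value>
    </property>
    <property>
        <name>examplesRoot</name>
        <value>examples</value>
    </property>
    <property>
        <name>oozie.use.system.libpath</name>
        <value>true</value>
    </property>
    <property>
        <name>oozie.libpath</name>
        <value>share/lib/user</value>
    </property>
    <!-- for a coordinator or bundle, use oozie.coord.application.path
         or oozie.bundle.application.path here instead -->
    <property>
        <name>oozie.wf.application.path</name>
        <value>${nameNode}/user/${user.name}/...</value>
    </property>
</configuration>
```

Only one of the three application path properties should be set per submission, since it decides whether the job is treated as a workflow, a coordinator or a bundle.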
II. XML
1.workflow:
<workflow-app xmlns="uri:oozie:workflow:0.2" name="wf-example1">
<start to="pig-node"/>
<action name="pig-node">
<pig>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="hdfs://xxx5/user/hadoop/appresult" />
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
<value>false</value>
</property>
</configuration>
<script>test.pig</script>
<param>filepath=${filepath}</param>
</pig>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>
Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]
</message>
</kill>
<end name="end"/>
</workflow-app>
2.coordinator
<coordinator-app name="cron-coord" frequency="${coord:hours(6)}" start="${start}" end="${end}"
timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
<action>
<workflow>
<app-path>${nameNode}/user/${coord:user()}/${examplesRoot}/wpath</app-path>
<configuration>
<property>
<name>jobTracker</name>
<value>${jobTracker}</value>
</property>
<property>
<name>nameNode</name>
<value>${nameNode}</value>
</property>
<property>
<name>queueName</name>
<value>${queueName}</value>
</property>
</configuration>
</workflow>
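</action>
</coordinator-app>
Note: this coordinator is set to UTC, which is 8 hours behind Beijing time, so subtract 8 hours from the time you actually want the job to run.
Example of a coordinator passing a value to the workflow, with the timezone set to Asia/Shanghai: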
<coordinator-app name="gwk-hour-log-coord" frequency="${coord:hours(1)}" start="${hourStart}" end="${hourEnd}" timezone="Asia/Shanghai"
xmlns="uri:oozie:coordinator:0.2">
<action>
<workflow>
<app-path>${workflowHourLogAppUri}/gwk-workflow.xml</app-path>
<configuration>
<property>
<name>yyyymmddhh</name>
<value>${coord:formatTime(coord:dateOffset(coord:nominalTime(),-1,'HOUR'), 'yyyyMMddHH')}</value>
</property>
</configuration>
</workflow>
</action>
</coordinator-app>
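On the workflow side, a property set in the coordinator's configuration block can be read like any other job parameter. The fragment below is a sketch of how gwk-workflow.xml might consume `yyyymmddhh`. The action name `hour-log-node`, the script name `hourlog.q` and the transition targets are hypothetical, and `jobTracker` and `nameNode` are assumed to come from job.properties.

```xml
<!-- hypothetical action inside gwk-workflow.xml -->
<action name="hour-log-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <script>hourlog.q</script>
        <!-- value computed by the coordinator: nominal time minus one hour,
             formatted as yyyyMMddHH -->
        <param>yyyymmddhh=${yyyymmddhh}</param>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Inside the Hive script the value is then available as `${yyyymmddhh}`, for example to select the partition for the previous hour.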
3.bundle
<bundle-app name='APPNAME' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='uri:oozie:bundle:0.1'>
<controls>
<kick-off-time>${kickOffTime}</kick-off-time>
</controls>
<coordinator name='coordJobFromBundle1' >
<app-path>${appPath}</app-path>
<configuration>
<property>
<name>startTime1</name>
<value>${START_TIME}</value>
</property>
<property>
<name>endTime1</name>
<value>${END_TIME}</value>
</property>
</configuration>
</coordinator>
<coordinator name='coordJobFromBundle2' >
<app-path>${appPath2}</app-path>
<configuration>
<property>
<name>startTime2</name>
<value>${START_TIME2}</value>
</property>
<property>
<name>endTime2</name>
<value>${END_TIME2}</value>
</property>
</configuration>
</coordinator>
</bundle-app>
oozie hive
<action name="hive-app">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>hive-site.xml</job-xml>
<script>hivescript.q</script>
<param>yyyymmdd=${yyyymmdd}</param>
<param>yesterday=${yesterday}</param>
<param>lastmonth=${lastmonth}</param>
</hive>
<ok to="result-stat-join"/>
<error to="fail"/>
</action>
Running oozie
Start a job:
oozie job -oozie http://xxx5:11000/oozie -config job.properties -run
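Stop a job:
oozie job -oozie http://localhost:8080/oozie -kill 14-20090525161321-oozie-joe
Note: stopping a job sometimes fails with a permission error. In that case the proxy-user settings need to be changed in oozie-site.xml (the hadoop.proxyuser.* entries are normally set on the Hadoop side, in core-site.xml):
hadoop.proxyuser.oozie.groups | *
hadoop.proxyuser.oozie.hosts | *
oozie.service.ProxyUserService.proxyuser.hadoop.hosts | *
oozie.service.ProxyUserService.proxyuser.hadoop.groups | *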
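Written out as configuration XML, the proxy-user settings would look roughly like the fragments below. This is a sketch under two assumptions: that the Oozie-side keys are spelled `oozie.service.ProxyUserService.proxyuser.<user>.hosts` and `.groups`, as in the Oozie documentation, and that the submitting user is `hadoop`. The wildcard `*` allows every host and group, which is convenient for testing but very permissive.

```xml
<!-- Hadoop side (core-site.xml): let the oozie user impersonate other users -->
<property>
    <name>hadoop.proxyuser.oozie.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.oozie.groups</name>
    <value>*</value>
</property>

<!-- Oozie side (oozie-site.xml): let the hadoop user act as a proxy user -->
<property>
    <name>oozie.service.ProxyUserService.proxyuser.hadoop.hosts</name>
    <value>*</value>
</property>
<property>
    <name>oozie.service.ProxyUserService.proxyuser.hadoop.groups</name>
    <value>*</value>
</property>
```

Each group of properties goes inside the existing `<configuration>` element of its file, and the affected services generally need a restart before the change takes effect.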
Everything above has been used in practice, but it was all typed by hand, so please forgive any typos.