Why do my application level logs disappear when executed in Oozie?

Question

I'm using Oozie in a CDH5 environment, along with the Oozie web console. I can't see any of the logs from my application. I can see the Hadoop logs, Spark logs, etc., but no application-specific logs.

In my application I've included src/main/resources/log4j.properties:

# Root logger option
log4j.rootLogger=INFO, stdout

# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

In my Oozie workflow I have Java actions and Spark actions.

Note that when I run the application from the command line, I do see my application-level logs.

Recommended answer

Oozie runs each Action in a separate "launcher" job, which is actually a YARN job with a single mapper (see the exceptions at the end).

Whenever you see an "external ID" of the form job_000000000_0000, you can reach the YARN logs under application_000000000_0000: same numbers, different prefix. (Yeah, "job" is the legacy naming convention from Hadoop 1, still used by the JobHistory service, but YARN has its own naming convention.)
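The mapping between the two naming conventions is just a prefix swap. A minimal Python sketch of it, where the function name `job_to_application_id` and the sample ID are made up for illustration:

```python
import re

# Legacy MapReduce job ID: "job_<cluster timestamp>_<sequence number>"
_JOB_ID = re.compile(r"^job_(\d+)_(\d+)$")


def job_to_application_id(job_id: str) -> str:
    """Turn a legacy job ID into the YARN application ID with the same numbers."""
    match = _JOB_ID.match(job_id.strip())
    if match is None:
        raise ValueError(f"not a legacy job ID: {job_id!r}")
    timestamp, sequence = match.groups()
    return f"application_{timestamp}_{sequence}"


# Hypothetical ID, for illustration only
print(job_to_application_id("job_1400000000000_0042"))
# prints: application_1400000000000_0042
```

Anything that does not look like a job ID raises a ValueError instead of being passed on to YARN as-is.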

Your application output is actually dumped into the YARN logs for that Oozie "launcher":

  • your stderr is dumped as-is and can be retrieved in the "stderr" section
  • your stdout is dumped with a prefix on each line (Oozie uses that prefix to manage its <capture_output/> trick for Shell and Pig actions), at the end of the atrociously verbose "stdout" section
  • and nothing gets into the "syslog" section, AFAIK

Bottom line:

  1. run oozie job -info ****** to get the list of Actions and the corresponding "external IDs" for your Oozie workflow execution
  2. for each job_*****_** legacy ID, run yarn logs -applicationId application_*****_** | more to skim the global YARN logs, then zoom in on your specific app logs
  3. now you can try to automate that thing... have fun B-)
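One possible way to automate those steps, as a rough Python sketch rather than a tested tool. It assumes the oozie and yarn commands are on the PATH, that the Oozie server URL is already configured (for example through the OOZIE_URL environment variable), and that the external IDs appear somewhere in the `oozie job -info` text as plain job_..._... tokens. The function names and the sample text are hypothetical; the sample is not the real output layout.

```python
import re
import subprocess

# Any legacy "job_<timestamp>_<sequence>" token in free text
JOB_ID = re.compile(r"\bjob_(\d+)_(\d+)\b")


def application_ids(info_text):
    """Return the YARN application IDs for all job IDs found, in order, no duplicates."""
    found = []
    for timestamp, sequence in JOB_ID.findall(info_text):
        app_id = f"application_{timestamp}_{sequence}"
        if app_id not in found:
            found.append(app_id)
    return found


def yarn_logs_command(app_id):
    """Build the argument list for `yarn logs -applicationId <id>`."""
    return ["yarn", "logs", "-applicationId", app_id]


def dump_launcher_logs(workflow_id):
    """Run `oozie job -info`, then `yarn logs` for every external ID it mentions."""
    info = subprocess.run(
        ["oozie", "job", "-info", workflow_id],
        capture_output=True, text=True, check=True,
    ).stdout
    for app_id in application_ids(info):
        # Output goes straight to the terminal; pipe the script into `more` if needed
        subprocess.run(yarn_logs_command(app_id), check=False)


# Made-up sample text, only to show the ID extraction without a cluster
sample = "java-node job_1400000000000_0042\nspark-node job_1400000000000_0043"
for app_id in application_ids(sample):
    print(" ".join(yarn_logs_command(app_id)))
```

On a real cluster you would call dump_launcher_logs() with the workflow ID and then search the output for your own logger's lines in the "stdout" and "stderr" sections.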


Exceptions to the "launcher" Oozie job principle: the E-mail Action and the Filesystem Action are just API calls executed directly from the Oozie server process, and the MapReduce Action spawns a regular YARN job with multiple Mappers and Reducers.