Problem description
I'm trying to use the log4j2 logger in my Spark job. Essential requirement: the log4j2 config is located outside the classpath, so I need to specify its location explicitly. When I run my code directly within the IDE without using spark-submit, log4j2 works well. However, when I submit the same code to a Spark cluster using spark-submit, it fails to find the log4j2 configuration and falls back to the default old log4j.
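For context, the job obtains its logger through the standard log4j2 API. A minimal sketch of the driver side (the log message is illustrative, not from the original post):

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class JobDriver {
    // Resolved by log4j2; which config file it uses is controlled by the
    // -Dlog4j.configurationFile system property passed to the JVM.
    private static final Logger LOG = LogManager.getLogger(JobDriver.class);

    public static void main(String[] args) {
        LOG.info("Spark job starting"); // works in the IDE, falls back to log4j 1.x under spark-submit
    }
}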
Launcher command
${SPARK_HOME}/bin/spark-submit \
  --class my.app.JobDriver \
  --verbose \
  --master 'local[*]' \
  --files "log4j2.xml" \
  --conf spark.executor.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
  --conf spark.driver.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
  myapp-SNAPSHOT.jar
Log4j2 dependencies in Maven
<dependencies>
  ...
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
  <!-- Bridge log4j 1.x to log4j2 -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-1.2-api</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
  <!-- Bridge slf4j to log4j2 -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j-impl</artifactId>
    <version>${log4j2.version}</version>
  </dependency>
</dependencies>
Any ideas what I could be missing?
Accepted answer
Apparently there is no official support for log4j2 in Spark at the moment. Here is a detailed discussion on the subject: https://issues.apache.org/jira/browse/SPARK-6305
In practical terms, that means:
- If you have access to the Spark configs and jars and can modify them, you can still use log4j2: manually add the log4j2 jars to SPARK_CLASSPATH and provide the log4j2 configuration file to Spark.
- If you run on a managed Spark cluster and have no access to the Spark jars/configs, you can still use log4j2, but its use will be limited to code executed on the driver side. Any code run by the executors will use the Spark executors' logger (which is the old log4j). One way to initialize log4j2 explicitly on the driver is sketched after this list.
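On a managed cluster, a possible driver-side workaround is to load the external config programmatically instead of relying on the system property. A minimal sketch using log4j-core's Configurator; the config path is an assumption (with --files, the file lands in the driver's working directory in client mode):

import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.core.LoggerContext;
import org.apache.logging.log4j.core.config.Configurator;

public class DriverSideLogging {
    public static void main(String[] args) {
        // Explicitly load the external log4j2 config before any logger is used.
        // "log4j2.xml" is resolved relative to the driver's working directory.
        LoggerContext ctx = Configurator.initialize(null, "log4j2.xml");
        Logger log = ctx.getLogger(DriverSideLogging.class.getName());
        log.info("log4j2 initialized on the driver");
    }
}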
Another answer
Spark falls back to log4j because it probably cannot initialize the logging system during startup (your application code has not been added to the classpath yet).
If you are permitted to place new files on your cluster nodes, create a directory on all of them (for example /opt/spark_extras), place all the log4j2 jars there, and add two configuration options to spark-submit:
--conf spark.executor.extraClassPath=/opt/spark_extras/* --conf spark.driver.extraClassPath=/opt/spark_extras/*
The libraries will then be added to the classpath.
If you have no access to modify files on the cluster, you can try another approach: add all the log4j2 jars to the spark-submit parameters using --jars. According to the documentation, all these libraries will be added to the driver's and executors' classpaths, so it should work the same way.
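Either way, a quick sanity check is to print which slf4j backend is actually bound at runtime. This sketch is not from the original answers; the class names in the comment are what the two bindings report:

import org.slf4j.LoggerFactory;

public class BindingCheck {
    public static void main(String[] args) {
        // Prints "org.apache.logging.slf4j.Log4jLoggerFactory" when the
        // log4j-slf4j-impl bridge won, or "org.slf4j.impl.Log4jLoggerFactory"
        // when Spark's bundled log4j 1.x binding (slf4j-log4j12) is still active.
        System.out.println(LoggerFactory.getILoggerFactory().getClass().getName());
    }
}

Running the same println inside an RDD operation shows which binding the executors picked up.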
Another answer
Try using --driver-java-options:
${SPARK_HOME}/bin/spark-submit \
  --class my.app.JobDriver \
  --verbose \
  --master 'local[*]' \
  --files "log4j2.xml" \
  --driver-java-options "-Dlog4j.configurationFile=log4j2.xml" \
  --jars log4j-api-2.8.jar,log4j-core-2.8.jar,log4j-1.2-api-2.8.jar \
  myapp-SNAPSHOT.jar