Problem Description
From this StackOverflow thread, I know how to obtain and use the log4j logger in pyspark, for example:
from pyspark import SparkContext

sc = SparkContext()
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger('MYLOGGER')
LOGGER.info("pyspark script logger initialized")
This works fine with the spark-submit script.
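For context, running such a script is just the usual spark-submit invocation (the script name here is hypothetical):

spark-submit my_logging_script.py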
My question is: how do I modify the log4j.properties file to configure the log level for this particular logger, or how do I configure it dynamically?
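For reference, the static route the question asks about would typically be a one-line entry in Spark's conf/log4j.properties (log4j 1.x syntax, which the sc._jvm.org.apache.log4j access above relies on); the level chosen here is only illustrative:

# Illustrative log4j.properties entry; 'MYLOGGER' matches the name passed to LogManager.getLogger
log4j.logger.MYLOGGER=WARN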
Recommended Answer
There are other answers covering how to configure log4j via the log4j.properties file, but I haven't seen anyone mention how to do it dynamically, so:
from pyspark import SparkContext

sc = SparkContext()
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger('MYLOGGER')

# same call as you'd make in Java, just using the py4j methods to do so
LOGGER.setLevel(log4jLogger.Level.WARN)

# will no longer print
LOGGER.info("pyspark script logger initialized")
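The same py4j handle can be used to tune other loggers as well; as a sketch (the Spark logger name below is only illustrative), the root logger or one of Spark's own loggers can be quieted in the same way:

# Sketch: adjust other loggers through the same py4j gateway (names are illustrative)
log4jLogger.LogManager.getRootLogger().setLevel(log4jLogger.Level.ERROR)
log4jLogger.LogManager.getLogger('org.apache.spark').setLevel(log4jLogger.Level.WARN)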