How to suppress Spark logging in unit tests?



Problem Description

So thanks to easily googleable blogs I tried:

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import org.specs2.mutable.Specification

class SparkEngineSpecs extends Specification {
  sequential

  // Set the given level on each named logger, returning the previous levels
  def setLogLevels(level: Level, loggers: Seq[String]): Map[String, Level] = loggers.map(loggerName => {
    val logger = Logger.getLogger(loggerName)
    val prevLevel = logger.getLevel
    logger.setLevel(level)
    loggerName -> prevLevel
  }).toMap

  setLogLevels(Level.WARN, Seq("spark", "org.eclipse.jetty", "akka"))

  val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("Test Spark Engine"))

  // ... my unit tests
}

But unfortunately it doesn't work; I still get a lot of Spark output, e.g.:

14/12/02 12:01:56 INFO MemoryStore: Block broadcast_4 of size 4184 dropped from memory (free 583461216)
14/12/02 12:01:56 INFO ContextCleaner: Cleaned broadcast 4
14/12/02 12:01:56 INFO ContextCleaner: Cleaned shuffle 4
14/12/02 12:01:56 INFO ShuffleBlockManager: Deleted all files for shuffle 4

Recommended Answer

Add the following to a log4j.properties file inside the src/test/resources dir (create the file/dir if they don't exist):

# Change this to set Spark log level
log4j.logger.org.apache.spark=WARN

# Silence akka remoting
log4j.logger.Remoting=WARN

# Ignore messages below warning level from Jetty, because it's a bit verbose
log4j.logger.org.eclipse.jetty=WARN

When I run my unit tests (I'm using JUnit and Maven), I only receive WARN level logs; in other words, the output is no longer cluttered with INFO level logs (though they can be useful at times for debugging).
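
If nothing else on your test classpath defines a root logger and appender, you may also need to declare one in the same file. This is a minimal sketch, not part of the original answer; the appender name and pattern are illustrative log4j 1.x defaults:

# Root logger with a console appender (illustrative defaults)
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n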

I hope this helps.

Other Recommended Answer

In my case, one of my own libraries brought logback-classic into the mix. This materialized as a warning at startup:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/alex/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/alex/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]

I solved this by excluding it from the dependency:

"com.mystuff" % "mylib" % "1.0.0" exclude("ch.qos.logback", "logback-classic")

With that resolved, I could add a log4j.properties file in test/resources, and it now gets used by Spark.
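
That exclusion is in sbt syntax. If your build uses Maven instead, a roughly equivalent exclusion (reusing the same placeholder coordinates from the example above) might look like this:

<dependency>
  <groupId>com.mystuff</groupId>
  <artifactId>mylib</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <exclusion>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
    </exclusion>
  </exclusions>
</dependency>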

Other Recommended Answer

After struggling with the Spark log output for a while myself, I found a blog post with a solution I particularly liked.

If you use SLF4J, you can simply swap out the underlying logging implementation. A good candidate for the test scope is slf4j-nop, which carefully takes the log output and puts it where the sun never shines.

When using Maven you can add the following to the top of your dependencies list:

<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
  <version>1.7.12</version>
  <scope>provided</scope>
</dependency>

<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-nop</artifactId>
  <version>1.7.12</version>
  <scope>test</scope>
</dependency>

Note that it might be important to place it at the beginning of the dependencies list, to make sure the given implementations are used instead of those that might come with other packages (and which you can consider excluding in order to keep your classpath tidy and avoid unexpected conflicts).
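
If you build with sbt rather than Maven (as the exclude example in the previous answer does), the equivalent test-scoped dependencies would be something along these lines, a sketch mirroring the Maven versions above:

// sbt build.sbt sketch: provided slf4j-api plus slf4j-nop for tests only
libraryDependencies ++= Seq(
  "org.slf4j" % "slf4j-api" % "1.7.12" % "provided",
  "org.slf4j" % "slf4j-nop" % "1.7.12" % "test"
)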