Skip to content

Commit ba81b92

Browse files
attilapirosdongjoon-hyun
authored andcommitted
[SPARK-36719][CORE] Supporting Netty Logging at the network layer
### What changes were proposed in this pull request? Supporting Netty level logging at the network layer. To configure Netty level logging a `LogHandler` must be added to the channel pipeline. In this PR I have introduced a new class `NettyLogger` which is able to construct a log handler depending on the log level: - in case of `log4j.logger.org.apache.spark.network.util.NettyLogger=DEBUG`: a custom log handler is created which does not dump the message contents. This way the log is a bit more compact. Moreover when network level encryption is switched on this level might be sufficient. - in case of `log4j.logger.org.apache.spark.network.util.NettyLogger=TRACE`: Netty's own log handler is used which dumps the message contents. - otherwise (when the logger is not TRACE or DEBUG) the pipeline does not contain a log handler (there is no runtime penalty for the default setting but a long running app/service must be restarted along with the new log level to have an effect). ### Why are the changes needed? This level of logging proved to be sufficient during debugging some external shuffle related problem. Compared with the tcpdump this log lines can be more easily correlated with the Spark internal calls. Moreover the log layout can be configured to contain the thread names that way for a timeout a busy thread could be identified. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Manually. #### DEBUG level ``` ╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*› ╰─$ tail -1 ./conf/log4j.properties log4j.logger.org.apache.spark.network.util.NettyLogger=DEBUG ╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*› ╰─$ ./bin/spark-submit --class org.apache.spark.examples.JavaWordCount --master local\[8\] ./examples/target/original-spark-examples_2.12-3.3.0-SNAPSHOT.jar README.md 2> >(grep NettyLogger) 1> /dev/null 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf] REGISTERED 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf] CONNECT: /172.30.64.219:61014 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] ACTIVE 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] REGISTERED 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] ACTIVE 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] WRITE 66B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] FLUSH 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] READ 66B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] WRITE: MessageWithHeader [headerLength: 74, bodyLength: 1552705] 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] FLUSH 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 74B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] READ COMPLETE 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 2048B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 32768B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 10561B 21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE 21/09/10 15:24:40 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 ! R:/172.30.64.219:61015] INACTIVE 21/09/10 15:24:40 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE 21/09/10 15:24:40 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 ! R:/172.30.64.219:61014] INACTIVE 21/09/10 15:24:40 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 ! R:/172.30.64.219:61014] UNREGISTERED 21/09/10 15:24:40 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 ! R:/172.30.64.219:61015] UNREGISTERED ``` #### TRACE level ``` ╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*› ╰─$ tail -1 ./conf/log4j.properties log4j.logger.org.apache.spark.network.util.NettyLogger=TRACE ╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*› ╰─$ ./bin/spark-submit --class org.apache.spark.examples.JavaWordCount --master local\[8\] ./examples/target/original-spark-examples_2.12-3.3.0-SNAPSHOT.jar README.md 1> /dev/null 2>&1 ... 21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786] REGISTERED 21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786] CONNECT: /172.30.64.219:61044 21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] ACTIVE 21/09/10 15:29:14 INFO TransportClientFactory: Successfully created connection to /172.30.64.219:61044 after 37 ms (0 ms spent in bootstraps) 21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] REGISTERED 21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] ACTIVE 21/09/10 15:29:14 INFO Utils: Fetching spark://172.30.64.219:61044/jars/original-spark-examples_2.12-3.3.0-SNAPSHOT.jar to /private/var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/spark-91e059f5-1e29-4727-8602-f81206bbe48b/userFiles-50b48490-8950-4c46-b3d3-61a2c85412a3/fetchFileTemp8803030587223485061.tmp 21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] WRITE: 66B +-------------------------------------------------+ | 0 1 2 3 4 5 6 7 8 9 a b c d e f | +--------+-------------------------------------------------+----------------+ |00000000| 00 00 00 00 00 00 00 42 06 00 00 00 35 2f 6a 61 |.......B....5/ja| |00000010| 72 73 2f 6f 72 69 67 69 6e 61 6c 2d 73 70 61 72 |rs/original-spar| |00000020| 6b 2d 65 78 61 6d 70 6c 65 73 5f 32 2e 31 32 2d |k-examples_2.12-| |00000030| 33 2e 33 2e 30 2d 53 4e 41 50 53 48 4f 54 2e 6a |3.3.0-SNAPSHOT.j| |00000040| 61 72 |ar | +--------+-------------------------------------------------+----------------+ 21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] FLUSH 21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] READ: 66B +-------------------------------------------------+ | 0 1 2 3 4 5 6 7 8 9 a b c d e f | +--------+-------------------------------------------------+----------------+ |00000000| 00 00 00 00 00 00 00 42 06 00 00 00 35 2f 6a 61 |.......B....5/ja| |00000010| 72 73 2f 6f 72 69 67 69 6e 61 6c 2d 73 70 61 72 |rs/original-spar| |00000020| 6b 2d 65 78 61 6d 70 6c 65 73 5f 32 2e 31 32 2d |k-examples_2.12-| |00000030| 33 2e 33 2e 30 2d 53 4e 41 50 53 48 4f 54 2e 6a |3.3.0-SNAPSHOT.j| |00000040| 61 72 |ar | +--------+-------------------------------------------------+----------------+ 21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] WRITE: MessageWithHeader [headerLength: 74, bodyLength: 1552705] 21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] FLUSH 21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] READ: 74B ... ``` Closes #33962 from attilapiros/SPARK-36719. Authored-by: attilapiros <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent ebca01f commit ba81b92

File tree

3 files changed

+78
-1
lines changed

3 files changed

+78
-1
lines changed

common/network-common/src/main/java/org/apache/spark/network/TransportContext.java

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -44,6 +44,7 @@
4444
import org.apache.spark.network.server.TransportServerBootstrap;
4545
import org.apache.spark.network.util.IOMode;
4646
import org.apache.spark.network.util.NettyUtils;
47+
import org.apache.spark.network.util.NettyLogger;
4748
import org.apache.spark.network.util.TransportConf;
4849
import org.apache.spark.network.util.TransportFrameDecoder;
4950

@@ -64,6 +65,7 @@
6465
public class TransportContext implements Closeable {
6566
private static final Logger logger = LoggerFactory.getLogger(TransportContext.class);
6667

68+
private static final NettyLogger nettyLogger = new NettyLogger();
6769
private final TransportConf conf;
6870
private final RpcHandler rpcHandler;
6971
private final boolean closeIdleConnections;
@@ -187,7 +189,11 @@ public TransportChannelHandler initializePipeline(
187189
RpcHandler channelRpcHandler) {
188190
try {
189191
TransportChannelHandler channelHandler = createChannelHandler(channel, channelRpcHandler);
190-
ChannelPipeline pipeline = channel.pipeline()
192+
ChannelPipeline pipeline = channel.pipeline();
193+
if (nettyLogger.getLoggingHandler() != null) {
194+
pipeline.addLast("loggingHandler", nettyLogger.getLoggingHandler());
195+
}
196+
pipeline
191197
.addLast("encoder", ENCODER)
192198
.addLast(TransportFrameDecoder.HANDLER_NAME, NettyUtils.createFrameDecoder())
193199
.addLast("decoder", DECODER)

common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -194,4 +194,10 @@ public boolean release(int decrement) {
194194
}
195195
return super.release(decrement);
196196
}
197+
198+
@Override
199+
public String toString() {
200+
return "MessageWithHeader [headerLength: " + headerLength + ", bodyLength: " + bodyLength + "]";
201+
}
202+
197203
}
Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one or more
3+
* contributor license agreements. See the NOTICE file distributed with
4+
* this work for additional information regarding copyright ownership.
5+
* The ASF licenses this file to You under the Apache License, Version 2.0
6+
* (the "License"); you may not use this file except in compliance with
7+
* the License. You may obtain a copy of the License at
8+
*
9+
* http://www.apache.org/licenses/LICENSE-2.0
10+
*
11+
* Unless required by applicable law or agreed to in writing, software
12+
* distributed under the License is distributed on an "AS IS" BASIS,
13+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
* See the License for the specific language governing permissions and
15+
* limitations under the License.
16+
*/
17+
18+
package org.apache.spark.network.util;
19+
20+
import io.netty.buffer.ByteBuf;
21+
import io.netty.buffer.ByteBufHolder;
22+
import io.netty.channel.ChannelHandlerContext;
23+
import io.netty.handler.logging.LoggingHandler;
24+
import io.netty.handler.logging.LogLevel;
25+
import org.slf4j.Logger;
26+
import org.slf4j.LoggerFactory;
27+
28+
public class NettyLogger {
29+
private static final Logger logger = LoggerFactory.getLogger(NettyLogger.class);
30+
31+
/** A Netty LoggingHandler which does not dump the message contents. */
32+
private static class NoContentLoggingHandler extends LoggingHandler {
33+
34+
NoContentLoggingHandler(Class<?> clazz, LogLevel level) {
35+
super(clazz, level);
36+
}
37+
38+
protected String format(ChannelHandlerContext ctx, String eventName, Object arg) {
39+
if (arg instanceof ByteBuf) {
40+
return format(ctx, eventName) + " " + ((ByteBuf) arg).readableBytes() + "B";
41+
} else if (arg instanceof ByteBufHolder) {
42+
return format(ctx, eventName) + " " +
43+
((ByteBufHolder) arg).content().readableBytes() + "B";
44+
} else {
45+
return super.format(ctx, eventName, arg);
46+
}
47+
}
48+
}
49+
50+
private final LoggingHandler loggingHandler;
51+
52+
public NettyLogger() {
53+
if (logger.isTraceEnabled()) {
54+
loggingHandler = new LoggingHandler(NettyLogger.class, LogLevel.TRACE);
55+
} else if (logger.isDebugEnabled()) {
56+
loggingHandler = new NoContentLoggingHandler(NettyLogger.class, LogLevel.DEBUG);
57+
} else {
58+
loggingHandler = null;
59+
}
60+
}
61+
62+
public LoggingHandler getLoggingHandler() {
63+
return loggingHandler;
64+
}
65+
}

0 commit comments

Comments
 (0)