Skip to content

Commit 03d3ad4

Browse files
Rohan Bhanderirxin
authored andcommitted
Fix typo "Received" to "Receiver" in streaming-kafka-integration.md
Removed typo on line 8 in markdown : "Received" -> "Receiver" Author: Rohan Bhanderi <[email protected]> Closes #9242 from RohanBhanderi/patch-1. (cherry picked from commit 16dc9f3) Signed-off-by: Reynold Xin <[email protected]>
1 parent be3e343 commit 03d3ad4

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

docs/streaming-kafka-integration.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ title: Spark Streaming + Kafka Integration Guide
55
[Apache Kafka](http://kafka.apache.org/) is publish-subscribe messaging rethought as a distributed, partitioned, replicated commit log service. Here we explain how to configure Spark Streaming to receive data from Kafka. There are two approaches to this - the old approach using Receivers and Kafka's high-level API, and a new experimental approach (introduced in Spark 1.3) without using Receivers. They have different programming models, performance characteristics, and semantics guarantees, so read on for more details.
66

77
## Approach 1: Receiver-based Approach
8-
This approach uses a Receiver to receive the data. The Received is implemented using the Kafka high-level consumer API. As with all receivers, the data received from Kafka through a Receiver is stored in Spark executors, and then jobs launched by Spark Streaming processes the data.
8+
This approach uses a Receiver to receive the data. The Receiver is implemented using the Kafka high-level consumer API. As with all receivers, the data received from Kafka through a Receiver is stored in Spark executors, and then jobs launched by Spark Streaming processes the data.
99

1010
However, under default configuration, this approach can lose data under failures (see [receiver reliability](streaming-programming-guide.html#receiver-reliability). To ensure zero-data loss, you have to additionally enable Write Ahead Logs in Spark Streaming (introduced in Spark 1.2). This synchronously saves all the received Kafka data into write ahead logs on a distributed file system (e.g HDFS), so that all the data can be recovered on failure. See [Deploying section](streaming-programming-guide.html#deploying-applications) in the streaming programming guide for more details on Write Ahead Logs.
1111

0 commit comments

Comments
 (0)