@@ -1807,7 +1807,7 @@ To run a Spark Streaming application, you need to have the following.
 + *Mesos* - [Marathon](https://github.com/mesosphere/marathon) has been used to achieve this
   with Mesos.
 
-- *[Since Spark 1.2] Configuring write ahead logs* - Since Spark 1.2,
+- *Configuring write ahead logs* - Since Spark 1.2,
   we have introduced _write ahead logs_ for achieving strong
   fault-tolerance guarantees. If enabled, all the data received from a receiver gets written into
   a write ahead log in the configured checkpoint directory. This prevents data loss on driver
@@ -1822,6 +1822,17 @@ To run a Spark Streaming application, you need to have the following.
   stored in a replicated storage system. This can be done by setting the storage level for the
   input stream to `StorageLevel.MEMORY_AND_DISK_SER` (see the sketch after this list).
 
+- *Setting the max receiving rate* - If the cluster resources are not large enough for the streaming
+  application to process data as fast as it is being received, the receivers can be rate limited
+  by setting a maximum rate limit in terms of records/sec.
+  See the [configuration parameters](configuration.html#spark-streaming)
+  `spark.streaming.receiver.maxRate` for receivers and `spark.streaming.kafka.maxRatePerPartition`
+  for the Direct Kafka approach. In Spark 1.5, we have introduced a feature called *backpressure* that
+  eliminates the need to set this rate limit, as Spark Streaming automatically figures out the
+  rate limits and dynamically adjusts them if the processing conditions change. This backpressure
+  can be enabled by setting the [configuration parameter](configuration.html#spark-streaming)
+  `spark.streaming.backpressure.enabled` to `true` (see the configuration sketch below).
+
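To make the write ahead log setup concrete, here is a minimal sketch assuming a socket receiver; the application name, host, port, and checkpoint path are placeholders, not values taken from the guide.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WALExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WALExample") // placeholder app name
      // Persist all received data to the write ahead log before it is
      // acknowledged, so a driver failure cannot lose buffered records.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val ssc = new StreamingContext(conf, Seconds(1))
    // The write ahead log lives under the checkpoint directory, which should
    // be on a fault-tolerant file system such as HDFS (placeholder path).
    ssc.checkpoint("hdfs://namenode:8020/checkpoints/wal-example")

    // With the log enabled, in-memory replication of received data is
    // redundant, so a single serialized copy that can spill to disk suffices.
    val lines = ssc.socketTextStream(
      "localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)

    lines.count().print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```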
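Similarly, a sketch of the rate limit and backpressure settings described in the last item; the numeric limits are illustrative only and should be tuned to the cluster's actual processing capacity.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("RateLimitExample") // placeholder app name
  // Cap each receiver at this many records per second.
  .set("spark.streaming.receiver.maxRate", "10000")
  // Cap each Kafka partition at this many records per second
  // when using the direct (receiver-less) Kafka approach.
  .set("spark.streaming.kafka.maxRatePerPartition", "2000")
  // Spark 1.5+: derive and adjust the rate limits automatically
  // from current batch scheduling delays and processing times.
  .set("spark.streaming.backpressure.enabled", "true")
```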
 ### Upgrading Application Code
 {:.no_toc}
 