Skip to content

Commit 4f1d261

Browse files
committed
Fix for newFilesOnly logic in file DStream
The newFilesOnly logic should be inverted: if newFilesOnly==true then only start reading files older than current time. As the code is now if newFilesOnly==true then it will start to read files that are older than 0L (that is: every file in the directory).
1 parent 70c8116 commit 4f1d261

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,7 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas
4545
// Files with mod time earlier than this is ignored. This is updated every interval
4646
// such that in the current interval, files older than any file found in the
4747
// previous interval will be ignored. Obviously this time keeps moving forward.
48-
private var ignoreTime = if (newFilesOnly) 0L else System.currentTimeMillis()
48+
private var ignoreTime = if (newFilesOnly) System.currentTimeMillis() else 0L
4949

5050
// Latest file mod time seen till any point of time
5151
@transient private var path_ : Path = null

0 commit comments

Comments
 (0)