-
Notifications
You must be signed in to change notification settings - Fork 117
Code enhancement: Replaced explicit synchronized access to a hashmap with a concurrent map. #392
Code enhancement: Replaced explicit synchronized access to a hashmap with a concurrent map. #392
Conversation
|
LGTM. |
kimoonkim
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the cleanup.
|
Thanks! This is not a bug-fix correct? If so, please hold off on merge till we cut the 0.3.0 for the 2.2 branch (just to keep 0.3.0 in sync with the 2.1 branch). |
mccheah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this leaves room for a synchronization bug, but feel free to mention if I'm mistaken here.
| hostToLocalTaskCount | ||
| } | ||
| for (pod <- executorPodsWithIPs) { | ||
| for ((_, pod) <- executorPodsByIPs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem we need to be careful of here is modifications in between each iteration of the for loop. Recall that concurrent hash map only protects concurrent access to one key at a time. But here we care about the whole state; that is, we want the entire map to be consistent throughout the iteration, but we lose that guarantee if we don't lock around the entire iteration cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per Javadocs at https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html
The following in my understanding.
ConcurrentHashMap is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details. When you iterate, you are guaranteed that you get access to the all the entries in a thread-safe manner. Effectively, you are iterating on a snapshot of the Map's contents (and what you end up reading is dependent on concurrent modifications on the Map by other threads). Multiple threads can iterate on the HashMap in a thread-safe manner too as long as each thread has it own copy of the iterator.
The method here only seems to require, thread-safe access to the map, and it seems to me the change should be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the concern from @mccheah is valid if we care about the consistency of the map throughout the entire iterating. Iterating through a ConcurrentHashMap may or may not reflect changes made to the map after the iterator is created, although iteration is thread-safe and won't throw the ConcurrentModificationException. In this particular case, it means we may or may not lose changes made while the map is being iterated through. Also it is mentioned in the Javadoc that "iterators are designed to be used by only one thread at a time."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My change is based on the understanding that we don't need the whole map to be consistent during the iteration the Map. The previous lock there was for thread-safety not for exclusive lock down of the map for access.
"iterators are designed to be used by only one thread at a time."
==> means a single iterator cannot be shared across multiple threads safely. Multiple threads can safely iterate as long as each thread has its own copy of an iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...
But here we care about the whole state; that is, we want the entire map to be consistent throughout the iteration, but we lose that guarantee if we don't lock around the entire iteration cycle.
....
@mccheah - I don't think this is necessary. Confirmed it with @kimoonkim, who is the original author for this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Discovering executor pods is done only as a best effort. No need for consistency.
|
This is not a bug fix. Addressed a TODO for a general code enhancement. This code is not directly related to any specific feature work. |
mccheah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing to approved but would like @kimoonkim to sign off.
| private val EXECUTOR_PODS_BY_IPS_LOCK = new Object | ||
| // Indexed by executor IP addrs and guarded by EXECUTOR_PODS_BY_IPS_LOCK | ||
| private val executorPodsByIPs = new mutable.HashMap[String, Pod] | ||
| private val executorPodsByIPs: concurrent.Map[String, Pod] = new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would you like to specify a type of the variable after it? Would it be cleaner without it? i.e.
val executorPodsByIPs = new ConcurrentHashMap[String, Pod]().asScala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trait concurrent.Map[K,V] is the correct interface/trait for the variable type, but it is abstract and can't be instantiated directly, so in this case typing the variable makes sense. Using scala's concurrent.TrieMap is another option, although the choice is probably arbitrary.
val executorPodsByIPs: concurrent.Map[String, Pod] = concurrent.TrieMap.empty[String, Pod]https://www.scala-lang.org/api/2.11.8/index.html#scala.collection.concurrent.TrieMap$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erikerlandson thank you for an explanation. That makes sense for me, though when I typed the line in scala console(2.11.8), the compiler was able to inference concurrent.Map[K, V] class.
scala> val executorPodsByIPs = new ConcurrentHashMap[String, String]().asScala
executorPodsByIPs: scala.collection.concurrent.Map[String,String] = Map()
I looked into the docs, and found that .asScala method going to do a conversion from java.util.ConcurrentHashMap[K, V] to scala.concurrent.Map[K, V]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked into databrick scala recomendations, they suggest to avoid concurrent.Map and use j.u.c.ConcurrentHashMap.
Prefer java.util.concurrent.ConcurrentHashMap over scala.collection.concurrent.Map
https://github.com/databricks/scala-style-guide#concurrency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@satybald good point re: style recommendations and SI-7943
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@satybald Thanks for bringing this up. Please note that SI-7943 doesn't really affect the thread correctness of our code. It is specific to only the TrieMap implementation. We are already using the java.util.concurrent.ConcurrentHashMap underneath and using the scala.concurrent.Map trait on top for access.
We should not use the trait nevertheless per the community recommendation to avoid any future maintenance or merging of it upstream.
|
It seems this PR has stalled a bit since the last activity a few weeks ago. @varunkatta what's next here? |
|
Just back to this PR...I thought this PR was merged long back ; apparently not. I will address the last few comments today. |
|
Going to bring in after #459 |
|
@varunkatta this is ready for rebase -- please fix merge conflicts |
…with a concurrent map. (apache-spark-on-k8s#392) * Replaced explicit synchronized access to hashmap with a concurrent map * Removed usages of scala.collection.concurrent.Map
…with a concurrent map. (apache-spark-on-k8s#392) * Replaced explicit synchronized access to hashmap with a concurrent map * Removed usages of scala.collection.concurrent.Map
What changes were proposed in this pull request?
Refactored access to a HashMap through an explicit external lock with a implicit internal lock. Thread safe access to the map is maintained.
How was this patch tested?
Ran Unit-tests