Description
Version: 4.0.3
ML Version: 9.03-1 and 9.04
Java Version: Oracle 9.0.1
OS: Windows 10.1
Input: Passing a list of 1,000 or more URIs to QueryBatcher. (You can also just run the DHF quickstart online store example against a cluster with 16 threads and a batch size of 50 to see the behavior.)
Actual output: Java logs show multiple threads in the WAITING state. The initial calls go out as expected and use the full thread count available to the ThreadPoolExecutor, but subsequent calls fail to pick up the additional threads, leaving many of them (usually more than half) in a wait state. The ML access logs and app server thread counts show the same pattern.
Expected output: When 12 threads are specified, all 12 threads should generally stay in use until there are no batches left for them to grab, with thread usage dwindling only at the very end of the job.
Alternatives: I've modified the QueryThreadPoolExecutor constructor on line 796 of QueryBatcherImpl to use varying queue capacities, while also watching the startIterating and run() methods. The results suggest that the available queue size (currently set to threadCount * 5) is far too small for the QueryBatcher's needs, and that it spends more time waiting for queue space to become available (rejectedExecutions) than processing batches.
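
To illustrate the hypothesis, here is a minimal sketch that uses only plain java.util.concurrent classes. It is not the QueryBatcherImpl code: the class name QueueCapacityDemo, the method countRejections, and the tasks that simply block on a latch are all hypothetical stand-ins, and the sketch assumes a fixed-size pool with a bounded LinkedBlockingQueue and the default rejection handler. It only shows how many submissions a queue of threadCount * 5 turns away when every worker is busy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the Data Movement SDK.
class QueueCapacityDemo {

    // Submits `tasks` long-running jobs to a fixed-size pool with a bounded
    // queue and returns how many submissions were rejected.
    static int countRejections(int threadCount, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threadCount, threadCount, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(queueCapacity));
        // Every task blocks on this latch, simulating batches that are
        // still in flight while more work is being submitted.
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // Default AbortPolicy: all workers busy and the queue is full.
                rejected++;
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 12 workers + 60 queue slots accept 72 tasks; the rest are rejected.
        System.out.println(countRejections(12, 12 * 5, 1000)); // prints 928
        // A queue large enough for the whole job rejects nothing.
        System.out.println(countRejections(12, 1000, 1000));   // prints 0
    }
}
```

In this sketch the rejected submissions are simply counted. If the real executor reacts to a rejection by waiting and retrying, as the observed WAITING threads suggest, then each of those rejections would translate into time spent waiting rather than processing, which is consistent with the behavior described above.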