@ilejn ilejn commented Dec 17, 2024

Original PR:
ClickHouse#68674

Changelog Category:

  • Improvement

Changelog Entry:

Added a new set of metrics for Thread Pool introspection, providing deeper insights into thread pool performance and behavior.

Documentation Entry for User-Facing Changes

Several new metrics have been introduced to enhance monitoring and analysis of thread pools:

  • Obsolete Metrics: The LocalThread, LocalThreadActive, and LocalThreadScheduled metrics have been marked as obsolete. This change reflects recent architectural updates that reduce the relevance of local thread pools, as discussed in ClickHouse PR #47880. Users should adjust their monitoring configurations accordingly.

  • GlobalThreadPoolExpansions and Shrinks: These metrics track the frequency of expansions and contractions in the global thread pool, offering visibility into how the system adapts to varying workloads.

  • Thread Creation and Lock Wait Times: These metrics measure the time taken to start new threads and the time threads spend waiting for locks. These insights are crucial for identifying potential bottlenecks in thread management.

  • Job Wait Time: This metric captures the time elapsed from when a job is scheduled until it is executed, providing a clear view of the responsiveness of the thread pool.

Additionally, to prevent jobs from being delayed or stuck in the queue, the job queue within the ThreadPool has been made stable. This ensures that jobs with the same priority are processed in the order they were added, enabling fair and timely execution. While I don't have a test demonstrating issues caused by the previous (non-FIFO) queue behavior, such problems were observed during experiments with modified code.
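The FIFO-within-priority behavior described above can be sketched as follows. This is an illustrative Python model, not ClickHouse's actual C++ implementation: a plain binary heap is not stable for equal keys, but adding a monotonically increasing sequence number as a tie-breaker makes it so.

```python
import heapq
import itertools

class StablePriorityQueue:
    """Jobs with equal priority are executed in insertion (FIFO) order.

    A plain binary heap may pop equal-priority items in arbitrary order;
    the per-push sequence number breaks ties deterministically.
    """

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal priorities

    def push(self, priority, job):
        # Lower priority value runs earlier (illustrative convention).
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def pop(self):
        _priority, _seq, job = heapq.heappop(self._heap)
        return job

q = StablePriorityQueue()
q.push(1, "low-A")
q.push(0, "high-A")
q.push(1, "low-B")
q.push(0, "high-B")
print([q.pop() for _ in range(4)])  # → ['high-A', 'high-B', 'low-A', 'low-B']
```

Without the sequence number, "low-A" and "low-B" could come out in either order; with it, insertion order is preserved within each priority level.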

Key Metrics to Monitor

  • GlobalThreadPoolThreadCreationMicroseconds: Normally low; even thousands of threads can be created per second without delays. However, under certain conditions thread creation can be slow, potentially due to kernel-level locks around memory maps (unconfirmed). When this metric increases, it has a cascading effect: lock waits and JobWaitTime grow, and the thread pool underperforms.

  • GlobalThreadPoolExpansions and Shrinks: Ideally, these should not happen too often, indicating that the global pool size is well-tuned for the workload.

  • GlobalThreadPoolJobs and LocalThreadPoolJobs: Indicate the number of jobs processed by the thread pools, providing a sense of workload distribution.

  • LocalThreadPoolJobWaitTimeMicroseconds and GlobalThreadPoolJobWaitTimeMicroseconds: These metrics signal delays in the ThreadPool queue, which could be due to locks, out-of-order execution, CPU starvation, or other factors.

  • GlobalThreadPoolLockWaitMicroseconds and LocalThreadPoolLockWaitMicroseconds: These metrics allow for analysis of lock contention, with values proportional to the number of active threads waiting for the lock.

  • LocalThreadPoolBusyMicroseconds: When compared to LocalThreadPoolJobWaitTimeMicroseconds, this metric helps assess how efficiently the thread pool is operating, highlighting the balance between useful work and thread pool maintenance overhead.

  • LocalThreadPoolThreadCreationMicroseconds: Despite the name, this is actually the time spent pushing the task into the GlobalThread pool (it can be affected by locks and by slow creation of the 'real' threads).
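To make the derived signals concrete, here is a small sketch that computes two of them from the counter sums shown in the Sample Output section below. The efficiency formula is busy / (busy + job wait), matching the SQL query later in this description; the average-thread-creation figure assumes one thread creation per expansion, which is an approximation.

```python
# Counter sums taken from the sample output in this PR description.
metrics = {
    "LocalThreadPoolBusyMicroseconds": 16_998_330_672,
    "LocalThreadPoolJobWaitTimeMicroseconds": 13_590_969_274,
    "GlobalThreadPoolThreadCreationMicroseconds": 15_043_940,
    "GlobalThreadPoolExpansions": 4_132,
}

# Pool efficiency: fraction of thread time spent doing useful work
# versus waiting in the queue.
busy = metrics["LocalThreadPoolBusyMicroseconds"]
wait = metrics["LocalThreadPoolJobWaitTimeMicroseconds"]
efficiency_percent = round(100 * busy / (busy + wait), 1)

# Rough average time to create one global-pool thread,
# assuming one creation per expansion (an approximation).
avg_thread_creation_us = (
    metrics["GlobalThreadPoolThreadCreationMicroseconds"]
    / metrics["GlobalThreadPoolExpansions"]
)

print(efficiency_percent)             # → 55.6 (percent)
print(round(avg_thread_creation_us))  # → 3641 (microseconds per thread)
```

An average of a few milliseconds per thread creation is unremarkable; it is sustained growth of this figure, together with rising lock-wait and job-wait counters, that indicates trouble.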

Note: The counter metrics will only be partially visible in the system.query_log under ProfileEvents. Specifically, only those metrics incremented in the ::scheduleImpl method will be logged. This is because counters incremented in the ::worker method occur before the thread status is initialized, making them visible only in the global context and not on a per-query basis.

For practical examples and test results, you can refer to the following test.

Sample Output

sum(ProfileEvent_GlobalThreadPoolExpansions):                 4132
sum(ProfileEvent_GlobalThreadPoolShrinks):                    3649
sum(ProfileEvent_GlobalThreadPoolThreadCreationMicroseconds): 15043940
sum(ProfileEvent_GlobalThreadPoolLockWaitMicroseconds):       21807528619
sum(ProfileEvent_GlobalThreadPoolJobs):                       84616
sum(ProfileEvent_GlobalThreadPoolJobWaitTimeMicroseconds):    1787436966
sum(ProfileEvent_LocalThreadPoolExpansions):                  40452
sum(ProfileEvent_LocalThreadPoolShrinks):                     40452
sum(ProfileEvent_LocalThreadPoolThreadCreationMicroseconds):  3895767384
sum(ProfileEvent_LocalThreadPoolLockWaitMicroseconds):        12982962008
sum(ProfileEvent_LocalThreadPoolJobs):                        42925
sum(ProfileEvent_LocalThreadPoolBusyMicroseconds):            16998330672
sum(ProfileEvent_LocalThreadPoolJobWaitTimeMicroseconds):     13590969274

In that synthetic test the pool efficiency was only about 55%; the other 45% of the time was wasted on thread pool management.

SELECT
    toStartOfInterval(event_time, toIntervalMinute(10)) AS t,
    round((100 * sum(ProfileEvent_LocalThreadPoolBusyMicroseconds)) / (sum(ProfileEvent_LocalThreadPoolBusyMicroseconds) + sum(ProfileEvent_LocalThreadPoolJobWaitTimeMicroseconds)), 1) AS thread_pool_efficiency_percent
FROM system.metric_log
GROUP BY t
ORDER BY t ASC

Query id: 82d52a37-518b-4d34-b756-57f398613452

    ┌───────────────────t─┬─thread_pool_efficiency_percent─┐
 1. │ 2024-08-21 08:10:00 │                           93.4 │
 2. │ 2024-08-21 11:00:00 │                           94.1 │
 3. │ 2024-08-21 11:10:00 │                             86 │
 4. │ 2024-08-21 11:20:00 │                           74.4 │
 5. │ 2024-08-21 11:30:00 │                           45.9 │
 6. │ 2024-08-21 11:40:00 │                           68.6 │
 7. │ 2024-08-21 11:50:00 │                           95.6 │
 8. │ 2024-08-21 12:00:00 │                           43.6 │
 9. │ 2024-08-21 12:10:00 │                           40.9 │
10. │ 2024-08-21 12:20:00 │                           47.3 │
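The aggregation the query performs can be mirrored in plain Python to clarify what it computes. The rows below are made-up sample data, not real measurements; each is (event_time, LocalThreadPoolBusyMicroseconds, LocalThreadPoolJobWaitTimeMicroseconds) as it would appear in system.metric_log.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical metric_log rows: (event_time, busy_us, job_wait_us).
rows = [
    (datetime(2024, 8, 21, 11, 32), 9_000_000, 11_000_000),
    (datetime(2024, 8, 21, 11, 38), 6_000_000, 7_000_000),
    (datetime(2024, 8, 21, 11, 41), 12_000_000, 3_000_000),
]

def to_start_of_interval(t, minutes=10):
    # Equivalent of toStartOfInterval(event_time, toIntervalMinute(10)).
    return t.replace(minute=t.minute - t.minute % minutes,
                     second=0, microsecond=0)

# GROUP BY t with sum() over both counters.
sums = defaultdict(lambda: [0, 0])
for t, busy, wait in rows:
    bucket = sums[to_start_of_interval(t)]
    bucket[0] += busy
    bucket[1] += wait

for t in sorted(sums):
    busy, wait = sums[t]
    print(t, round(100 * busy / (busy + wait), 1))
# → 2024-08-21 11:30:00 45.5
# → 2024-08-21 11:40:00 80.0
```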

Another (indirect) signal of problems is how many threads actually have a task to do, which can be seen just by looking at the thread names (threads that don't have an actual job are named ThreadPool):

    ┌─name────────────┬─count()─┐
 1. │ QueryPipelineEx │    2001 │
 2. │ ThreadPool      │    1777 │
 3. │ TCPHandler      │     552 │
 4. │ BgSchPool       │     512 │
 5. │ QueryPullPipeEx │     271 │
 6. │ MergeTreeIndex  │      64 │
 7. │ Fetch           │      16 │
 8. │ BgDistSchPool   │      16 │

Or by comparing the total number of threads in the pool with the number of threads actively running a task:

┌───────────────now()─┬─metric────────────────┬─value─┬─description─────────────────────────────────────────────┐
1. │ 2024-08-21 16:44:19 │ Query                 │   544 │ Number of executing queries                             │
2. │ 2024-08-21 16:44:19 │ GlobalThread          │  4718 │ Number of threads in global thread pool.                │
3. │ 2024-08-21 16:44:19 │ GlobalThreadActive    │  3069 │ Number of threads in global thread pool running a task. │
4. │ 2024-08-21 16:44:19 │ GlobalThreadScheduled │  3224 │ Number of queued or active jobs in global thread pool.  │
   └─────────────────────┴───────────────────────┴───────┴─────────────────────────────────────────────────────────┘


quantiles(0.001, 0.5, 0.999)(CurrentMetric_GlobalThread):          [1323, 1836, 4891.508]
quantiles(0.001, 0.5, 0.999)(CurrentMetric_GlobalThreadActive):    [805.9999999999999, 1623, 3654.316]
quantiles(0.001, 0.5, 0.999)(CurrentMetric_GlobalThreadScheduled): [805.9999999999999, 1723, 3720.1919999999996]

- Introduced performance metrics for better monitoring and troubleshooting of ThreadPool.
…rity), otherwise some jobs can stay in queue untaken for a long time
@ilejn changed the title from "24.8.4 Backport PR #68674 Thread pool metrics" to "24.8.8 Backport PR #68674 Thread pool metrics" on Dec 17, 2024
altinity-robot commented Dec 17, 2024

This is an automated comment for commit 01838c6 with a description of existing statuses. It is updated for the latest CI run.

Failed checks

  • Integration tests (❌ failure): The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests.
  • Sign aarch64 (❌ error): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
  • Sign release (❌ error): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
  • Stateful tests (❌ error): Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc.
  • Stateless tests (❌ failure): Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc.
Successful checks
  • Builds (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
  • Compatibility check (✅ success): Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help.
  • Docker keeper image (✅ success): The check to build and optionally push the mentioned image to docker hub.
  • Docker server image (✅ success): The check to build and optionally push the mentioned image to docker hub.
  • Install packages (✅ success): Checks that the built packages are installable in a clear environment.
  • Ready for release (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
  • Stress test (✅ success): Runs stateless functional tests concurrently from several clients to detect concurrency-related errors.

@Enmk Enmk merged commit 170bf37 into customizations/24.8.8 Dec 19, 2024
212 of 248 checks passed