-
Notifications
You must be signed in to change notification settings - Fork 255
Rewrite ConnectionPool for efficiency #328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite ConnectionPool for efficiency #328
Conversation
|
This PR is related to #326. |
|
|
||
| private | ||
| def cache_enabled? | ||
| idle_timeout && idle_timeout > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we're back to either disable the pooling entirely, or to have to prune connection?
I don't know if I'm alone, but in our case a full deploy take about 4 to 5 minutes and we do it from inside the datacenter, so pruning idle connections is really not a concern.
Any way we could disable it without disabling pooling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal was to improve efficiency across the board, rather than add conditionals as workarounds. The point is that If you just set idle_timeout to a high value, you shouldn't pay any performance penalty, because a mutex is no longer being used for pruning.
But, like I said, the proof will be in the numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we'll see. I didn't have time to test yet.
|
👍 |
|
Thanks for the review! It is nice to see that SSHKit is no longer a hot spot in the profile output, even if this doesn't make much difference in terms of wall time. There are two skipped unit tests that I need to fix before this can be merged, as well as edits to the CHANGELOG. I'll try to get those wrapped up this week. |
e76e556 to
69665a9
Compare
69665a9 to
9dd807c
Compare
|
I've rebased and squash this to a single commit, added a detailed commit message, and updated CHANGELOG and README. @leehambley This is ready for final review and merge. |
9dd807c to
72cf0b4
Compare
|
@leehambley This has been rebased and is ready to merge. |
This is a rewrite of SSHKit's ConnectionPool. The advantages of this version
are:
* Cleaner API
* Less mutex locking
* Asynchronous connection eviction/closing
* Faster (slightly)
Cleaner API
===========
# Old API
entry = pool.checkout(host, username, &Net::SSH.method(:start))
begin
# do stuff with entry.connection
ensure
pool.checkin(entry)
end
# New API
pool.with(Net::SSH.method(:start), host, username) do |connection|
# do stuff with connection
end
Less mutex locking
==================
Previously, every checkin and checkout would use a mutex to lock the entire pool
every time. Now, each unique connection key has its own separate mutex, so any
locking is done per host, and not for the entire pool.
Asynchronous connection eviction/closing
========================================
Previously, every checkin and checkout would trigger an expiration check for all
connections in the pool. Thus every thread must potentially wait for connections
used by other threads to be cleaned up and closed.
Now this logic is removed. Eviction happens in a separate background thread
periodically sweeps the connections to identify and close stale ones.
Faster (slightly)
=================
Profile before rewrite:
TOTAL (pct) SAMPLES (pct) FRAME
54 (12.0%) 54 (12.0%) SSHKit::Backend::ConnectionPool::Entry#expired?
50 (11.1%) 50 (11.1%) Net::SSH::Compat.io_select
28 (6.2%) 28 (6.2%) block (2 levels) in Net::SSH::Connection::Channel#forward_local_env
11 (2.4%) 11 (2.4%) Net::SSH::Transport::HMAC::Abstract.mac_length
11 (2.4%) 10 (2.2%) Net::SSH::Buffer#read
22 (4.9%) 10 (2.2%) Net::SSH::BufferedIo#fill
9 (2.0%) 9 (2.0%) Net::SSH::Transport::State#update_next_iv
12 (2.7%) 9 (2.0%) Net::SSH::BufferedIo#send_pending
63 (14.0%) 8 (1.8%) block (2 levels) in SSHKit::Backend::ConnectionPool#prune_expired
8 (1.8%) 8 (1.8%) Net::SSH::Buffer#initialize
7 (1.6%) 7 (1.6%) Net::SSH::Buffer#append
After rewrite:
TOTAL (pct) SAMPLES (pct) FRAME
383 (11.7%) 383 (11.7%) block (2 levels) in Net::SSH::Connection::Channel#forward_local_env
331 (10.2%) 331 (10.2%) Net::SSH::Compat.io_select
116 (3.6%) 102 (3.1%) Net::SSH::Buffer#read
100 (3.1%) 100 (3.1%) Net::SSH::Transport::HMAC::Abstract.digest_class
100 (3.1%) 100 (3.1%) Net::SSH::Transport::HMAC::Abstract.mac_length
133 (4.1%) 80 (2.5%) Net::SSH::BufferedIo#send_pending
139 (4.3%) 74 (2.3%) Net::SSH::BufferedIo#fill
71 (2.2%) 71 (2.2%) Net::SSH::Transport::State#update_next_iv
80 (2.5%) 71 (2.2%) SSHKit::Runner::Parallel#execute
57 (1.7%) 57 (1.7%) Net::SSH::Buffer#initialize
55 (1.7%) 55 (1.7%) Net::SSH::BufferedIo#output
Note that ConnectionPool is no longer listed.
72cf0b4 to
6de8614
Compare
Rewrite ConnectionPool for efficiency
|
Thanks @mattbrictson |
## [1.11.3][] (2016-09-16)
* Fix known_hosts caching to match on the entire hostlist
[PR #364](capistrano/sshkit#364) @byroot
## [1.11.2][] (2016-07-29)
### Bug fixes
* Fixed a crash occurring when `Host@keys` was set to a non-Enumerable.
@xavierholt [PR #360](capistrano/sshkit#360)
## [1.11.1][] (2016-06-17)
### Bug fixes
* Fixed a regression in 1.11.0 that would cause
`ArgumentError: invalid option(s): known_hosts` in some older versions of
net-ssh. @byroot [#357](capistrano/sshkit#357)
## [1.11.0][] (2016-06-14)
### Bug fixes
* Fixed colorized output alignment in Logger::Pretty. @xavierholt
[PR #349](capistrano/sshkit#349)
* Fixed a bug that prevented nested `with` calls
[#43](capistrano/sshkit#43)
### Other changes
* Known hosts lookup optimization is now enabled by default. @byroot
## 1.10.0 (2016-04-22)
* You can now opt-in to caching of SSH's known_hosts file for a speed boost
when deploying to a large fleet of servers. Refer to the
[README](https://github.com/capistrano/sshkit/tree/v1.10.0#known-hosts-caching) for
details. We plan to turn this on by default in a future version of SSHKit.
[PR #330](capistrano/sshkit#330) @byroot
* SSHKit now explicitly closes its pooled SSH connections when Ruby exits;
this fixes `zlib(finalizer): the stream was freed prematurely` warnings
[PR #343](capistrano/sshkit#343) @mattbrictson
* Allow command map entries (`SSHKit::CommandMap#[]`) to be Procs
[PR #310](capistrano/sshkit#310)
@mikz
## 1.9.0
**Refer to the 1.9.0.rc1 release notes for a full list of new features, fixes,
and potentially breaking changes since SSHKit 1.8.1.** There are no changes
since 1.9.0.rc1.
## 1.9.0.rc1
### Potentially breaking changes
* The SSHKit DSL is no longer automatically included when you `require` it.
**This means you must now explicitly `include SSHKit::DSL`.**
See [PR #219](capistrano/sshkit#219) for details.
@beatrichartz
* `SSHKit::Backend::Printer#test` now always returns true
[PR #312](capistrano/sshkit#312) @mikz
### New features
* `SSHKit::Formatter::Abstract` now accepts an optional Hash of options
[PR #308](capistrano/sshkit#308) @mattbrictson
* Add `SSHKit::Backend.current` so that Capistrano plugin authors can refactor
helper methods and still have easy access to the currently-executing Backend
without having to use global variables.
* Add `SSHKit.config.default_runner` options that allows to override default command runner.
This option also accepts a name of the custom runner class.
* The ConnectionPool has been rewritten in this release to be more efficient
and have a cleaner internal API. You can still completely disable the pool
by setting `SSHKit::Backend::Netssh.pool.idle_timeout = 0`.
@mattbrictson @byroot [PR #328](capistrano/sshkit#328)
### Bug fixes
* make sure working directory for commands is properly cleared after `within` blocks
[PR #307](capistrano/sshkit#307)
@steved
* display more accurate string for commands with spaces being output in `Formatter::Pretty`
[PR #304](capistrano/sshkit#304)
@steved
[PR #319](capistrano/sshkit#319) @mattbrictson
* Fix a race condition experienced in JRuby that could cause multi-server
deploys to fail. [PR #322](capistrano/sshkit#322)
@mattbrictson

This is a rewrite of SSHKit's ConnectionPool. The advantages of this version are:
Cleaner API
Less mutex locking
Previously, every
checkinandcheckoutwould use a mutex to lock the entire pool every time. Now, each unique connection key has its own separate mutex, so any locking is done per host, and not for the entire pool.Asynchronous connection eviction/closing
Previously, every
checkinandcheckoutwould trigger an expiration check for all connections in the pool. Thus every thread must potentially wait for connections used by other threads to be cleaned up and closed.Now this logic is removed. Eviction happens in a separate background thread periodically sweeps the connections to identify and close stale ones.
Faster (hopefully)
In my testing the new pool is faster, but only marginally so. However this is in a artificial environment using
sleepinstead of real SSH commands.@byroot – Can you test this branch and let me know how it performs against real servers?
TODO: