[RFC] vsock Design Doc #1044
Added the vsock design proposal doc. An RFC PR will be posted to discuss it. Signed-off-by: Dan Horobeanu <[email protected]>
sboeuf left a comment:
Looks good, just a few comments.
- It should maintain the Firecracker security barrier, isolating the guest from
  direct access to any host resources; and
- It should support multiple communication channels; and
I thought the discussion on #650 concluded that this hybrid solution, in order to be simple and efficient, should only target a 1:1 connection between host and guest.
Did I misunderstand or did something change?
Yes, things did change: we gathered some more use-case data and figured out a single-channel approach wouldn't cut it, as it would offload too much work to the user.
Ok!
When the virtio-vsock device model in Firecracker detects a connection
request coming from the guest (a VIRTIO_VSOCK_OP_REQUEST packet), it will try
to forward the connection to an AF_UNIX socket listening on the host, at
`/path/to/vsock_{CID}:{PORT}`, where `{PORT}` is the destination port, as
I think /path/to/vsock_{CID}_{PORT} would be more appropriate as a path name here.
Will it cause too many files under the directory if many firecracker instances are created on the same host?
The unix socket paths are relative to the firecracker jail/chroot. So, one jail per vm.
1. Host: At VM configuration time, add a virtio-vsock device with `CID` and
   `PATH`;
2. Host: create and listen on an AF_UNIX socket at `{PATH}_{CID}:{PORT}`;
3. Guest: create an AF_VSOCK socket and issue a `connect()` call to `HOST_CID`
You should keep CID here, since no HOST_CID has been mentioned so far.
This is the host CID, though (i.e. the le64 constant 2). It's the destination CID specified by outgoing packets, originating inside the guest.
Well sure, but you're supposed to have a single CID per guest, meaning that when you create the virtio-vsock device from Firecracker, with CID and PATH, I would expect this CID to be the same as the host CID provided through the outgoing packets.
By CID, I meant guest CID, since the host CID is just a constant. I've included the CID as part of the Unix socket name, because we didn't rule out the possibility of having multiple virtio-vsock devices per guest (even though we couldn't think of a use-case where multiple devices would be needed). Is there something enforcing the single device limit?
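For readers following the guest-initiated flow discussed above, the host side of step 2 could be sketched roughly as follows. This is only an illustration: the helper names `guest_port_socket_path` and `listen_for_guest` are hypothetical, and the `{PATH}_{CID}:{PORT}` naming is taken from the proposal as quoted (a reviewer suggested `_` instead of `:` as the separator, so the final format may differ).

```python
import os
import socket


def guest_port_socket_path(base_path: str, guest_cid: int, port: int) -> str:
    """Build the per-port AF_UNIX path, assuming the `{PATH}_{CID}:{PORT}` naming."""
    return f"{base_path}_{guest_cid}:{port}"


def listen_for_guest(base_path: str, guest_cid: int, port: int) -> socket.socket:
    """Create the AF_UNIX listener that a guest connect() to `port` would be forwarded to.

    Hypothetical helper: it only has to exist by the time the guest calls connect().
    """
    path = guest_port_socket_path(base_path, guest_cid, port)
    if os.path.exists(path):
        os.unlink(path)  # drop a stale socket file left by a previous run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen()
    return server
```

A host server would call `listen_for_guest(path, cid, port)` and then `accept()` on the returned socket. Since the paths are relative to the per-VM jail, each VM would get its own set of socket files.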
1. Host: At VM configuration time, add a virtio-vsock device with `CID` and
   `PATH`;
2. Guest: create an AF_VSOCK socket and `listen()` on `PORT`;
3. Host: `connect()` to AF_UNIX at `{PATH}_{CID}`;
So because there's no knowledge about PORT at this time, the validation of the connect() is purely about the AF_UNIX socket, right?
Yes, but the idea would be to have wrappers for some languages (Go and Rust) that do the connect() and send() as part of, e.g., a Go-style Dial() call.
Yes, makes sense. But you need to come up with a name for this type of hybrid connection, otherwise people are going to be confused.
Yes. It is the burden of the host server to create this socket, thus signaling it is waiting for incoming connections (from the guest). We can wrap the socket creation up in something like a listen() call, in a lib that the users can rely on for Firecracker vsock communication, similar to what HyperKit does.
   `PATH`;
2. Guest: create an AF_VSOCK socket and `listen()` on `PORT`;
3. Host: `connect()` to AF_UNIX at `{PATH}_{CID}`;
4. Host: `send()` *connect {PORT}* data packet on that connection;
Unfortunately, that puts some burden on the user's shoulders: we would expect each host program to send this extra request. It would be worth developing a package for this in a few languages (Rust, Go).
For the similar HyperKit implementation, we wrote: https://github.com/linuxkit/virtsock/blob/master/pkg/vsock/hyperkit_darwin.go
This is a generic Go wrapper for virtio vsock and hyper-v sockets. Happy to extend this to support the firecracker interface too.
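To make the wrapper idea from this thread concrete, here is a minimal Python sketch of a Dial-style helper for the host-initiated flow (steps 3 and 4 above). The function name `dial` is hypothetical, and the wire format of the *connect {PORT}* packet is an assumption: the proposal as quoted does not specify its encoding, so a newline-terminated ASCII line is used purely for illustration.

```python
import socket


def dial(base_path: str, guest_cid: int, port: int) -> socket.socket:
    """Hypothetical Dial()-style helper: connect() plus the port request in one call."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Step 3: connect to the device's AF_UNIX socket at `{PATH}_{CID}`.
        sock.connect(f"{base_path}_{guest_cid}")
        # Step 4: ask to be forwarded to the guest port.
        # ASSUMPTION: the request is the ASCII line "connect <port>\n";
        # the actual encoding is not defined in the quoted proposal.
        sock.sendall(f"connect {port}\n".encode("ascii"))
    except OSError:
        sock.close()
        raise
    return sock
```

With such a helper, a host program would only call `dial(path, cid, port)` and then use the returned socket as an ordinary stream, which is the kind of burden reduction the reviewers are asking for.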
A single-channel mechanism, that would offload multiplexing duties to the
user, similar to virtio-console. This would require the smallest
implementation effort, but would drastically increase the integration effort
needed from the Firecracker users.
True, not supporting multiple ports would put an additional burden on the user's guest code.
An established approach to guest ↔ host communication is provided by the VirtIO
vsock device. In its current Linux implementation, this approach requires three
actors: the virtio-vsock driver (provided by the guest OS), the virtio-vosck
typo here, virtio-vsock*
# Other Ideas

Multiple ideas have been considered and evaluated, before ariving at the
arriving*
1. Host: At VM configuration time, add a virtio-vsock device with `CID` and
   `PATH`;
2. Host: create and listen on an AF_UNIX socket at `{PATH}_{CID}:{PORT}`;
Is it required for the AF_UNIX socket to be created before the VM starts/is configured, or is the requirement that the AF_UNIX socket is created before the guest issues the connect() call?
The AF_UNIX socket only needs to be present when the guest calls connect().
running inside the guest VM, while bypassing vhost kernel code on the host. To
that end, Firecracker would implement the virtio-vsock device model, and require
that some additional userspace software (hereinafter called the host agent) be
present on the host, in order order to handle the vsock protocol specifics.
Duplicate "order"
## Ethernet Host To Ethernet Guest

Using traditional network interfaces to communicate between guest and host
You might want to add another link here to the document covering this proposal, which you referenced earlier.
Closing in favour of #1106
This is a request for comments on the Firecracker vsock design document. This document proposes a technical solution to #650.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.