-
Notifications
You must be signed in to change notification settings - Fork 26
Design documentation #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@liujing2 there is a bug when generating repositories from templates which results in the submodule rust-vmm-ci not being initialized. This is making the buildkite pipeline fail. Can you add a first commit that initializes the submodule by running: |
.gitignore
Outdated
| @@ -0,0 +1,2 @@ | |||
| /target | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The gitignore should be part of the crate-template. Can you please also submit a PR in crate-template to add it.
| Main components are listed below. | ||
| - A `Resource` trait representing for all kinds of resources. | ||
| ```rust | ||
| pub trait Resource {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since SystemAllocator will be defined in the VMM, do we still need the traits to define Resource? What advantage do they bring in this particular case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have strong idea to have it or not. Both are fine. Resource is used for ResourceAllocator, which makes resources APIs identical for VMM implementation and extending resources in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought a bit more about this and I think there is one reason for having traits for Resource: with a Resource trait it means that vm-allocator does not need to depend on other crates (like virtio and vm-memory) for getting the actual types of the resources needed for allocating a resource. Does this assumption makes sense in the case of PCI and MSI? Ping @chao-p and @sboeuf.
I don't think that having a ResourceAllocator trait brings any value though. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, using a trait looks fine to me since it might bring some flexibility for the future, if we were wanting to add another type of resource. I don't think that without the trait we have a dependency on virtio or vm-memory, since the vm-allocator is more like a fully independent component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the relationship between the vm-device and vm-allocator?
I think it should be that the vm-device crate depends on the vm-allocator crate.
And instead of define the Resource trait, we should actually define traits for MMIO address ranges, PIO ranges, IRQ descriptors. For example, we may define:
enum InterrupConfig {
LEGACY,
MSI{addr_high: u32, addr_low: u32, data: u32},
}
trait InterruptSource {
fn enable() -> Result<(), Error>;
fn disable() -> Result<>;
fn configure(config: InterruptConfig);
fn set_mask(masked: bool);
fn trigger(reason: u32);
fn ack(reason: u32);
}
trait InterruptAllocator {
fn allocate(count: u32) -> Result<Vec<impl InterruptSource>, Error>;
fn free(Vec<impl InterruptSource>>);
}For MMIO and PIO, it's similar with one special requirements for alignment because PCI bar enforces alignment. So the the allocate fn seems like:
trait MmioAddressAllocator {
fn allocate(size, alignment, Option<min>, Option<max>);
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiangliu Or have 3 resource traits: an Address one that could be implemented for both MMIO and PIO, an Interrupt for MSI, INTX, etc, and an Id for allocating unique IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiangliu It seems you would like to use vm-device to manage MSI functions(InterruptSource) , not only manage the allocated vectors?
| Main components are listed below. | ||
| - A `Resource` trait representing for all kinds of resources. | ||
| ```rust | ||
| pub trait Resource {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought a bit more about this and I think there is one reason for having traits for Resource: with a Resource trait it means that vm-allocator does not need to depend on other crates (like virtio and vm-memory) for getting the actual types of the resources needed for allocating a resource. Does this assumption makes sense in the case of PCI and MSI? Ping @chao-p and @sboeuf.
I don't think that having a ResourceAllocator trait brings any value though. What do you think?
|
|
||
| - A `ResourceSize` trait representing for all kinds of resource size. | ||
| ```rust | ||
| pub trait ResourceSize {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a separate trait for the size of a resource is making the code complicated (for example in the update case where you need to send 4 parameters just to update one resource). Isn't it possible to specify that Resource should implement std::marker::Sized?
| Let VMM design a `SystemAllocator` struct because only VMM knows exactly which resources it | ||
| needs and whether it needs thread safe. | ||
|
|
||
| Meanwhile, different resource instances are not suggested to being put together using like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also one of the reasons why I think that we should not export a ResourceAllocator in the first place. We should make it hard to misuse the public interface of a crate.
| fn free(&mut self, address: GuestAddress, size: GuestUsize) {...} | ||
|
|
||
| /// Update an address range with a new one. | ||
| fn update(&mut self, old_resource: Option<T>, old_size: Option<S>, new_resource: T, new_size: S) {...} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the parameters this function currently has it looks more like a remove_and_replace than an update.
The update function does not make sense in the context of IdAllocator I think. Also in the context of other allocators. Another reason for not having a ResourceAllocator as a trait because not all resources can follow the same interface.
| TODO: This section should have a high-level design of the crate. | ||
| It provides different kinds of resources that might be used by VM, such as | ||
| memory-maped I/O address space, port I/O address space, legacy IRQ numbers, MSI/MSI-X | ||
| vectors, device instance id, etc. All these resources should be implemented with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about "Generic MSI/PCI MSI/PCI MSI-x IRQs"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiangliu What would be the difference between PCI MSI and generic MSI?
| Main components are listed below. | ||
| - A `Resource` trait representing for all kinds of resources. | ||
| ```rust | ||
| pub trait Resource {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the relationship between the vm-device and vm-allocator?
I think it should be that the vm-device crate depends on the vm-allocator crate.
And instead of define the Resource trait, we should actually define traits for MMIO address ranges, PIO ranges, IRQ descriptors. For example, we may define:
enum InterrupConfig {
LEGACY,
MSI{addr_high: u32, addr_low: u32, data: u32},
}
trait InterruptSource {
fn enable() -> Result<(), Error>;
fn disable() -> Result<>;
fn configure(config: InterruptConfig);
fn set_mask(masked: bool);
fn trigger(reason: u32);
fn ack(reason: u32);
}
trait InterruptAllocator {
fn allocate(count: u32) -> Result<Vec<impl InterruptSource>, Error>;
fn free(Vec<impl InterruptSource>>);
}For MMIO and PIO, it's similar with one special requirements for alignment because PCI bar enforces alignment. So the the allocate fn seems like:
trait MmioAddressAllocator {
fn allocate(size, alignment, Option<min>, Option<max>);
}Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
This is added. |
|
It seems we're running a bit in circle between that PR and the vm-device I'd like to use this PR to discuss about an updated design. vm-allocatorThe vm-allocator crate is managing guest resources, mainly related to devices.
The ResourcesBased on our collective (crosvm, Firecracker, cloud-hypervisor) experience and
Resource TraitsI think the few resources types that we have to manage are too different to try Instead, I think it would make more sense to define resources as an pub enum VmResource {
Address(AddressResource),
LegacyInterrupt(u32),
MsiInterrupt(u32, AddressResource),
Id(u32),
}Resource allocatorsIt should be up to each VMM to decide which, how and how many of the above The overall VMM allocator would be made of several resource allocators, and device For example, a PCI device emulation creation API could request 2 Device ManagerThe current As a consequence, the The
Device TraitDevice IO operations on a given bus can be represented by a simple pub trait Device: Send {
/// Get the device name.
fn name(&self) -> String;
/// Read from the guest physical address `addr` to `data`.
fn read(&mut self, addr: GuestAddress, data: &mut [u8], io_type: IoType);
/// Write `data` to the guest physical address `addr`.
fn write(&mut self, addr: GuestAddress, data: &[u8], io_type: IoType);IO BusOne of the main requirement for the pub struct Bus {
devices: BTreeMap<Range, Arc<Mutex<dyn Device>>>,
}Usefulness of a
|
|
On Aug 4, 2019, at 9:22 PM, Samuel Ortiz ***@***.*** ***@***.***>> wrote:
It seems we're running a bit in circle between that PR and the vm-device
one <rust-vmm/vm-device#3>.
Thanks for taking efforts to shoot this issue down.
Seems one way to break the circle is to let:
1) vm-allocator focus on managing(allocating and freeing) resources.
2) vm-device implements methods to access/control resources allocated from vm-allocator.
The IRQ has caused us much trouble in previous discussions. With above policy, it gets simpler,
vm-allocator only manages IRQ numbers, and vm-device implements legacy IRQ/MSI objects for IRQ numbers allocated from vm-allocator.
I'd like to use this PR to discuss about an updated design.
vm-allocator
The vm-allocator crate is managing guest resources, mainly related to devices.
When the VMM is responsible for directly reseving and exposing resources to a
guest, the vm-allocator crate allocates, reserves and manages resources. Its
main requirements are:
Keep track of a guest's reserved and allocated resources. In some cases, the
vm-allocator will allocate and expose resources for a guest to consume, while
in others it only keeps track of resources that are allocated from the guest.
For example, when the VMM creates virtio devices, it must allocate MMIO or PCI
and interrupt resources itself and then let the guest drivers know about them.
In other cases, like e.g. when a guest driver decides to reprogram a device,
the vm-allocator will only update its allocated resources map to match the
guest driven changes.
In any cases, the guest and the VMM views of the system resources should be
coherent.
When being asked by the VMM or the device emulation code, the vm-allocator
will either allocate and return a non overlapping set of resources or fail,
based on its coherent and full knowlege of the system resources map.
The allocator part in the vm-allocator name might be confusing as this crate
basically reserves and keeps track of resources on behalf of the guest. Naming is hard…
Vm-resources or vmm-resource-manager?
Resources
Based on our collective (crosvm, Firecracker, cloud-hypervisor) experience and
knowledge, I think we can group guest reesourcses in 3 categories:
Address space resources: The vm-allocator needs to manage the guest address
spaces. In practice, all 3 above mentioned VMMs deal with 2 address spaces:
the PIO and MMIO ones. The vm-allocator should be responsible for reserving
and allocating address ranges from those address spaces. An address space
resource could thus be defined as:
pub struct Range(pub GuestAddress, pub GuestUsize);
pub enum AddressType {
PIO,
MMIO,
}
pub struct AddressResource {
type: AddressType,
range: Range,
alignment: GuestUsize
}
Interrupt resources: Another device related resource that the vm-allocator
has to manage are interrupts. Together with memory (PIO and MMIO), interrupts
entirely define a device resources.
Devices can trigger interrupts either by toggling physical lines or by
writing to specific memory locations. The former are legacy interrupts and
are defined through a system wide list of IRQ numbers also know as GSIs. The
latter are Message Signalled Interrupts (MSIs) and are defined by MSI vectors
that include a GSI and a memory address. Thus an interrupt resource could be
defined ad:
pub struct InterruptResource {
gsi: u32,
address: Option<AddressResource>,
}
or:
pub struct LegacyInterruptResource {
gsi: u32,
}
pub struct MsiInterruptResource {
gsi: u32,
address: AddressResource,
}
I have some different ideas here.
1) in vm-allocator,
pub struct InterruptResource(u32);
2) in vm-device,
pub trait InterruptSource {
fn trigger(source: u32);
fn ack(source: u32);
fn mask();
fn unmask();
….
}
pub struct MsiIrqManager {
address: AddressResource,
vmfd: KvmVmFd( or RawFd),
}
Impl MsiIrqManager {
fn new(address: AddressResource) -> MsiIrqManager {…}
fn create_irq(irq: InterruptResource) -> MsiInterruptSource {…}
}
pub struct MsiInterruptSource {
irq: InterruptResource,
irqfd: RawFd,
mgr: Arc<MsiIrqManager>
}
impl InterruptSource for MsiInterruptSource {
}
pub struct LegacyInterruptSource {
irq: InterruptResource,
status: Atomic<u32>,
irqfd: RawFd,
eventfd: RawFd,
}
impl InterruptSource for LegacyInterruptSource {
}
Id resources: The vm-allocator could also keep track of internal, guest
unique IDs for example to uniquely identify resources.
Resource Traits
I think the few resources types that we have to manage are too different to try
to put them under the same interface or Trait. Moreover, serialization of those
resources will be easier by not being Traits.
Instead, I think it would make more sense to define resources as an Enum,
based on the above defined structures:
pub enum VmResource {
Address(AddressResource),
LegacyInterrupt(u32),
MsiInterrupt(u32, AddressResource),
Suggest to merge LegacyInterrupt and MsiInterrupt:)
Id(u32),
}
Resource allocators
It should be up to each VMM to decide which, how and how many of the above
described resources it wants to manage and allocate through the vm-allocator
crate. As such I believe it would make sense to define resource allocator traits
for each one of the above described resources: AddressAllocator, LegacyInterruptAllocator, etc.
The overall VMM allocator would be made of several resource allocators, and device
emulation implementations would require a specific set of allocators depending on the resources
they'd need to allocate and update.
For example, a PCI device emulation creation API could request 2 AddressAllocators (one PIO, one MMIO), and 1 MsiInterruptAllocator.
Device Manager
The current vm-device PR tightly couples the device model to the vm-allocator
crate, by passing a complete SystemAllocator to the DeviceManager creation
API. The idea behind the current design is that the DeviceManager is responsible for
doing the resource allocations, and some concerns have been raised about that
approach. It may make more sense to separate the device management from the
resource management to give more flexibility to VMM implementations.
As a consequence, the vm-allocator and vm-device logical coupling could be
made a lot weaker by having the DeviceManager API only consume resources that have
been allocated, reserved and updated somewhere else (typically by the VMM or the
device emulation code).
The DeviceManager requirements is two-fold:
Keep track of all registered devices and link them back to their resources.
We want to be able to have once central point of knowledge for the guest
devices and their associated resources.
Handle all device IO. For each IO VM exit (PIO or MMIO), the DeviceManager
should be able to resolve the VM exit address back to a registered device.
It will then forward the VM exit data back to the device.
Device Trait
Device IO operations on a given bus can be represented by a simple Device or
BusDevice trait:
pub trait Device: Send {
/// Get the device name.
fn name(&self) -> String;
/// Read from the guest physical address `addr` to `data`.
fn read(&mut self, addr: GuestAddress, data: &mut [u8], io_type: IoType);
/// Write `data` to the guest physical address `addr`.
fn write(&mut self, addr: GuestAddress, data: &[u8], io_type: IoType);
IO Bus
One of the main requirement for the DeviceManager is to handle IO VM exits.
In practice, there are 2 kinds of IO VM exits: PIO and MMIO ones. Although we
could define a Bus trait, I believe that in practice we'd end up always
handling 2 of them and not more. For the sake of simplicity, it may make more
sense to only define a Bus structure:
pub struct Bus {
devices: BTreeMap<Range, Arc<Mutex<dyn Device>>>,
}
Suggest using devices: BTreeMap<Range, Arc<dyn Device>>, the Device Trait objects make decisions about how to themselves from concurrent access.
Usefulness of a Bus Trait to be discussed...
DeviceManager and vm-allocator relationship
In that updated design, the DeviceManager would no longer be responsible for
allocating devices resources. Instead, it would be up to the main VMM code to
do so. The main VMM code would:
Create all necessary devices.
Allocate all device resources, be it directly or through the device creation
routines. Eventually, when creating a device, a VMM knows which resources
it can allocates up front.
Register and unregister all created devices, with their related resources
allocated at device creation time.
Some devices get their resources allocated or updated by the guest directly,
e.g. for PCI MSIs or for PCI reprogramming. In those case, the DeviceManager
API should be notified (by the device emulation code) about such changes.
We need to specify the way for device objects to notify device managers, it may get hard to implement when taking rust ownership and crate dependencies into account.
DeviceManager API
Based on all the above mentioned assumptions and proposals, the simplified
DeviceManager API would look like:
impl DeviceManager {
/// Create a new `DeviceManager`
pub fn new() -> Self
/// Register a new device with its parent and resources.
pub fn register_device(
&mut self,
dev: Arc<Mutex<dyn Device>>,
parent: Option<Arc<Mutex<dyn Device>>>,
resources: &mut Vec<VmResource>,
) -> Result<(u32)> {
/// Unregister a device from `DeviceManager`.
pub fn unregister_device(&mut self, instance_id: u32) -> Result<()> {
/// Update a device.
pub fn update_device(
&mut self,
instance_id: u32,
new_resources: &mut Vec<VmResource>,
)
}
Still not clear about the way to make use of the update_device() interface.
Let imagine the flow to reprogram a PCI bar.
1) guest writes to the bar register in PCI configuration space
2) guest traps into VMM
3) the device driver figures out it’s a PCI reprogram request
4) the device driver needs to reallocate/reserve the MMIO resource configured by the guest OS. Essentially the device driver needs to call into the device manager/vmm/resource allocator. But the ownership hierarchy is:
Vmm ----> vm_allocator
|---> device_manager —> device_object
So we need to provide a way for the device_object to call into the vm_allocator.
This issue has troubled me for a long time:(
… The only potential dependencies between vm-device and vm-allocator would be
through the VmResource definition.
As a consequence, a minimal VMM could decice to entirely avoid defining and
using resource allocators and still be able to use the DeviceManager API.
@liujing2 <https://github.com/liujing2> @andreeaflorescu <https://github.com/andreeaflorescu> @jiangliu <https://github.com/jiangliu> @sboeuf <https://github.com/sboeuf> WDYT? Based on your experience, and if you think this looks like a simple enough, reasonable base to start from, let's try to describe what would prevent such proposal from being used in our VMMs.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#1?email_source=notifications&email_token=AAOXR7BXO352KW32Y34HSTLQC3JY3A5CNFSM4IHY7YG2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD3QBWGA#issuecomment-518003480>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAOXR7DQDR7SLVEAJ74IASLQC3JY3ANCNFSM4IHY7YGQ>.
|
|
@sameo Thank you for the detailed clarification! |
No description provided.