Hacker News

You just don't share certain devices, like Bluetooth. The "main" kernel will probably own the boot process and manage some devices exclusively. I think the real advantage is running certain applications isolated within a CPU subset, protected/contained behind a dedicated kernel. You don't have the slowdown of VMs, or have to fight against the isolation sieve that is docker.

yjftsjthsd-h 3 days ago [ - ]

That's fine for

  - Enhanced security through kernel-level separation
  - Better resource utilization than traditional VM (KVM, Xen etc.)

but I don't think it works for

  - Improved fault isolation between different workloads
  - Potential zero-down kernel update with KHO (Kernel Hand Over)

since if the "main" kernel crashes or is supposed to get upgraded then you have to hand hardware back to it.

raron 3 days ago [ - ]

> since if the "main" kernel crashes or is supposed to get upgraded then you have to hand hardware back to it.

Isn't that similar to starting up from hibernate to disk? Basically all of your peripherals are powered off and so probably can not keep their state.

Also you can actually stop a disk (member of a RAID device), remove the PCIe-SATA HBA card it is attached to, replace it with a different one, connect all back together without any user-space application noticing it.

yjftsjthsd-h 2 days ago [ - ]

I trust hardware to mostly be reasonable when starting from off, but we're discussing the case where it's on and stays on but gets handed from one kernel to another and I don't trust it nearly as well in that case. I think the comparison is kexec rather than hibernate, and while it often works, kexec can result in misbehaving hardware.

yencabulator 6 hours ago [ - ]

Many peripherals have a mechanism to reset the device, to get it back to a known good state. Generally device drivers will do this when they receive a message they don't understand from the device, or a command sent to the device times out without response.

Here's my graphics chip getting reset:

  [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
  amdgpu 0000:c6:00.0: amdgpu: MODE2 reset
  amdgpu 0000:c6:00.0: amdgpu: GPU reset succeeded, trying to resume

3 days ago [ - ]

[deleted]

samus 3 days ago [ - ]

The old kernel boots the new kernel, possibly in a "passive" mode, performs a few sanity checks of the new instance, hands over control, and finally shuts itself down.