From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [v2 RESEND 0/4] Fix long vm downtime during live migration
Date: Mon, 2 Nov 2015 14:40:29 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

On Mon, Nov 02, 2015 at 03:36:59PM +0800, Liang Li wrote:
> KVM commit 3ea3b7fa9af067982f34b introduces a mechanism that lazily
> collapses small sptes into large sptes, which is intended to solve the
> performance drop seen if live migration fails or is canceled. When
> dirty logging is stopped, the rmap is scanned in the
> KVM_SET_USER_MEMORY_REGION ioctl context so that the small sptes can
> be dropped. Scanning the rmap and dropping the small sptes is a
> time-consuming operation that takes dozens of milliseconds; the actual
> time depends on the VM's memory size. For a VM with 8GB RAM, it takes
> about 30ms.
>
> The current QEMU code stops dirty logging during the pause and copy
> stage by calling migration_end(). migration_end() is now a
> time-consuming operation because it calls
> memory_global_dirty_log_stop(), which triggers the rmap scan and the
> dropping of small sptes. Calling migration_end() before all the
> vmstate data has been transferred to the destination therefore
> prolongs VM downtime.
>
> migration_end() should be deferred until after all the data has been
> transferred to the destination. blk_mig_cleanup() can be deferred too.
>
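To make the ordering argument above concrete, here is a minimal sketch in C of a toy cost model. It is not QEMU code: struct costs and both functions are hypothetical names invented for illustration, and the 5 ms transfer figure is an assumed round number (only the roughly 30 ms cleanup cost comes from the description above).

```c
/* Toy cost model of the pause-and-copy stage, in milliseconds.
 * All names here are hypothetical; none of them are QEMU APIs. */
struct costs {
    int transfer_ms; /* assumed time to send the remaining state */
    int cleanup_ms;  /* time to stop dirty logging (rmap scan in KVM) */
};

/* Ordering before the series: cleanup runs before the transfer has
 * finished, so its cost is part of the guest-visible downtime. */
int downtime_cleanup_first(struct costs c)
{
    return c.cleanup_ms + c.transfer_ms;
}

/* Ordering after the series: cleanup is deferred until the transfer
 * is done, so only the transfer counts toward downtime. */
int downtime_cleanup_deferred(struct costs c)
{
    return c.transfer_ms;
}
```

With cleanup_ms = 30 and transfer_ms = 5, the first function returns 35 and the second returns 5. The cleanup work is not removed, only moved out of the window in which the guest is paused.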
> Effect of this patch
> ====================
> For a VM with 8GB RAM, this patch can reduce the VM downtime by about 30 ms.
>
> You can follow these steps to see the effect of this patch.
>
> 1. Start a VM with the command:
> ./qemu-system-x86_64 -enable-kvm -smp 4 -m 8192 -monitor stdio \
>     /share/rhel6u5.qcow
> in the source host and
> ./qemu-system-x86_64 -enable-kvm -smp 4 -m 8192 -monitor stdio \
>     /share/rhel6u5.qcow -incoming tcp:0:4444
> in the destination host.
> 2. In the source side qemu monitor:
> (qemu) migrate_set_speed 0
> (qemu) migrate_set_downtime 0.01
> (qemu) migrate -d tcp:($DST_HOST_IP):4444
> (qemu) info migrate
>
> The actual VM downtime in my environment:
> =====================================
> |without this patch| with this patch|
> |-----------------------------------|
> | 35ms | 4ms |
> =====================================
>
>
> Changes:
> * Remove qemu_savevm_state_cancel() in migrate_fd_cleanup().
> * Add 2 more patches for code cleanup.
> * Add more details in the commit message.
>
> Liang Li (4):
> migration: defer migration_end & blk_mig_cleanup
> migration: rename qemu_savevm_state_cancel
> migration: rename cancel to cleanup in SaveVMHandles
> migration: code clean up
>
> include/migration/vmstate.h | 2 +-
> include/sysemu/sysemu.h | 2 +-
> migration/block.c | 10 ++--------
> migration/migration.c | 13 ++++++-------
> migration/ram.c | 10 ++--------
> migration/savevm.c | 10 +++++-----
> trace-events | 2 +-
> 7 files changed, 18 insertions(+), 31 deletions(-)
I haven't reviewed in detail, but if migration folks are happy then it's
fine by me:
Acked-by: Stefan Hajnoczi <address@hidden>