View Issue Details

ID: 0000005
Project: Xen made easy
Category: xen
View Status: public
Last Update: 2013-05-26 12:50
Reporter: Gordan Bobic
Assigned To: Steven Haigh
Priority: normal
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Platform: x86-64
OS: Linux
OS Version: EL6
Summary: 0000005: xen-hypervisor >= 4.2.1-7 fails to work with PCI passthrough devices
Description: xen-hypervisor >= 4.2.1-7 doesn't allow VMs with PCI passthrough devices to start. Instead of the VM starting, it produces the error:

22, 'Invalid argument'
Steps To Reproduce:
1) Set up a VM with PCI passthrough devices (e.g. PCI network card and ATI VGA card), while running xen-hypervisor 4.2.1-6. Everything works fine.

2) Upgrade xen-hypervisor package to 4.2.1-7 (upgrading the rest makes no difference either way), reboot.

3) Try to start the VM - it will fail with error 22, 'Invalid argument'.

4) Downgrade xen-hypervisor to 4.2.1-6, it will work again.
Additional Information: I am currently running /boot/xen.gz from xen-hypervisor 4.2.1-6, with the rest of the stack at 4.2.2-1. The regression persists in xen-hypervisor 4.2.2-1.

Attached are all the logs as per the xen bug reporting guidelines.
Tags: No tags attached.
External Reference

Activities

Gordan Bobic

2013-04-30 07:45

reporter  

xen-log-broken.tar.gz (28,378 bytes)

Gordan Bobic

2013-04-30 07:57

reporter  

xen-log-working.tar.gz (30,845 bytes)

Steven Haigh

2013-04-30 12:15

administrator   ~0000032

Thanks for the report. As you've got it narrowed down to between two versions where the only difference is 2 x XSA fixes - narrowing it down to a single patch will be most helpful.

From there, I can take the info and put it all together for the Xen guys to try and get the issue resolved in 4.2.2 - more than likely with a patch.

Gordan Bobic

2013-05-01 08:23

reporter   ~0000033

I have done a bit of extra investigating. The problem appears to be caused by the xsa46-4.2.patch.

4.2.1-6 - works
4.2.1-6 + xsa44-4.2.patch - still works
4.2.1-6 + xsa46-4.2.patch - doesn't work
4.2.1-6 + xsa44-4.2.patch + xsa46-4.2.patch = 4.2.1-7 - doesn't work

It seems fairly conclusive that the problem is caused by something in xsa46-4.2.patch.
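
The elimination behind that conclusion can be sketched in a few lines of Python. This is only an illustration of the reasoning: the `results` table restates the four builds listed above, and the helper name `suspect_patches` is made up for this note.

```python
# Illustrative only: restates the four test builds from this note.
# Each key is the set of patches applied on top of 4.2.1-6; the value
# records whether the passthrough VM started.
results = {
    frozenset(): True,
    frozenset({"xsa44-4.2.patch"}): True,
    frozenset({"xsa46-4.2.patch"}): False,
    frozenset({"xsa44-4.2.patch", "xsa46-4.2.patch"}): False,
}


def suspect_patches(results):
    """Return patches present in every failing build and in no working build."""
    failing = [patches for patches, ok in results.items() if not ok]
    working = [patches for patches, ok in results.items() if ok]
    if not failing:
        return set()
    in_all_failing = set.intersection(*(set(p) for p in failing))
    in_any_working = set().union(*working)
    return in_all_failing - in_any_working


print(sorted(suspect_patches(results)))
```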

I notice that these are no longer separate patches in 4.2.2. I haven't tried reverse patching xsa46-4.2.patch out of it yet. Any thoughts on this in the interest of working PCI passthrough, until the issue is resolved?

Steven Haigh

2013-05-01 12:59

administrator   ~0000034

Can you please attach or paste as a note the following:
* DomU config file
* /etc/grub.conf from the Dom0
* Any config associated with the passthrough configuration

I'll follow this up with xen-devel and want to make this all complete.

Gordan Bobic

2013-05-01 17:36

reporter   ~0000035

I am not using grub - the machine is network booted using PXE. The pxeboot configuration is:

=====
label xen
        kernel mboot.c32
        append xen.gz noreboot --- vmlinuz-3.8.8-1.el6xen.0.x86_64 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM root=nfs:10.2.0.10:/nfsroot/normandy,rw,proto=tcp,noatime,nolock,nocto,actimeo=300 ip=eth0:dhcp selinux=0 intel_iommu=on elevator=deadline iomem=relaxed --- initramfs-3.8.8-1.el6xen.0.x86_64.img
=====

Before starting the VM, the devices are detached from dom0 using the following script:

=====
#!/bin/bash

modprobe xen-pciback

# Marvell network card
virsh nodedev-detach pci_0000_02_00_0

# ATI Radeon 6450 + HDMI audio
virsh nodedev-detach pci_0000_0c_00_0
virsh nodedev-detach pci_0000_0c_00_1
=====
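
One way to confirm that the detach took effect is to look at which driver each device is bound to in sysfs. The following Python sketch is not part of the setup described here; the helper names are invented for illustration, and it assumes the usual Linux layout under `/sys/bus/pci/devices` and that xen-pciback registers itself under the driver name `pciback`.

```python
import os

# Same libvirt node device names as in the detach script above.
DEVICES = ["pci_0000_02_00_0", "pci_0000_0c_00_0", "pci_0000_0c_00_1"]


def nodedev_to_bdf(name):
    """Convert a libvirt name such as pci_0000_02_00_0 to 0000:02:00.0."""
    prefix, domain, bus, slot, func = name.split("_")
    if prefix != "pci":
        raise ValueError("not a PCI node device name: %s" % name)
    return "%s:%s:%s.%s" % (domain, bus, slot, func)


def bound_driver(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to the device, or None."""
    link = os.path.join(sysfs_root, bdf, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.path.realpath(link))


if __name__ == "__main__":
    for name in DEVICES:
        bdf = nodedev_to_bdf(name)
        # Assumption: "pciback" is the driver name to expect after a detach.
        print(bdf, bound_driver(bdf))
```

If a device still reports its native driver (or nothing at all) after the detach script has run, the passthrough setup is worth checking before looking at the hypervisor.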

I have not been able to establish where the guest VM config file is stored - I initially created it using virt-manager. The closest thing to a guest config file I can provide is what is in the xend.log I attached earlier. Can you tell me where the domU config file might be?

Steven Haigh

2013-05-01 21:38

administrator  

XSA-46-debug-v2.patch (409 bytes)
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index cbc8146..7794298 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -902,6 +902,8 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domctl_t) u_domctl)
         else
             ret = pirq_deny_access(d, pirq);
 
+        printk("**DBG perms { %u, %d } = %ld\n", pirq, allow, ret);
+
         rcu_unlock_domain(d);
     }
     break;

Steven Haigh

2013-05-01 21:39

administrator   ~0000036

Hi Gordan,

Can you please add this patch in to replace the xsa46 patch and then include the output of 'xm dmesg' when the domU is started?

From Andrew Cooper on the xen-devel list:

Xend is failing a xc.domain_irq_permission() call. As the toolstack
side of things has not changed, it must be the changes in the
hypervisor which are causing the issues.

Can you please try the attached patch, and pass along xl dmesg in the
failing case?

~Andrew

Steven Haigh

2013-05-01 21:39

administrator   ~0000037

Last edited: 2013-05-01 21:41

Sorry - correction to the above note!

Please add this patch AFTER the xsa46 patch. Thinking further, this may also apply cleanly to xen-4.2.2-1 as this includes the xsa46 patch.

Steven Haigh

2013-05-03 11:10

administrator   ~0000038

Sorry to hassle you, but have you had a chance to look at this yet?

I'm gearing up to release another xen package - and if we can nail this down I'd like to include it in the next release - instead of having another release shortly afterwards...

Gordan Bobic

2013-05-03 16:58

reporter   ~0000039

Apologies for the delay; I haven't had a chance to try it yet. I will try the patch tonight or tomorrow at the latest.

Gordan Bobic

2013-05-04 07:33

reporter  

xm-dmesg.log (8,095 bytes)

Gordan Bobic

2013-05-04 07:34

reporter   ~0000040

Applied the patch provided, tried to start the VM. xm dmesg output is attached as xm-dmesg.log.

Steven Haigh

2013-05-05 08:31

administrator  

XSA-46-xen-4.2-debug-v3.patch (711 bytes)
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index b3bfb38..be30cf3 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -908,6 +908,16 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domctl_t) u_domctl)
         else
             ret = pirq_deny_access(d, pirq);
 
+        printk("**DBG perms { %u, %d } = %ld\n", pirq, allow, ret);
+        if ( ret )
+        {
+            printk(" Domain %"PRId16", nr_pirqs %d\n",
+                   d->domain_id, d->nr_pirqs);
+            printk(" dom_pirq_to_irq(%d) = %d\n",
+                   pirq, domain_pirq_to_irq(d, pirq));
+            rangeset_domain_printk(d);
+        }
+
         rcu_unlock_domain(d);
     }
     break;

Steven Haigh

2013-05-05 08:31

administrator   ~0000044

From Gordan's log, it appears pirq 34 is the one causing problems.

Can he please try the latest attached debugging patch which should
provide rather more information in the failure case.

Also, can he boot with "loglvl=all" on the Xen command line, and also
issue "xm debug-keys izq" before capturing xm dmesg. The debug keys
should dump loads of information into the dmesg buffer to do with
interrupts etc.

Thanks,

~Andrew

Gordan Bobic

2013-05-05 18:36

reporter  

xm-dmesg-debug-3.log (56,555 bytes)

Gordan Bobic

2013-05-05 18:36

reporter   ~0000046

New debug log attached.

Steven Haigh

2013-05-09 11:25

administrator   ~0000054

This is also being combined with another report on the xen-devel mailing list.

See this (partial) thread:
http://lists.xen.org/archives/html/xen-devel/2013-05/msg00093.html

Steven Haigh

2013-05-16 08:49

administrator   ~0000060

Just to make sure I'm up to date with this.... Is the current status that patches are being created on the xen-devel list for extra debugging to narrow down the root cause of this problem?

Gordan Bobic

2013-05-16 08:56

reporter   ~0000061

Yes. The last patch posted there was an attempt to fix the problem. I'm going to re-build 4.2.1-7 with the patch again just to make sure and report back on xen-devel with the logs for further analysis.

Gordan Bobic

2013-05-20 06:07

reporter   ~0000062

I can confirm that this issue is fixed by Jan Beulich's patch here:
http://lists.xen.org/archives/html/xen-devel/2013-05/msg01496.html

Steven Haigh

2013-05-20 10:52

administrator  

xsa46-fix.patch (870 bytes)
--- a/tools/libxc/xc_physdev.c
+++ b/tools/libxc/xc_physdev.c
@@ -49,7 +49,7 @@ int xc_physdev_map_pirq(xc_interface *xc
     map.domid = domid;
     map.type = MAP_PIRQ_TYPE_GSI;
     map.index = index;
-    map.pirq = *pirq;
+    map.pirq = *pirq < 0 ? index : *pirq;
 
     rc = do_physdev_op(xch, PHYSDEVOP_map_pirq, &map, sizeof(map));
 
--- a/tools/python/xen/xend/server/pciif.py
+++ b/tools/python/xen/xend/server/pciif.py
@@ -340,7 +340,7 @@ class PciController(DevController):
                 raise VmError(('pci: failed to configure I/O memory on device '+
                             '%s - errno=%d')%(dev.name,rc))
 
-        if not self.vm.info.is_hvm() and dev.irq:
+        if dev.irq > 0:
             rc = xc.physdev_map_pirq(domid = fe_domid,
                                    index = dev.irq,
                                    pirq  = dev.irq)
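
Read as plain logic, the two hunks change when a pirq mapping is requested and which pirq is asked for. The Python sketch below only models the conditions visible in the diff; the function names are hypothetical, the HVM guest in the example is an assumption, and nothing here calls into Xen.

```python
def should_map_pirq_old(is_hvm, irq):
    """Condition in pciif.py before the patch: PV guests with a non-zero IRQ."""
    return (not is_hvm) and bool(irq)


def should_map_pirq_new(is_hvm, irq):
    """Condition after the patch: any guest type, as long as the IRQ is positive."""
    return irq > 0


def requested_pirq(index, pirq):
    """Model of the libxc change: a negative pirq falls back to the GSI index."""
    return index if pirq < 0 else pirq


# Hypothetical HVM guest with a passed-through device on IRQ 34 (the pirq
# named earlier in this thread): skipped by the old condition, mapped by the new.
print(should_map_pirq_old(True, 34), should_map_pirq_new(True, 34))
print(requested_pirq(34, -1), requested_pirq(34, 34))
```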

Steven Haigh

2013-05-20 10:54

administrator   ~0000063

Attached patch that fixes this (2 confirmed reports).

Waiting for feedback from the Citrix guys as to whether this will be included in newer releases or whether an official patch will be distributed etc...

Upon receiving this info, I'll roll new packages with this fix + XSA56 for distribution (unless Citrix handle these fixes some other way).

Steven Haigh

2013-05-21 11:09

administrator   ~0000064

Building 4.2.2-5 with this fix included. Also includes XSA56 fix. Testing packages available soon to confirm it is properly fixed.

Steven Haigh

2013-05-21 12:50

administrator   ~0000065

Initial tests of 4.2.2-5 seem to look good.

Marking fixed. Please post feedback.

Issue History

Date Modified Username Field Change
2013-04-30 07:45 Gordan Bobic New Issue
2013-04-30 07:45 Gordan Bobic Status new => assigned
2013-04-30 07:45 Gordan Bobic Assigned To => Steven Haigh
2013-04-30 07:45 Gordan Bobic File Added: xen-log-broken.tar.gz
2013-04-30 07:57 Gordan Bobic File Added: xen-log-working.tar.gz
2013-04-30 12:15 Steven Haigh Note Added: 0000032
2013-04-30 12:15 Steven Haigh Status assigned => acknowledged
2013-05-01 08:23 Gordan Bobic Note Added: 0000033
2013-05-01 12:59 Steven Haigh Note Added: 0000034
2013-05-01 17:36 Gordan Bobic Note Added: 0000035
2013-05-01 21:38 Steven Haigh File Added: XSA-46-debug-v2.patch
2013-05-01 21:39 Steven Haigh Note Added: 0000036
2013-05-01 21:39 Steven Haigh Note Added: 0000037
2013-05-01 21:40 Steven Haigh Note Edited: 0000037
2013-05-01 21:41 Steven Haigh Note Edited: 0000037
2013-05-03 11:10 Steven Haigh Note Added: 0000038
2013-05-03 16:58 Gordan Bobic Note Added: 0000039
2013-05-04 07:33 Gordan Bobic File Added: xm-dmesg.log
2013-05-04 07:34 Gordan Bobic Note Added: 0000040
2013-05-05 08:31 Steven Haigh File Added: XSA-46-xen-4.2-debug-v3.patch
2013-05-05 08:31 Steven Haigh Note Added: 0000044
2013-05-05 18:36 Gordan Bobic File Added: xm-dmesg-debug-3.log
2013-05-05 18:36 Gordan Bobic Note Added: 0000046
2013-05-09 11:25 Steven Haigh Note Added: 0000054
2013-05-16 08:49 Steven Haigh Note Added: 0000060
2013-05-16 08:56 Gordan Bobic Note Added: 0000061
2013-05-20 06:07 Gordan Bobic Note Added: 0000062
2013-05-20 10:52 Steven Haigh File Added: xsa46-fix.patch
2013-05-20 10:54 Steven Haigh Note Added: 0000063
2013-05-20 22:49 Steven Haigh Status acknowledged => confirmed
2013-05-20 22:49 Steven Haigh Description Updated
2013-05-20 22:49 Steven Haigh Steps to Reproduce Updated
2013-05-20 22:49 Steven Haigh Additional Information Updated
2013-05-21 11:09 Steven Haigh Note Added: 0000064
2013-05-21 12:50 Steven Haigh Note Added: 0000065
2013-05-21 12:50 Steven Haigh Status confirmed => resolved
2013-05-21 12:50 Steven Haigh Resolution open => fixed
2013-05-26 12:50 Steven Haigh Status resolved => closed