Age | Commit message (Collapse) | Author |
|
The config space of some PCI devices can't be accessed when their
PEs are in frozen state. Otherwise, fenced PHB might be seen.
Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaing
EEH_PE_CFG_BLOCKED is set automatically when the PE is put to
frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
PCI device BARs with eeh_pe_restore_bars(), which then calls
eeh_ops->restore_config() to reinitialize the PCI device in
(OPAL) firmware. eeh_ops->restore_config() produces PCI config
access that causes fenced PHB. The problem was reported on below
adapter:
0001:01:00.0 0200: 14e4:168e (rev 10)
0001:01:00.0 Ethernet controller: Broadcom Corporation \
NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
This fixes the issue by skipping eeh_pe_restore_bars() in
eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE.
Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE")
Cc: stable@vger.kernel.org # v4.0+
Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
pcibios_set_pcie_reset_state() could be called to complete
reset request when passing through PCI device, flag
EEH_PE_ISOLATED is set before saving the PCI config sapce.
On some Broadcom adapters, EEH_PE_CFG_BLOCKED is automatically
set when the flag EEH_PE_ISOLATED is marked. It caused bogus
data saved from the PCI config space, which will be restored
to the PCI adapter after the reset. Eventually, the hardware
can't work with corrupted data in PCI config space.
The patch fixes the issue with eeh_pe_state_mark_no_cfg(), which
doesn't set EEH_PE_CFG_BLOCKED when seeing EEH_PE_ISOLATED on the
PE, in order to avoid the bogus data saved and restored to the PCI
config space.
Reported-by: Rajanikanth H. Adaveeshaiah <rajanikanth.ha@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In the complete hotplug case, EEH PEs are supposed to be released
and set to NULL. Normally, this is done by eeh_remove_device(),
which is called from pcibios_release_device().
However, if something is holding a kref to the device, it will not
be released, and the PE will remain. eeh_add_device_late() has
a check for this which will explictly destroy the PE in this case.
This check in eeh_add_device_late() occurs after a call to
eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
which will exit without probing if there is an existing PE.
This means that on PowerNV, devices with outstanding krefs will not
be rediscovered by EEH correctly after a complete hotplug. This is
affecting CXL (CAPI) devices in the field.
Put the probe after the kref check so that the PE is destroyed
and affected devices are correctly rediscovered by EEH.
Fixes: d91dafc02f42 ("powerpc/eeh: Delay probing EEH device during hotplug")
Cc: stable@vger.kernel.org
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This relies on the fact that a PCI device always has an IOMMU table
which may not be the case when we get dynamic DMA windows so
let's use more reliable check for IOMMU group here.
As we do not rely on the table presence here, remove the workaround
from pnv_pci_ioda2_set_bypass(); also remove the @add_to_iommu_group
parameter from pnv_ioda_setup_bus_dma().
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Commit 28158cd "powerpc/eeh: Enhance pcibios_set_pcie_reset_state()"
introduced a fix for a problem where certain configurations could lead to
pci_reset_function() destroying the state of PCI devices other than the one
specified.
Unfortunately, the fix has a trivial bug - it calls pci_save_state() again,
when it should be calling pci_restore_state(). This corrects the problem.
Cc: Gavin Shan <gwshan@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Currently, the macro IS_BRIDGE is not used any where.
This patch just removes it.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The patch defines PCI error types and functions in uapi/asm/eeh.h
and exports function eeh_pe_inject_err(), which will be called by
VFIO driver to inject the specified PCI error to the indicated
PE for testing purpose.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Commit 1c509148b ("powerpc/eeh: Do probe on pci_dn") probes EEH
devices in early stage, which is reasonable to pSeries platform.
However, it's wrong for PowerNV platform because the PE# isn't
determined until the resources (IO and MMIO) are assigned to
PE in hotplug case. So we have to delay probing EEH devices
for PowerNV platform until the PE# is assigned.
Fixes: ff57b454ddb9 ("powerpc/eeh: Do probe on pci_dn")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When asserting reset in pcibios_set_pcie_reset_state(), the PE
is enforced to (hardware) frozen state in order to drop unexpected
PCI transactions (except PCI config read/write) automatically by
hardware during reset, which would cause recursive EEH error.
However, the (software) frozen state EEH_PE_ISOLATED is missed.
When users get 0xFF from PCI config or MMIO read, EEH_PE_ISOLATED
is set in PE state retrival backend. Unfortunately, nobody (the
reset handler or the EEH recovery functinality in host) will clear
EEH_PE_ISOLATED when the PE has been passed through to guest.
The patch sets and clears EEH_PE_ISOLATED properly during reset
in function pcibios_set_pcie_reset_state() to fix the issue.
Fixes: 28158cd ("Enhance pcibios_set_pcie_reset_state()")
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
We can disable a THP split or a hugepage collapse by disabling irq.
We do send IPI to all the cpus in the early part of split/collapse,
and disabling local irq ensure we don't make progress with
split/collapse. If the THP is getting split we return NULL from
find_linux_pte_or_hugepte(). For all the current callers it should be ok.
We need to be careful if we want to use returned pte_t pointer outside
the irq disabled region. W.r.t to THP split, the pfn remains the same,
but then a hugepage collapse will result in a pfn change. There are
few steps we can take to avoid a hugepage collapse.One way is to take page
reference inside the irq disable region. Other option is to take
mmap_sem so that a parallel collapse will not happen. We can also
disable collapse by taking pmd_lock. Another method used by kvm
subsystem is to check whether we had a mmu_notifer update in between
using mmu_notifier_retry().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The recent change to the EEH probing causes a crash on Cell because
eeh_ops is NULL.
Check if EEH is enabled and if not bail out.
Fixes: ff57b454ddb9 ("powerpc/eeh: Do probe on pci_dn")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The patch removes struct eeh_dev::dn and the corresponding helper
functions: eeh_dev_to_of_node() and of_node_to_eeh_dev(). Instead,
eeh_dev_to_pdn() and pdn_to_eeh_dev() should be used to get the
pdn, which might contain device_node on PowerNV platform.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
There are 3 EEH operations whose arguments contain device_node:
read_config(), write_config() and restore_config(). The patch
replaces device_node with pci_dn.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Originally, EEH core probes on device_node or pci_dev to populate
EEH devices and PEs, which conflicts with the fact: SRIOV VFs are
usually enabled and created by PF's driver and they don't have the
corresponding device_nodes. Instead, SRIOV VFs have dynamically
created pci_dn, which can be used for EEH probe.
The patch reworks EEH probe for PowerNV and pSeries platforms to
do probing based on pci_dn, instead of pci_dev or device_node any
more.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Function pcibios_set_pcie_reset_state() is possibly called by
pci_reset_function(), on which VFIO infrastructure depends to
issue reset. pcibios_set_pcie_reset_state() is issuing reset
on the parent PE of the indicated PCI device. The reset causes
state lost on all PCI devices except the indicated one as the
argument to pcibios_set_pcie_reset_state(). Also, sideband
MMIO access from guest when issuing reset would cause unexpected
EEH error.
For above two issues, the patch applies following enhancements
to pcibios_set_pcie_reset_state():
* For all PCI devices except the indicated one, save their
state prior to reset and restore state after that.
* Explicitly freeze PE prior to reset and unfreeze it after
that, in order to avoid unexpected EEH error.
Tested-by: Priya M. A <priyama2@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When PE's frozen count hits maximal allowed frozen times, which is
5 currently, it will be forced to be offline permanently. Once the
PE is removed permanently, rebooting machine is required to bring
the PE back. It's not convienent when testing EEH functionality.
The patch exports the maximal allowed frozen times through debugfs
entry (/sys/kernel/debug/powerpc/eeh_max_freezes).
Requested-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This is what we get in dmesg when booting a pseries guest and
the hypervisor doesn't provide EEH support.
[ 0.166655] EEH functionality not supported
[ 0.166778] eeh_init: Failed to call platform init function (-22)
Since both powernv_eeh_init() and pseries_eeh_init() already complain when
hitting an error, it is not needed to print more (especially such an
uninformative message).
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
On PowerNV platform, PHB diag-data is dumped after stopping device
drivers. In case of recursive EEH errors, the kernel is usually
crashed before dumping PHB diag-data for the second EEH error. It's
hard to locate the root cause of the second EEH error without PHB
diag-data.
The patch adds one more EEH option "eeh=early_log", which helps
dumping PHB diag-data immediately once frozen PE is detected, in
order to get the PHB diag-data for the second EEH error.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
In PCI passthrou scenario, we need simulate EEH recovery for Emulex
adapters when their ownership changes, as we did in commit 5cfb20b96
("powerpc/eeh: Emulate EEH recovery for VFIO devices"). Broadcom
BCM5719 adpaters are facing same problem and needs same cure.
Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch introduces additional flag EEH_PE_RESET to indicate the
corresponding PE is under reset. In turn, the PE retrieval bakcend
on PowerNV platform can return unfrozen state for the EEH core to
moving forward. Flag EEH_PE_CFG_BLOCKED isn't the correct one for
the purpose.
In PCI passthrou case, the problem is more worse: Guest doesn't
recover 6th EEH error. The PE is left in isolated (frozen) and
config blocked state on Broadcom adapters. We can't retrieve the
PE's state correctly any more, even from the host side via sysfs
/sys/bus/pci/devices/xxx/eeh_pe_state.
Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch refactors eeh_reset_pe() in order for:
* Varied return values for different failure cases.
* Replace pr_err() with pr_warn() and print function name.
* Coding style cleanup.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When the PE's config space is marked as blocked, PCI config read
requests always return 0xFF's. It's pointless to collect logs in
this case.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The flag EEH_PE_RESET indicates blocking config space of the PE
during reset time. We potentially need block PE's config space
other than reset time. So it's reasonable to replace it with
EEH_PE_CFG_BLOCKED to indicate its usage.
There are no substantial code or logic changes in this patch.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The PEs can be organized as nested. Current implementation doesn't
dump PCI config space for subordinate devices of child PEs. However,
the frozen PE could be caused by those subordinate devices of its
child PEs.
The patch dumps PCI config space for all subordinate devices of the
problematic PE.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When enabling EEH functionality on passed through devices (PE)
with VFIO, the devices in the PE would be removed permanently
from guest side. In that case, the PE remains frozen state.
When returning PE to host, or restarting the guest again, we
had mechanism unfreezing the PE by clearing PESTA/B frozen
bits. However, that's not enough for some adapters, which are
indicated as following "lspci" shows. Those adapters require
hot reset on the parent bus to bring their firmware back to
workable state. Otherwise, those adaptrs won't be operative
and the host (for returning case) or the guest will fail to
load the drivers for those adapters without exception.
0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \
10Gb NIC (be3) (rev 02)
0000:01:00.0 0200: 19a2:0710 (rev 02)
0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \
NIC (Lancer) (rev 10)
0001:03:00.0 0200: 10df:e220 (rev 10)
The patch adds mechanism to emulate EEH recovery (for hot reset
on parent PCI bus) on 3 gates to fix the issue: open/release one
adapter of the PE, enable EEH functionality on one adapter of the
PE.
Reported-by: Murilo Fossa Vicentini <muvic@br.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
PE would be owned by userland, which probably request PE reset
done in host side. During the reset, we should drop the PCI
config accesses to the PE with help of flag EEH_PE_RESET.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Function pcibios_set_pcie_reset_state() can be used to do PCI
reset. PCI config access during the reset usually causes EEH
errors unexpectedly. In order to avoid the EEH error, the patch
blocks PCI config access during reset with the help of flag
EEH_PE_RESET, which is similar to what we did in EEH PE reset
path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The patch uses eeh_unfreeze_pe() to replace the logic clearing
frozen IO and DMA, in order to simplify the code.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When passing through PE to guest, that's possibly in frozen
state. The driver for the pass-through devices on guest side
can't be loaded successfully as reported. We already had one
gate in eeh_dev_open() to clear PE frozen state accordingly,
but that's not enough because the function is only called at
QEMU startup for once.
The patch adds another gate in eeh_pe_set_option() so that the
PE frozen state can be cleared at QEMU restart time.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The function eeh_pci_enable() is called to apply various requests
to one particular PE: Enabling EEH, Disabling EEH, Enabling IO,
Enabling DMA, Freezing PE. When enabling IO or DMA on one specific
PE, we need check that IO or DMA isn't enabled previously. But
the condition used to do the check isn't completely correct because
one PE would be in DMA frozen state with workable IO path, or vice
versa.
The patch fixes the improper condition.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When passing through device, its PE might have been put into frozen
state. One obvious example would be: the passed PE is forced to be
offline because of hitting maximal allowed EEH errors in userland.
In that case, the frozen state won't be cleared and then the PE is
returned back to host, which might not have chance detecting and
recovering from it.
The patch adds more check when passing through device and clear the
PE frozen state if necessary.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The PCI devices that have been passed through are enabled before
reset, we need restore to the enabled state after reset. Otherwise,
MMIO access might be issued to disabled devices after reset and
causes exceptional recursive EEH error.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The patch adds one more option (EEH_OPT_FREEZE_PE) to set_option()
method to proactively freeze PE, which will be issued before resetting
pass-throughed PE to drop MMIO access during reset because it's
always contributing to recursive EEH error.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
eeh_check_failure() is used to check frozen state of the PE which
owns the indicated I/O address. The argument "val" of the function
isn't used. The patch drops it and return the frozen state of the
PE as expected.
Cc: Vishal Mansur <vmansur@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When doing vfio passthrough a VF, the kernel will crash with following
message:
[ 442.656459] Unable to handle kernel paging request for data at address 0x00000060
[ 442.656593] Faulting instruction address: 0xc000000000038b88
[ 442.656706] Oops: Kernel access of bad area, sig: 11 [#1]
[ 442.656798] SMP NR_CPUS=1024 NUMA PowerNV
[ 442.656890] Modules linked in: vfio_pci mlx4_core nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw tg3 nfsd be2net nfs_acl ses lockd ptp enclosure pps_core kvm_hv kvm_pr shpchp binfmt_misc kvm sunrpc uinput lpfc scsi_transport_fc ipr scsi_tgt [last unloaded: mlx4_core]
[ 442.658152] CPU: 40 PID: 14948 Comm: qemu-system-ppc Not tainted 3.10.42yw-pkvm+ #37
[ 442.658219] task: c000000f7e2a9a00 ti: c000000f6dc3c000 task.ti: c000000f6dc3c000
[ 442.658287] NIP: c000000000038b88 LR: c0000000004435a8 CTR: c000000000455bc0
[ 442.658352] REGS: c000000f6dc3f580 TRAP: 0300 Not tainted (3.10.42yw-pkvm+)
[ 442.658419] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 28004882 XER: 20000000
[ 442.658577] CFAR: c00000000000908c DAR: 0000000000000060 DSISR: 40000000 SOFTE: 1
GPR00: c0000000004435a8 c000000f6dc3f800 c0000000012b1c10 c00000000da24000
GPR04: 0000000000000003 0000000000001004 00000000000015b3 000000000000ffff
GPR08: c00000000127f5d8 0000000000000000 000000000000ffff 0000000000000000
GPR12: c000000000068078 c00000000fdd6800 000001003c320c80 000001003c3607f0
GPR16: 0000000000000001 00000000105480c8 000000001055aaa8 000001003c31ab18
GPR20: 000001003c10fb40 000001003c360ae8 000000001063bcf0 000000001063bdb0
GPR24: 000001003c15ed70 0000000010548f40 c000001fe5514c88 c000001fe5514cb0
GPR28: c00000000da24000 0000000000000000 c00000000da24000 0000000000000003
[ 442.659471] NIP [c000000000038b88] .pcibios_set_pcie_reset_state+0x28/0x130
[ 442.659530] LR [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
[ 442.659585] Call Trace:
[ 442.659610] [c000000f6dc3f800] [00000000000719e0] 0x719e0 (unreliable)
[ 442.659677] [c000000f6dc3f880] [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
[ 442.659757] [c000000f6dc3f900] [c000000000455bf8] .reset_fundamental+0x38/0x80
[ 442.659835] [c000000f6dc3f980] [c0000000004562a8] .pci_dev_specific_reset+0xa8/0xf0
[ 442.659913] [c000000f6dc3fa00] [c0000000004448c4] .__pci_dev_reset+0x44/0x430
[ 442.659980] [c000000f6dc3fab0] [c000000000444d5c] .pci_reset_function+0x7c/0xc0
[ 442.660059] [c000000f6dc3fb30] [d00000001c141ab8] .vfio_pci_open+0xe8/0x2b0 [vfio_pci]
[ 442.660139] [c000000f6dc3fbd0] [c000000000586c30] .vfio_group_fops_unl_ioctl+0x3a0/0x630
[ 442.660219] [c000000f6dc3fc90] [c000000000255fbc] .do_vfs_ioctl+0x4ec/0x7c0
[ 442.660286] [c000000f6dc3fd80] [c000000000256364] .SyS_ioctl+0xd4/0xf0
[ 442.660354] [c000000f6dc3fe30] [c000000000009e54] syscall_exit+0x0/0x98
[ 442.660420] Instruction dump:
[ 442.660454] 4bfffce9 4bfffee4 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7e1b78
[ 442.660566] 7c9f2378 60000000 60000000 e93e02c8 <e8690060> 2fa30000 41de00c4 2b9f0002
[ 442.660679] ---[ end trace a64ac9546bcf0328 ]---
[ 442.660724]
The reason is current VF is not EEH enabled.
This patch introduces a macro to convert eeh_dev to eeh_pe. By doing so, it
will prevent converting with NULL pointer.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
CC: Michael Ellerman <mpe@ellerman.id.au>
V3 -> V4:
1. move the macro definition from include/linux/pci.h to
arch/powerpc/include/asm/eeh.h
V2 -> V3:
1. rebased on 3.17-rc4
2. introduce a macro
3. use this macro in several other places
V1 -> V2:
1. code style and patch subject adjustment
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The function is used by VFIO driver, which might be built as a
dynamic module. So it should be exported.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Some new functions are exposed for use by the IOMMU code but
won't build when CONFIG_IOMMU_API isn't set, so shield them
appropriately.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
pr_warn() is equal to pr_warning(), but the former is a bit more
formal according to commit fc62f2f ("kernel.h: add pr_warn for
symmetry to dev_warn, netdev_warn").
The patch replaces pr_warning() with pr_warn().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch prints 4 PCIE or AER config registers each line, which
is part of the EEH log so that it looks a bit more compact.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
According to the experiment I did, PCI config access is blocked
on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That
means we always get 0xFF's while dumping PCI config space of the
frozen PE on P7IOC. We don't have the problem on PHB3. So we have
to enable I/O prioir to collecting error log. Otherwise, meaningless
0xFF's are always returned.
The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is
selectively set to indicate the case for: P7IOC on PowerNV platform,
pSeries platform.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
There are multiple global EEH flags. Almost each flag has its own
accessor, which doesn't make sense. The patch refactors EEH flag
accessors so that they look unified:
eeh_add_flag(): Add EEH flag
eeh_clear_flag(): Clear EEH flag
eeh_has_flag(): Check if one specific flag has been set
eeh_enabled(): Check if EEH functionality has been enabled
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Function eeh_iommu_group_to_pe() iterates each PCI device to check
the binding IOMMU group with get_iommu_table_base(), which possibly
fetches pdev->dev.archdata.dma_data.dma_offset. It's (0x1 << 59)
for "bypass" cases.
The patch fixes the issue by iterating devices hooked to the IOMMU
group and fetch IOMMU table there.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch exports functions to be used by new VFIO ioctl command,
which will be introduced in subsequent patch, to support EEH
functinality for VFIO PCI devices.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We must not handle EEH error on devices which are passed to somebody
else. Instead, we expect that the frozen device owner detects an EEH
error and recovers from it.
This avoids EEH error handling on passed through devices so the device
owner gets a chance to handle them.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
As Ben suggested, it's meaningful to dump PE's location code
for site engineers when hitting EEH errors. The patch introduces
function eeh_pe_loc_get() to retireve the location code from
dev-tree so that we can output it when hitting EEH errors.
If primary PE bus is root bus, the PHB's dev-node would be tried
prior to root port's dev-node. Otherwise, the upstream bridge's
dev-node of the primary PE bus will be check for the location code
directly.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When we have the corner case of frozen parent and child PE at the
same time, we have to handle the frozen parent PE prior to the
child. Without clearning the frozen state on parent PE, the child
PE can't be recovered successfully.
The patch searches the EEH PE hierarchy tree and returns the toppest
frozen PE to be handled. It ensures the frozen parent PE will be
handled prior to child PE.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Commit 7f52a526f ("powerpc/eeh: Allow to disable EEH") caused
following build error with "celleb_defconfig" as being catched
by Mikey on linux-next.
arch/powerpc/kernel/eeh.c: In function 'eeh_init_proc':
arch/powerpc/kernel/eeh.c:1173:37: error: 'powerpc_debugfs_root' \
undeclared (first use in this function)
arch/powerpc/kernel/eeh.c:1173:37: note: each undeclared identifier \
is reported only once for each function it appears in
Reported-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Basically, we have 3 types of resets to fulfil PE reset: fundamental,
hot and PHB reset. For the later 2 cases, we need PCI bus reset hold
and settlement delay as specified by PCI spec. PowerNV and pSeries
platforms are running on top of different firmware and some of the
delays have been covered by underly firmware (PowerNV).
The patch makes the delays unified to be done in backend, instead of
EEH core.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch introduces bootarg "eeh=off" to disable EEH functinality.
Also, it creates /sys/kerenl/debug/powerpc/eeh_enable to disable
or enable EEH functionality. By default, we have the functionality
enabled.
For PowerNV platform, we will restore to have the conventional
mechanism of clearing frozen PE during PCI config access if we're
going to disable EEH functionality. Conversely, we will rely on
EEH for error recovery.
The patch also fixes the issue that we missed to cover the case
of disabled EEH functionality in function ioda_eeh_event(). Those
events driven by interrupt should be cleared to avoid endless
reporting.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
There're 2 EEH subsystem variables: eeh_subsystem_enabled and
eeh_probe_mode. We needn't maintain 2 variables and we can just
have one variable and introduce different flags. The patch also
introduces additional flag EEH_FORCE_DISABLE, which will be used
to disable EEH subsystem via boot parameter ("eeh=off") in future.
Besides, the patch also introduces flag EEH_ENABLED, which is
changed to disable or enable EEH functionality on the fly through
debugfs entry in future.
With the patch applied, the creteria to check the enabled EEH
functionality is changed to:
!EEH_FORCE_DISABLED && EEH_ENABLED : Enabled
Other cases : Disabled
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|