summaryrefslogtreecommitdiff
path: root/drivers/misc/cxl/main.c
AgeCommit message (Collapse)Author
2017-03-15cxl: fix nested locking hang during EEH hotplugAndrew Donnellan
commit 171ed0fcd8966d82c45376f1434678e7b9d4d9b1 upstream. Commit 14a3ae34bfd0 ("cxl: Prevent read/write to AFU config space while AFU not configured") introduced a rwsem to fix an invalid memory access that occurred when someone attempts to access the config space of an AFU on a vPHB whilst the AFU is deconfigured, such as during EEH recovery. It turns out that it's possible to run into a nested locking issue when EEH recovery fails and a full device hotplug is required. cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on configured_rwsem. When EEH recovery fails, the EEH code calls pci_hp_remove_devices() to remove the device, which in turn calls cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries to grab the writer lock that's already held. Standard rwsem semantics don't express what we really want to do here and don't allow for nested locking. Fix this by replacing the rwsem with an atomic_t which we can control more finely. Allow the AFU to be locked multiple times so long as there are no readers. Fixes: 14a3ae34bfd0 ("cxl: Prevent read/write to AFU config space while AFU not configured") Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15cxl: Prevent read/write to AFU config space while AFU not configuredAndrew Donnellan
commit 14a3ae34bfd0bcb1cc12d55b06a8584c11fac6fc upstream. During EEH recovery, we deconfigure all AFUs whilst leaving the corresponding vPHB and virtual PCI device in place. If something attempts to interact with the AFU's PCI config space (e.g. running lspci) after the AFU has been deconfigured and before it's reconfigured, cxl_pcie_{read,write}_config() will read invalid values from the deconfigured struct cxl_afu and proceed to Oops when they try to dereference pointers that have been set to NULL during deconfiguration. Add a rwsem to struct cxl_afu so we can prevent interaction with config space while the AFU is deconfigured. Reported-by: Pradipta Ghosh <pradghos@in.ibm.com> Suggested-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-19cxl: Prevent adapter reset if an active context existsVaibhav Jain
This patch prevents resetting the cxl adapter via sysfs in presence of one or more active cxl_context on it. This protects against an unrecoverable error caused by PSL owning a dirty cache line even after reset and host tries to touch the same cache line. In case a force reset of the card is required irrespective of any active contexts, the int value -1 can be stored in the 'reset' sysfs attribute of the card. The patch introduces a new atomic_t member named contexts_num inside struct cxl that holds the number of active context attached to the card , which is checked against '0' before proceeding with the reset. To prevent against a race condition where a context is activated just after reset check is performed, the contexts_num is atomically set to '-1' after reset-check to indicate that no more contexts can be activated on the card anymore. Before activating a context we atomically test if contexts_num is non-negative and if so, increment its value by one. In case the value of contexts_num is negative then it indicates that the card is about to be reset and context activation is error-ed out at that point. Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card") Cc: stable@vger.kernel.org # v4.0+ Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14cxl: Add support for interrupts on the Mellanox CX4Ian Munsie
The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where interrupts are routed from the networking hardware to the XSL using the MSIX table, and from there will be transformed back into an MSIX interrupt using the cxl style interrupts (i.e. using IVTE entries and ranges to map a PE and AFU interrupt number to an MSIX address). We want to hide the implementation details of cxl interrupts as much as possible. To this end, we use a special version of the MSI setup & teardown routines in the PHB while in cxl mode to allocate the cxl interrupts and configure the IVTE entries in the process element. This function does not configure the MSIX table - the CX4 card uses a custom format in that table and it would not be appropriate to fill that out in generic code. The rest of the functionality is similar to the "Full MSI-X mode" described in the CAIA, and this could be easily extended to support other adapters that use that mode in the future. The interrupts will be associated with the default context. If the maximum number of interrupts per context has been limited (e.g. by the mlx5 driver), it will automatically allocate additional kernel contexts to associate extra interrupts as required. These contexts will be started using the same WED that was used to start the default context. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14cxl: Add preliminary workaround for CX4 interrupt limitationIan Munsie
The Mellanox CX4 has a hardware limitation where only 4 bits of the AFU interrupt number can be passed to the XSL when sending an interrupt, limiting it to only 15 interrupts per context (AFU interrupt number 0 is invalid). In order to overcome this, we will allocate additional contexts linked to the default context as extra address space for the extra interrupts - this will be implemented in the next patch. This patch adds the preliminary support to allow this, by way of adding a linked list in the context structure that we use to keep track of the contexts dedicated to interrupts, and an API to simultaneously iterate over the related context structures, AFU interrupt numbers and hardware interrupt numbers. The point of using a single API to iterate these is to hide some of the details of the iteration from external code, and to reduce the number of APIs that need to be exported via base.c to allow built in code to call. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14cxl: Allow a default context to be associated with an external pci_devIan Munsie
The cxl kernel API has a concept of a default context associated with each PCI device under the virtual PHB. The Mellanox CX4 will also use the cxl kernel API, but it does not use a virtual PHB - rather, the AFU appears as a physical function as a peer to the networking functions. In order to allow the kernel API to work with those networking functions, we will need to associate a default context with them as well. To this end, refactor the corresponding code to do this in vphb.c and export it so that it can be called from the PHB code. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Adapter failure handlingChristophe Lombard
Check the AFU state whenever an API is called. The hypervisor may issue a reset of the adapter when it detects a fault. When it happens, it launches an error recovery which will either move the AFU to a permanent failure state, or in the disabled state. If the AFU is found to be disabled, detach all existing contexts from it before issuing a AFU reset to re-enable it. Before detaching contexts, notify any kernel driver through the EEH callbacks of the AFU pci device. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Add guest-specific codeChristophe Lombard
The new of.c file contains code to parse the device tree to find out about cxl adapters and AFUs. guest.c implements the guest-specific callbacks for the backend API. The process element ID is not known until the context is attached, so we have to separate the context ID assigned by the cxl driver from the process element ID visible to the user applications. In bare-metal, the 2 IDs match. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> [mpe: Fix SMP=n build, fix PSERIES=n build, minor whitespace fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Separate bare-metal fields in adapter and AFU data structuresChristophe Lombard
Introduce sub-structures containing the bare-metal specific fields in the structures describing the adapter (struct cxl) and AFU (struct cxl_afu). Update all their references. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: New hcalls to support cxl adaptersChristophe Lombard
The hypervisor calls provide an interface with a coherent platform facility and function. It matches version 0.16 of the 'PAPR changes' document. The following hcalls are supported: H_ATTACH_CA_PROCESS Attach a process element to a coherent platform function. H_DETACH_CA_PROCESS Detach a process element from a coherent platform function. H_CONTROL_CA_FUNCTION Allow the partition to manipulate or query certain coherent platform function behaviors. H_COLLECT_CA_INT_INFO Collect interrupt info about a coherent. platform function after an interrupt occurred H_CONTROL_CA_FAULTS Control the operation of a coherent platform function after a fault occurs. H_DOWNLOAD_CA_FACILITY Support for downloading a base adapter image to the coherent platform facility, and for validating the entire image after the download. H_CONTROL_CA_FACILITY Allow the partition to manipulate or query certain coherent platform facility behaviors. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Introduce implementation-specific APIFrederic Barrat
The backend API (in cxl.h) lists some low-level functions whose implementation is different on bare-metal and in a guest. Each environment implements its own functions, and the common code uses them through function pointers, defined in cxl_backend_ops Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Move bare-metal specific code to specialized filesFrederic Barrat
Move a few functions around to better separate code specific to bare-metal environment from code which will be commonly used between guest and bare-metal. Code specific to bare-metal is meant to be in native.c or pci.c only. It's basically anything which touches the card p1 registers, some p2 registers not needed from a guest and the PCI interface. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Move common code away from bare-metal-specific filesChristophe Lombard
Move around some functions which will be accessed from the bare-metal and guest environments. Code in native.c and pci.c is meant to be bare-metal specific. Other files contain code which may be shared with guests. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-16cxl: Destroy cxl_adapter_idr on module_exitJohannes Thumshirn
Destroy cxl_adapter_idr on module exit, reclaiming the allocated memory. This was detected by the following semantic patch (written by Luis Rodriguez <mcgrof@suse.com>) <SmPL> @ defines_module_init @ declarer name module_init, module_exit; declarer name DEFINE_IDR; identifier init; @@ module_init(init); @ defines_module_exit @ identifier exit; @@ module_exit(exit); @ declares_idr depends on defines_module_init && defines_module_exit @ identifier idr; @@ DEFINE_IDR(idr); @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... idr_destroy(&idr); ... } @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @ identifier declares_idr.idr, defines_module_exit.exit; @@ exit(void) { ... +idr_destroy(&idr); } </SmPL> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-10cxl: Check if afu is not null in cxl_slbiaDaniel Axtens
The pointer to an AFU in the adapter's list of AFUs can be null if we're in the process of removing AFUs. The afu_list_lock doesn't guard against this. Say we have 2 slices, and we're in the process of removing cxl. - We remove the AFUs in order (see cxl_remove). In cxl_remove_afu for AFU 0, we take the lock, set adapter->afu[0] = NULL, and release the lock. - Then we get an slbia. In cxl_slbia we take the lock, and set afu = adapter->afu[0], which is NULL. - Therefore our attempt to check afu->enabled will blow up. Therefore, check if afu is a null pointer before dereferencing it. Cc: stable@vger.kernel.org Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Move include file cxl.h -> cxl-base.hMichael Neuling
This moves the current include file from cxl.h -> cxl-base.h. This current include file is used only to pass information between the base driver that needs to be built into the kernel and the cxl module. This is to make way for a new include/misc/cxl.h which will contain just the kernel API for other driver to use Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22cxl: Add tracepointsIan Munsie
This patch adds tracepoints throughout the cxl driver, which can provide insight into: - Context lifetimes - Commands sent to the PSL and AFU and their completion status - Segment and page table misses and their resolution - PSL and AFU interrupts - slbia calls from the powerpc copro_fault code These tracepoints are mostly intended to aid in debugging (particularly for new AFU designs), and may be useful standalone or in conjunction with hardware traces collected by the PSL (read out via the trace interface in debugfs) and AFUs. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08cxl: Driver code for powernv PCIe based cards for userspace accessIan Munsie
This is the core of the cxl driver. It adds support for using cxl cards in the powernv environment only (ie POWER8 bare metal). It allows access to cxl accelerators by userspace using the /dev/cxl/afuM.N char devices. The kernel driver has no knowledge of the function implemented by the accelerator. It provides services to userspace via the /dev/cxl/afuM.N devices. When a program opens this device and runs the start work IOCTL, the accelerator will have coherent access to that processes memory using the same virtual addresses. That process may mmap the device to access any MMIO space the accelerator provides. Also, reads on the device will allow interrupts to be received. These services are further documented in a later patch in Documentation/powerpc/cxl.txt. Documentation of the cxl hardware architecture and userspace API is provided in subsequent patches. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>