Age | Commit message (Collapse) | Author |
|
The expectation is that the legacy / non-standard pmem discovery method
(e820 type-12) will only ever be used to describe small quantities of
persistent memory. Larger capacities will be described via the ACPI
NFIT. When "allocate struct page from pmem" support is added this default
policy can be overridden by assigning a legacy pmem namespace to a pfn
device, however this would be only be necessary if a platform used the
legacy mechanism to define a very large range.
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Enable the pmem driver to handle PFN device instances. Attaching a pmem
namespace to a pfn device triggers the driver to allocate and initialize
struct page entries for pmem. Memory capacity for this allocation comes
exclusively from RAM for now which is suitable for low PMEM to RAM
ratios. This mechanism will be expanded later for setting an "allocate
from PMEM" policy.
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Implement the base infrastructure for libnvdimm PFN devices. Similar to
BTT devices they take a namespace as a backing device and layer
functionality on top. In this case the functionality is reserving space
for an array of 'struct page' entries to be handed out through
pfn_to_page(). For now this is just the basic libnvdimm-device-model for
configuring the base PFN device.
As the namespace claiming mechanism for PFN devices is mostly identical
to BTT devices drivers/nvdimm/claim.c is created to house the common
bits.
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Given that a write-back (WB) mapping plus non-temporal stores is
expected to be the most efficient way to access PMEM, update the
definition of ARCH_HAS_PMEM_API to imply arch support for
WB-mapped-PMEM. This is needed as a pre-requisite for adding PMEM to
the direct map and mapping it with struct page.
The above clarification for X86_64 means that memcpy_to_pmem() is
permitted to use the non-temporal arch_memcpy_to_pmem() rather than
needlessly fall back to default_memcpy_to_pmem() when the pcommit
instruction is not available. When arch_memcpy_to_pmem() is not
guaranteed to flush writes out of cache, i.e. on older X86_32
implementations where non-temporal stores may just dirty cache,
ARCH_HAS_PMEM_API is simply disabled.
The default fall back for persistent memory handling remains. Namely,
map it with the WT (write-through) cache-type and hope for the best.
arch_has_pmem_api() is updated to only indicate whether the arch
provides the proper helpers to meet the minimum "writes are visible
outside the cache hierarchy after memcpy_to_pmem() + wmb_pmem()". Code
that cares whether wmb_pmem() actually flushes writes to pmem must now
call arch_has_wmb_pmem() directly.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
[hch: set ARCH_HAS_PMEM_API=n on x86_32]
Reviewed-by: Christoph Hellwig <hch@lst.de>
[toshi: x86_32 compile fixes]
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
This behaves like devm_memremap except that it ensures we have page
structures available that can back the region.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[djbw: catch attempts to remap RAM, drop flags]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
While pmem is usable as a block device or via DAX mappings to userspace
there are several usage scenarios that can not target pmem due to its
lack of struct page coverage. In preparation for "hot plugging" pmem
into the vmemmap add ZONE_DEVICE as a new zone to tag these pages
separately from the ones that are subject to standard page allocations.
Importantly "device memory" can be removed at will by userspace
unbinding the driver of the device.
Having a separate zone prevents allocation and otherwise marks these
pages that are distinct from typical uniform memory. Device memory has
different lifetime and performance characteristics than RAM. However,
since we have run out of ZONES_SHIFT bits this functionality currently
depends on sacrificing ZONE_DMA.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Jerome Glisse <j.glisse@gmail.com>
[hch: various simplifications in the arch interface]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Three architectures already define these, and we'll need them genericly
soon.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
None of the implementations currently use it. The common
bdev_direct_access() entry point handles all the size checks before
calling ->direct_access().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
This should result in a pretty sizeable performance gain for reads. For
rough comparison I did some simple read testing using PMEM to compare
reads of write combining (WC) mappings vs write-back (WB). This was
done on a random lab machine.
PMEM reads from a write combining mapping:
# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s
PMEM reads from a write-back mapping:
# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
1000000+0 records in
1000000+0 records out
4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s
To be able to safely support a write-back aperture I needed to add
support for the "read flush" _DSM flag, as outlined in the DSM spec:
http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read. This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM. We know
that the cache lines are clean and will be discarded without any
writeback because either a) the previous aperture operation was a read,
and we never modified the contents of the aperture, or b) the previous
aperture operation was a write and we must have written back the dirtied
contents of the aperture to the DIMM before the I/O was completed.
In order to add support for the "read flush" flag I needed to add a
generic routine to invalidate cache lines, mmio_flush_range(). This is
protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
only supported on x86.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Update the annotation for the kaddr pointer returned by direct_access()
so that it is a __pmem pointer. This is consistent with the PMEM driver
and with how this direct_access() pointer is used in the DAX code.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Update the DAX I/O path so that all operations that store data (I/O
writes, zeroing blocks, punching holes, etc.) properly synchronize the
stores to media using the PMEM API. This ensures that the data DAX is
writing is durable on media before the operation completes.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add support for two new PMEM APIs, copy_from_iter_pmem() and
clear_pmem(). copy_from_iter_pmem() is used to copy data from an
iterator into a PMEM buffer. clear_pmem() zeros a PMEM memory range.
Both of these new APIs must be explicitly ordered using a wmb_pmem()
function call and are implemented in such a way that the wmb_pmem()
will make the stores to PMEM durable. Because both APIs are unordered
they can be called as needed without introducing any unwanted memory
barriers.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Prior to this change x86_64 used the pmem defines in
arch/x86/include/asm/pmem.h, and UM used the default ones at the
top of include/linux/pmem.h. The inclusion or exclusion in linux/pmem.h
was controlled by CONFIG_ARCH_HAS_PMEM_API, but the ones in asm/pmem.h
were controlled by ARCH_HAS_NOCACHE_UACCESS.
Instead, control them both with CONFIG_ARCH_HAS_PMEM_API so that it's
clear that they are related and we don't run into the possibility where
they are both included or excluded. Also remove a bunch of stale
function prototypes meant for UM in asm/pmem.h - these just conflicted
with the inline defaults in linux/pmem.h and gave compile errors.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Prior to this change arch_has_wmb_pmem() was only called by
arch_has_pmem_api(). Both arch_has_wmb_pmem() and arch_has_pmem_api()
checked to make sure that CONFIG_ARCH_HAS_PMEM_API was enabled.
Instead, remove the old arch_has_wmb_pmem() wrapper to be rid of one
extra layer of indirection and the redundant CONFIG_ARCH_HAS_PMEM_API
check. Rename __arch_has_wmb_pmem() to arch_has_wmb_pmem() since we no
longer have a wrapper, and just have arch_has_pmem_api() call the
architecture specific arch_has_wmb_pmem() directly.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Move the x86 PMEM API implementation out of asm/cacheflush.h and into
its own header asm/pmem.h. This will allow members of the PMEM API to
be more easily identified on this and other architectures.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
We currently register a platform device for e820 type-12 memory and
register a nvdimm bus beneath it. Registering the platform device
triggers the device-core machinery to probe for a driver, but that
search currently comes up empty. Building the nvdimm-bus registration
into the e820_pmem platform device registration in this way forces
libnvdimm to be built-in. Instead, convert the built-in portion of
CONFIG_X86_PMEM_LEGACY to simply register a platform device and move the
rest of the logic to the driver for e820_pmem, for the following
reasons:
1/ Letting e820_pmem support be a module allows building and testing
libnvdimm.ko changes without rebooting
2/ All the normal policy around modules can be applied to e820_pmem
(unbind to disable and/or blacklisting the module from loading by
default)
3/ Moving the driver to a generic location and converting it to scan
"iomem_resource" rather than "e820.map" means any other architecture can
take advantage of this simple nvdimm resource discovery mechanism by
registering a resource named "Persistent Memory (legacy)"
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
[djbw: tools/testing/nvdimm/ and memunmap_pmem support]
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
When a BTT is instantiated on a namespace it must validate the namespace
uuid matches the 'parent_uuid' stored in the btt superblock. This
property enforces that changing the namespace UUID invalidates all
former BTT instances on that storage. For "IO namespaces" that don't
have a label or UUID, the parent_uuid is set to zero, and this
validation is skipped. For such cases, old BTTs have to be invalidated
by forcing the namespace to raw mode, and overwriting the BTT info
blocks.
Based on a patch by Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Use arena_is_valid as a common routine for checking the validity of an
info block from both discover_arenas, and nd_btt_probe.
As a result, don't check for validity of the BTT's UUID, and lbasize.
The checksum in the BTT info block guarantees self-consistency, and when
we're called from nd_btt_probe, we don't have a valid uuid or lbasize
available to check against.
Also cleanup to return a bool instead of an int.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Consolidate the parameters passed to arena_is_valid into just nd_btt,
and an info block to increase re-usability.
Similarly, btt_arena_write_layout doesn't need to be passed a uuid, as
it can be obtained from arena->nd_btt.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Kill arch_memremap_pmem() and just let the architecture specify the
flags to be passed to memremap(). Default to writethrough by default.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
In preparation for deprecating ioremap_cache() convert its usage in
visorbus to memremap.
Cc: Benjamin Romer <benjamin.romer@unisys.com>
Cc: David Kershner <david.kershner@unisys.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Existing users of ioremap_cache() are mapping memory that is known in
advance to not have i/o side effects. These users are forced to cast
away the __iomem annotation, or otherwise neglect to fix the sparse
errors thrown when dereferencing pointers to this memory. Provide
memremap() as a non __iomem annotated ioremap_*() in the case when
ioremap is otherwise a pointer to cacheable memory. Empirically,
ioremap_<cacheable-type>() call sites are seeking memory-like semantics
(e.g. speculative reads, and prefetching permitted).
memremap() is a break from the ioremap implementation pattern of adding
a new memremap_<type>() for each mapping type and having silent
compatibility fall backs. Instead, the implementation defines flags
that are passed to the central memremap() and if a mapping type is not
supported by an arch memremap returns NULL.
We introduce a memremap prototype as a trivial wrapper of
ioremap_cache() and ioremap_wt(). Later, once all ioremap_cache() and
ioremap_wt() usage has been removed from drivers we teach archs to
implement arch_memremap() with the ability to strictly enforce the
mapping type.
Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Quoting Arnd:
I was thinking the opposite approach and basically removing all uses
of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
them.and we can probably replace them all with hardcoded
ioremap_cached() calls in the cases they are actually useful.
All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
ioremap_nocache() if the resource is cacheable, however ioremap() is
uncached by default. Clearly none of the existing usages care about the
cacheability. Particularly devm_ioremap_resource() never worked as
advertised since it always fell back to plain ioremap().
Clean this up as the new direction we want is to convert
ioremap_<type>() usages to memremap(..., flags).
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Preparation for uniform definition of ioremap, ioremap_wc, ioremap_wt,
and ioremap_cache, tree-wide.
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
region_is_ram() is used to prevent the establishment of aliased mappings
to physical "System RAM" with incompatible cache settings. However, it
uses "-1" to indicate both "unknown" memory ranges (ranges not described
by platform firmware) and "mixed" ranges (where the parameters describe
a range that partially overlaps "System RAM").
Fix this up by explicitly tracking the "unknown" vs "mixed" resource
cases and returning REGION_INTERSECTS, REGION_MIXED, or REGION_DISJOINT.
This re-write also adds support for detecting when the requested region
completely eclipses all of a resource. Note, the implementation treats
overlaps between "unknown" and the requested memory type as
REGION_INTERSECTS.
Finally, other memory types can be passed in by name, for now the only
usage "System RAM".
Suggested-by: Luis R. Rodriguez <mcgrof@suse.com>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Fix multiple build warnings when CONFIG_BTT is not enabled:
In file included from ../drivers/nvdimm/bus.c:29:0:
../drivers/nvdimm/nd.h:169:15: warning: return type defaults to 'int' [-Wreturn-type]
static inline nd_btt_probe(struct nd_namespace_common *ndns, void *drvdata)
^
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm@lists.01.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The _STA only applies to the root device, not the individual NVDIMMS,
so don't check here. NVDIMM device state flags are checked elsewhere.
Signed-off-by: Linda Knippers <linda.knippers@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Based on a patch: c8fa317 brd: Request from fdisk 4k alignment by Boaz
Harrosh, allow fdisk to create properly aligned partitions for DAX. This
will also cause mkfs.ext4 to emit a warning if using a file system block
size of less than PAGE_SIZE.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Elliott, Robert <Elliott@hp.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Boaz Harrosh <boaz@plexistor.com>
Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add support for the three ARS DSM commands:
- Query ARS Capabilities - Queries the firmware to check if a given
range supports scrub, and if so, which type (persistent vs. volatile)
- Start ARS - Starts a scrub for a given range/type
- Query ARS Status - Checks status of a previously started scrub, and
provides the error logs if any.
The commands are described by the example DSM spec at:
http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
Also add these commands to the nfit_test test framework, and return
canned data.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The spec suggests that this is a simple 'length' field, not a mask.
Update the name accordingly.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Fix:
drivers/nvdimm/btt.c:635:29: warning: restricted __le64 degrades to integer
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Thomas Gleixner:
"A single fix for the intel cqm perf facility to prevent IPIs from
interrupt context"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/cqm: Return cached counter value from IRQ context
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"This update contains:
- the manual revert of the SYSCALL32 changes which caused a
regression
- a fix for the MPX vma handling
- three fixes for the ioremap 'is ram' checks.
- PAT warning fixes
- a trivial fix for the size calculation of TLB tracepoints
- handle old EFI structures gracefully
This also contains a PAT fix from Jan plus a revert thereof. Toshi
explained why the code is correct"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/pat: Revert 'Adjust default caching mode translation tables'
x86/asm/entry/32: Revert 'Do not use R9 in SYSCALL32' commit
x86/mm: Fix newly introduced printk format warnings
mm: Fix bugs in region_is_ram()
x86/mm: Remove region_is_ram() call from ioremap
x86/mm: Move warning from __ioremap_check_ram() to the call site
x86/mm/pat, drivers/media/ivtv: Move the PAT warning and replace WARN() with pr_warn()
x86/mm/pat, drivers/infiniband/ipath: Replace WARN() with pr_warn()
x86/mm/pat: Adjust default caching mode translation tables
x86/fpu: Disable dependent CPU features on "noxsave"
x86/mpx: Do not set ->vm_ops on MPX VMAs
x86/mm: Add parenthesis for TLB tracepoint size calculation
efi: Handle memory error structures produced based on old versions of standard
|
|
Toshi explains:
"No, the default values need to be set to the fallback types,
i.e. minimal supported mode. For WC and WT, UC is the fallback type.
When PAT is disabled, pat_init() does update the tables below to
enable WT per the default BIOS setup. However, when PAT is enabled,
but CPU has PAT -errata, WT falls back to UC per the default values."
Revert: ca1fec58bc6a 'x86/mm/pat: Adjust default caching mode translation tables'
Requested-by: Toshi Kani <toshi.kani@hp.com>
Cc: Jan Beulich <jbeulich@suse.de>
Link: http://lkml.kernel.org/r/1437577776.3214.252.camel@hp.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Peter reported the following potential crash which I was able to
reproduce with his test program,
[ 148.765788] ------------[ cut here ]------------
[ 148.765796] WARNING: CPU: 34 PID: 2840 at kernel/smp.c:417 smp_call_function_many+0xb6/0x260()
[ 148.765797] Modules linked in:
[ 148.765800] CPU: 34 PID: 2840 Comm: perf Not tainted 4.2.0-rc1+ #4
[ 148.765803] ffffffff81cdc398 ffff88085f105950 ffffffff818bdfd5 0000000000000007
[ 148.765805] 0000000000000000 ffff88085f105990 ffffffff810e413a 0000000000000000
[ 148.765807] ffffffff82301080 0000000000000022 ffffffff8107f640 ffffffff8107f640
[ 148.765809] Call Trace:
[ 148.765810] <NMI> [<ffffffff818bdfd5>] dump_stack+0x45/0x57
[ 148.765818] [<ffffffff810e413a>] warn_slowpath_common+0x8a/0xc0
[ 148.765822] [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
[ 148.765824] [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
[ 148.765825] [<ffffffff810e422a>] warn_slowpath_null+0x1a/0x20
[ 148.765827] [<ffffffff811613f6>] smp_call_function_many+0xb6/0x260
[ 148.765829] [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
[ 148.765831] [<ffffffff81161748>] on_each_cpu_mask+0x28/0x60
[ 148.765832] [<ffffffff8107f6ef>] intel_cqm_event_count+0x7f/0xe0
[ 148.765836] [<ffffffff811cdd35>] perf_output_read+0x2a5/0x400
[ 148.765839] [<ffffffff811d2e5a>] perf_output_sample+0x31a/0x590
[ 148.765840] [<ffffffff811d333d>] ? perf_prepare_sample+0x26d/0x380
[ 148.765841] [<ffffffff811d3497>] perf_event_output+0x47/0x60
[ 148.765843] [<ffffffff811d36c5>] __perf_event_overflow+0x215/0x240
[ 148.765844] [<ffffffff811d4124>] perf_event_overflow+0x14/0x20
[ 148.765847] [<ffffffff8107e7f4>] intel_pmu_handle_irq+0x1d4/0x440
[ 148.765849] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[ 148.765853] [<ffffffff81219bad>] ? vunmap_page_range+0x19d/0x2f0
[ 148.765854] [<ffffffff81219d11>] ? unmap_kernel_range_noflush+0x11/0x20
[ 148.765859] [<ffffffff814ce6fe>] ? ghes_copy_tofrom_phys+0x11e/0x2a0
[ 148.765863] [<ffffffff8109e5db>] ? native_apic_msr_write+0x2b/0x30
[ 148.765865] [<ffffffff8109e44d>] ? x2apic_send_IPI_self+0x1d/0x20
[ 148.765869] [<ffffffff81065135>] ? arch_irq_work_raise+0x35/0x40
[ 148.765872] [<ffffffff811c8d86>] ? irq_work_queue+0x66/0x80
[ 148.765875] [<ffffffff81075306>] perf_event_nmi_handler+0x26/0x40
[ 148.765877] [<ffffffff81063ed9>] nmi_handle+0x79/0x100
[ 148.765879] [<ffffffff81064422>] default_do_nmi+0x42/0x100
[ 148.765880] [<ffffffff81064563>] do_nmi+0x83/0xb0
[ 148.765884] [<ffffffff818c7c0f>] end_repeat_nmi+0x1e/0x2e
[ 148.765886] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[ 148.765888] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[ 148.765890] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[ 148.765891] <<EOE>> [<ffffffff8110ab66>] finish_task_switch+0x156/0x210
[ 148.765898] [<ffffffff818c1671>] __schedule+0x341/0x920
[ 148.765899] [<ffffffff818c1c87>] schedule+0x37/0x80
[ 148.765903] [<ffffffff810ae1af>] ? do_page_fault+0x2f/0x80
[ 148.765905] [<ffffffff818c1f4a>] schedule_user+0x1a/0x50
[ 148.765907] [<ffffffff818c666c>] retint_careful+0x14/0x32
[ 148.765908] ---[ end trace e33ff2be78e14901 ]---
The CQM task events are not safe to be called from within interrupt
context because they require performing an IPI to read the counter value
on all sockets. And performing IPIs from within IRQ context is a
"no-no".
Make do with the last read counter value currently event in
event->count when we're invoked in this context.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Will Auld <will.auld@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here's a few USB and PHY fixes for 4.2-rc4.
Nothing major, the shortlog has the full details.
All of these have been in linux-next successfully"
* tag 'usb-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
USB: OHCI: fix bad #define in ohci-tmio.c
cdc-acm: Destroy acm_minors IDR on module exit
usb-storage: Add ignore-device quirk for gm12u320 based usb mini projectors
usb-storage: ignore ZTE MF 823 card reader in mode 0x1225
USB: OHCI: Fix race between ED unlink and URB submission
usb: core: lpm: set lpm_capable for root hub device
xhci: do not report PLC when link is in internal resume state
xhci: prevent bus_suspend if SS port resuming in phase 1
xhci: report U3 when link is in resume state
xhci: Calculate old endpoints correctly on device reset
usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function
xhci: Workaround to get D3 working in Intel xHCI
xhci: call BIOS workaround to enable runtime suspend on Intel Braswell
usb: dwc3: Reset the transfer resource index on SET_INTERFACE
usb: gadget: udc: core: Fix argument of dma_map_single for IOMMU
usb: gadget: mv_udc_core: fix phy_regs I/O memory leak
usb: ulpi: ulpi_init should be executed in subsys_initcall
phy: berlin-usb: fix divider for BG2
phy: berlin-usb: fix divider for BG2CD
phy/pxa: add HAS_IOMEM dependency
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are a number of small serial and tty fixes for reported issues.
All have been in linux-next successfully"
* tag 'tty-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: vt: Fix !TASK_RUNNING diagnostic warning from paste_selection()
serial: core: Fix crashes while echoing when closing
m32r: Add ioreadXX/iowriteXX big-endian mmio accessors
Revert "serial: imx: initialized DMA w/o HW flow enabled"
sc16is7xx: fix FIFO address of secondary UART
sc16is7xx: fix Kconfig dependencies
serial: etraxfs-uart: Fix release etraxfs_uart_ports
tty/vt: Fix the memory leak in visual_init
serial: amba-pl011: Fix devm_ioremap_resource return value check
n_tty: signal and flush atomically
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are a number of iio and staging driver fixes for reported issues
for 4.2-rc4.
All have been in linux-next for a while with no problems"
* tag 'staging-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (34 commits)
iio:light:stk3310: make endianness independent of host
iio:light:stk3310: move device register to end of probe
iio: mma8452: use iio event type IIO_EV_TYPE_MAG
iio: mcp320x: Fix NULL pointer dereference
iio: adc: vf610: fix the adc register read fail issue
iio: mlx96014: Replace offset sign
iio: magnetometer: mmc35240: fix SET/RESET sequence
iio: magnetometer: mmc35240: Fix SET/RESET mask
iio: magnetometer: mmc35240: Fix crash in pm suspend
iio:magnetometer:bmc150_magn: output intended variable
iio:magnetometer:bmc150_magn: add regmap dependency
staging: vt6656: check ieee80211_bss_conf bssid not NULL
staging: vt6655: check ieee80211_bss_conf bssid not NULL
iio: tmp006: Check channel info on write
iio: sx9500: Add missing init in sx9500_buffer_pre{en,dis}able()
iio:light:ltr501: fix regmap dependency
iio:light:ltr501: fix variable in ltr501_init
iio: sx9500: fix bug in compensation code
iio: sx9500: rework error handling of raw readings
iio: magnetometer: mmc35240: fix available sampling frequencies
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some char and misc driver fixes for reported issues.
One parport patch is reverted as it was incorrect, thanks to testing
by the 0-day bot"
* tag 'char-misc-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
parport: Revert "parport: fix memory leak"
mei: prevent unloading mei hw modules while the device is opened.
misc: mic: scif bug fix for vmalloc_to_page crash
parport: fix freeing freed memory
parport: fix memory leak
parport: fix error handling
|
|
This reverts commit 23c405912b88 ("parport: fix memory leak")
par_dev->state was already being removed in parport_unregister_device().
Reported-by: Ying Huang <ying.huang@intel.com>
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fix from Steven Rostedt:
"Back in 3.16 the ftrace code was redesigned and cleaned up to remove
the double iteration list (one for registered ftrace ops, and one for
registered "global" ops), to just use one list. That simplified the
code but also broke the function tracing filtering on pid.
This updates the code to handle the filtering again with the new
logic"
* tag 'trace-v4.2-rc2-fix3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix breakage of set_ftrace_pid
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm
Pull libnvdimm fix from Dan Williams:
"A minor fix for the libnvdimm subsystem.
This is not critical. The problem can be worked around in userspace
by putting the namespace temporarily into raw mode
(ndctl_namespace_set_raw_mode() from libndctl), but that is awkward
for management utilities.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm:
libnvdimm: fix namespace seed creation
|
|
Pull md fixes from Neil Brown:
"Some md fixes for 4.2
Several are tagged for -stable.
A few aren't because they are not very, serious or because they are in
the 'experimental' cluster code"
* tag 'md/4.2-fixes' of git://neil.brown.name/md:
md/raid5: clear R5_NeedReplace when no longer needed.
Fix read-balancing during node failure
md-cluster: fix bitmap sub-offset in bitmap_read_sb
md: Return error if request_module fails and returns positive value
md: Skip cluster setup in case of error while reading bitmap
md/raid1: fix test for 'was read error from last working device'.
md: Skip cluster setup for dm-raid
md: flush ->event_work before stopping array.
md/raid10: always set reshape_safe when initializing reshape_position.
md/raid5: avoid races when changing cache size.
|
|
Pull MTD fixes from Brian Norris:
"Two trivial updates. I meant to send these much earlier, but I've
been preoccupied.
- Add MAINTAINERS entry for diskonchip g3 driver
- Fix an overlooked conflict in bitfield value assignments
The latter update is a bit overdue, but there's no reason to wait any
longer"
* tag 'for-linus-20150724' of git://git.infradead.org/linux-mtd:
mtd: nand: Fix NAND_USE_BOUNCE_BUFFER flag conflict
MAINTAINERS: mtd: docg3: add docg3 maintainer
|
|
A new BLK namespace "seed" device is created whenever the current seed
is successfully probed. However, if that namespace is assigned to a BTT
it may never directly experience a successful probe as it is a
subordinate device to a BTT configuration.
The effect of the current code is that no new namespaces can be
instantiated, after the seed namespace, to consume available BLK DPA
capacity. Fix this by treating a successful BTT probe event as a
successful probe event for the backing namespace.
Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|