summaryrefslogtreecommitdiff
path: root/drivers/base
AgeCommit message (Collapse)Author
2013-06-03firmware: move EXPORT_SYMBOL annotationsDaniel Mack
Move EXPORT_SYMBOL annotations so they follow immediately after the closing function brace line. Signed-off-by: Daniel Mack <zonque@gmail.com> Acked-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-03firmware: Avoid deadlock of usermodehelper lock at shutdownTakashi Iwai
When a system goes to reboot/shutdown, it tries to disable the usermode helper via usermodehelper_disable(). This might be blocked when a driver tries to load a firmware beforehand and it's stuck by some reason. For example, dell_rbu driver loads the firmware in non-hotplug mode and waits for user-space clearing the loading sysfs flag. If user-space doesn't clear the flag, it waits forever, thus blocks the reboot, too. As a workaround, in this patch, the firmware class driver registers a reboot notifier so that it can abort all pending f/w bufs before issuing usermodehelper_disable(). Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-03PM / Runtime: Rework the "runtime idle" helper routineRafael J. Wysocki
The "runtime idle" helper routine, rpm_idle(), currently ignores return values from .runtime_idle() callbacks executed by it. However, it turns out that many subsystems use pm_generic_runtime_idle() which checks the return value of the driver's callback and executes pm_runtime_suspend() for the device unless that value is not 0. If that logic is moved to rpm_idle() instead, pm_generic_runtime_idle() can be dropped and its users will not need any .runtime_idle() callbacks any more. Moreover, the PCI, SCSI, and SATA subsystems' .runtime_idle() routines, pci_pm_runtime_idle(), scsi_runtime_idle(), and ata_port_runtime_idle(), respectively, as well as a few drivers' ones may be simplified if rpm_idle() calls rpm_suspend() after 0 has been returned by the .runtime_idle() callback executed by it. To reduce overall code bloat, make the changes described above. Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Kevin Hilman <khilman@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Alan Stern <stern@rowland.harvard.edu>
2013-06-03Merge remote-tracking branch 'regmap/fix/debugfs' into regmap-linusMark Brown
2013-06-03Merge remote-tracking branch 'regmap/fix/cache' into regmap-linusMark Brown
2013-06-03regmap: core: Cache all registers by default when cache is enabledMark Brown
Currently all register maps with a cache need to provide a volatile callback since the default is to assume all registers are volatile. This is not sensible if we have a cache so change the default to be fully cached if a cache is provided. Signed-off-by: Mark Brown <broonie@linaro.org>
2013-06-03regmap: Implemented default cache sync operationMaarten ter Huurne
This can be used for cache types for which syncing values one by one is equally efficient as syncing a range, such as the flat cache. Signed-off-by: Maarten ter Huurne <maarten@treewalker.org> Signed-off-by: Mark Brown <broonie@linaro.org>
2013-06-01Driver core / MM: Drop offline_memory_block()Rafael J. Wysocki
Since offline_memory_block(mem) is functionally equivalent to device_offline(&mem->dev), make the only caller of the former use the latter instead and drop offline_memory_block() entirely. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Toshi Kani <toshi.kani@hp.com>
2013-06-01Driver core / memory: Simplify __memory_block_change_state()Rafael J. Wysocki
As noted by Tang Chen, the last_online field in struct memory_block introduced by commit 4960e05 (Driver core: Introduce offline/online callbacks for memory blocks) is not really necessary, because online_pages() restores the previous state if passed ONLINE_KEEP as the last argument. Therefore, remove that field along with the code referring to it. References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
2013-06-01regmap: rbtree: Fixed node range check on syncMaarten ter Huurne
A node starting before the minimum register is no reason to reject it, since its end could be in range. The check for the end already exists two lines lower, so we can just remove the incorrect check. Signed-off-by: Maarten ter Huurne <maarten@treewalker.org> Signed-off-by: Mark Brown <broonie@linaro.org>
2013-05-29CPU: Fix sysfs cpu/online of offlined CPUsToshi Kani
As reported by Dave Hansen, sysfs cpu/online shows 1 for offlined CPUs at boot. Fix this problem by initializing dev.offline with cpu_online() when registering a CPU. References: https://lkml.org/lkml/2013/5/29/403 Reported-and-tested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-27Merge 3.10-rc3 into driver-core-nextGreg Kroah-Hartman
We want the changes here. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-25regmap: Make regmap-mmio usable from atomic contextsLars-Peter Clausen
regmap-mmio uses a spinlock with spin_lock() and spin_unlock() for locking. To be able to use the regmap API from different contexts (atomic vs non-atomic), without the risk of race conditions, we need to use spin_lock_irqsave() and spin_lock_irqrestore() instead. A new field, the spinlock_flags field, is added to regmap struct to store the flags between regmap_{,un}lock_spinlock(). The spinlock_flags field itself is also protected by the spinlock. Thanks to Stephen Warren for the suggestion of this particular solution. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-23Merge tag 'driver-core-3.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg Kroah-Hartman: "Here are 3 tiny driver core fixes for 3.10-rc2. A needed symbol export, a change to make it easier to track down offending sysfs files with incorrect attributes, and a klist bugfix. All have been in linux-next for a while" * tag 'driver-core-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: klist: del waiter from klist_remove_waiters before wakeup waitting process driver core: print sysfs attribute name when warning about bogus permissions driver core: export subsys_virtual_register
2013-05-23regmap: regcache: Fixup locking for custom lock callbacksLars-Peter Clausen
The parameter passed to the regmap lock/unlock callbacks needs to be map->lock_arg, regcache passes just map. This works fine in the case that no custom locking callbacks are used since in this case map->lock_arg equals map, but will break when custom locking callbacks are used. The issue was introduced in commit 0d4529c5("regmap: make lock/unlock functions customizable") and is fixed by this patch. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-23regmap: regcache: Fixup locking for custom lock callbacksLars-Peter Clausen
The parameter passed to the regmap lock/unlock callbacks needs to be map->lock_arg, regcache passes just map. This works fine in the case that no custom locking callbacks are used, since in this case map->lock_arg equals map, but will break when custom locking callbacks are used. The issue was introduced in commit 0d4529c5 ("regmap: make lock/unlock functions customizable") and is fixed by this patch. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-21cpu: make sure that cpu/online file created before KOBJ_ADD is emittedIgor Mammedov
It fixes race between udev and hotplugged CPU registration by defining "online" attribute statically, so that device_add() would create it before notifying udev about new CPU. Original issue report is at https://lkml.org/lkml/2012/4/30/198 " > On Mon, Apr 30, 2012 at 11:36:23AM -0400, Konrad Rzeszutek Wilk wrote: > > Hey Greg, > > > > Hoping you can help with some guidance on how to fix this. > > > > The issue is with CPU hotplug is that when a CPU goes up > > it calls 'arch_register_cpu' which eventually calls > > register_cpu. That function does these two things: > > > > 251 error = device_register(&cpu->dev); > > 252 if (!error && cpu->hotpluggable) > > 253 register_cpu_control(cpu); > > > > and the device_register creates a nice little SysFS directory: > > > > /sys/devices/system/cpu/cpu2/ which at line 251 has the 'add' attribute > > but no 'online' attribute. udev then tries to echo 1 to the 'online' > > and it we get: > > udevd-work[2421]: error opening ATTR{/sys/devices/system/cpu/cpu2/online} for writing: No such file or directory > > > > Line 253 creates said 'online' and at that time udev [or the system admin] > > can write 1 to 'online' and the CPU goes up. > > > > So .. any thoughts? Is there some way to inhibit from uevent being sent > > until line 253 has run? " Signed-off-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-21cpu: fix "crash_notes" and "crash_notes_size" leaks in register_cpu()Igor Mammedov
"crash_notes" and "crash_notes_size" are dynamically created with device_create_file() but aren't deleted anywhere. Define "crash_notes" and "crash_notes_size" statically via attribute groups so that device_register would create them automatically and files would be destroyed when CPU is destroyed. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-21base/core.c: improve comment of the function device_find_child()Federico Vaga
Signed-off-by: Federico Vaga <federico.vaga@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-21driver core: print sysfs attribute name when warning about bogus permissionsdyoung@redhat.com
Make it obvious to see what attribute is using bogus permissions. Signed-off-by: Dave Young <dyoung@redhat.com> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-21driver core: export subsys_virtual_registerGreg Kroah-Hartman
Modules want to call this function, so it needs to be exported. Reported-by: Daniel Mack <zonque@gmail.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-14regmap: debugfs: Fix return from regmap_debugfs_get_dump_startSrinivas Kandagatla
regmap_debugfs_get_dump_start should return the offset of the register it should start reading from, However in the current code at one point the code does not return correct register offset. With this patch all the returns from this function takes reg_stride in to consideration to return correct offset. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-12regmap: debugfs: Don't mark lockdep as broken due to debugfs writeMark Brown
A register write to hardware is reasonably unlikely to cause locking dependency issues, the reason we're tainting is that unexpected changes in the hardware configuration may confuse drivers. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-12regmap: debugfs: Check return value of regmap_write()Dimitris Papastamos
Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-12regmap: rbtree: Use range information to allocate nodesMark Brown
If range information has been provided then when we allocate a rbnode within a range allocate the entire range. The goal is to minimise the number of reallocations done when combining or extending blocks. At present only readability and yes_ranges are taken into account, this is expected to cover most cases efficiently. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-12regmap: rbtree: Factor out node allocationMark Brown
In preparation for being slightly smarter about how we allocate memory factor out the node allocation. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-12regmap: Make regmap_check_range_table() a public APIMark Brown
Allow drivers to use an access table as part of their implementation. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-12regmap: Add support for discarding parts of the register cacheMark Brown
Allow drivers to discard parts of the register cache, for example if part of the hardware has been reset. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-05-12Driver core: Introduce offline/online callbacks for memory blocksRafael J. Wysocki
Introduce .offline() and .online() callbacks for memory_subsys that will allow the generic device_offline() and device_online() to be used with device objects representing memory blocks. That, in turn, allows the ACPI subsystem to use device_offline() to put removable memory blocks offline, if possible, before removing memory modules holding them. The 'online' sysfs attribute of memory block devices will attempt to put them offline if 0 is written to it and will attempt to apply the previously used online type when onlining them (i.e. when 1 is written to it). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
2013-05-12ACPI / processor: Use common hotplug infrastructureRafael J. Wysocki
Split the ACPI processor driver into two parts, one that is non-modular, resides in the ACPI core and handles the enumeration and hotplug of processors and one that implements the rest of the existing processor driver functionality. The non-modular part uses an ACPI scan handler object to enumerate processors on the basis of information provided by the ACPI namespace and to hook up with the common ACPI hotplug infrastructure. It also populates the ACPI handle of each processor device having a corresponding object in the ACPI namespace, which allows the driver proper to bind to those devices, and makes the driver bind to them if it is readily available (i.e. loaded) when the scan handler's .attach() routine is running. There are a few reasons to make this change. First, switching the ACPI processor driver to using the common ACPI hotplug infrastructure reduces code duplication and size considerably, even though a new file is created along with a header comment etc. Second, since the common hotplug code attempts to offline devices before starting the (non-reversible) removal procedure, it will abort (and possibly roll back) hot-remove operations involving processors if cpu_down() returns an error code for one of them instead of continuing them blindly (if /sys/firmware/acpi/hotplug/force_remove is unset). That is a more desirable behavior than what the current code does. Finally, the separation of the scan/hotplug part from the driver proper makes it possible to simplify the driver's .remove() routine, because it doesn't need to worry about the possible cleanup related to processor removal any more (the scan/hotplug part is responsible for that now) and can handle device removal and driver removal symmetricaly (i.e. as appropriate). Some user-visible changes in sysfs are made (for example, the 'sysdev' link from the ACPI device node to the processor device's directory is gone and a 'physical_node' link is present instead and a corresponding 'firmware_node' is present in the processor device's directory, the processor driver is now visible under /sys/bus/cpu/drivers/ and bound to the processor device), but that shouldn't affect the functionality that users care about (frequency scaling, C-states and thermal management). Tested on my venerable Toshiba Portege R500. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
2013-05-12Driver core: Use generic offline/online for CPU offline/onlineRafael J. Wysocki
Rework the CPU hotplug code in drivers/base/cpu.c to use the generic offline/online support introduced previously instead of its own CPU-specific code. For this purpose, modify cpu_subsys to provide offline and online callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling the CPU-specific 'online' sysfs attribute. This modification is not supposed to change the user-observable behavior of the kernel (i.e. the 'online' attribute will be present in exactly the same place in sysfs and should trigger exactly the same actions as before). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
2013-05-12Driver core: Add offline/online device operationsRafael J. Wysocki
In some cases, graceful hot-removal of devices is not possible, although in principle the devices in question support hotplug. For example, that may happen for the last CPU in the system or for memory modules holding kernel memory. In those cases it is nice to be able to check if the given device can be gracefully hot-removed before triggering a removal procedure that cannot be aborted or reversed. Unfortunately, however, the kernel currently doesn't provide any support for that. To address that deficiency, introduce support for offline and online operations that can be performed on devices, respectively, before a hot-removal and in case when it is necessary (or convenient) to put a device back online after a successful offline (that has not been followed by removal). The idea is that the offline will fail whenever the given device cannot be gracefully removed from the system and it will not be allowed to use the device after a successful offline (until a subsequent online) in analogy with the existing CPU offline/online mechanism. For now, the offline and online operations are introduced at the bus type level, as that should be sufficient for the most urgent use cases (CPUs and memory modules). In the future, however, the approach may be extended to cover some more complicated device offline/online scenarios involving device drivers etc. The lock_device_hotplug() and unlock_device_hotplug() functions are introduced because subsequent patches need to put larger pieces of code under device_hotplug_lock to prevent race conditions between device offline and removal from happening. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
2013-05-12PM: Avoid calling kfree() under spinlock in dev_pm_put_subsys_data()Shuah Khan
Fix dev_pm_put_subsys_data() so that it doesn't call kfree() under a spinlock and make it return 1 whenever it leaves NULL power.subsys_data (regardless of the reason). Signed-off-by: Shuah Khan <shuah.kh@samsung.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "Assorted fixes and cleanups to the existing drivers plus a new driver for IMS Passenger Control Unit device they use for ther in-flight entertainment system." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (44 commits) Input: trackpoint - Optimize trackpoint init to use power-on reset Input: apbps2 - convert to devm_ioremap_resource() Input: ALPS - use %ph to print buffers ARM - shmobile: Armadillo800EVA: Move st1232 reset pin handling Input: st1232 - add reset pin handling Input: st1232 - convert to devm_* infrastructure Input: MT - handle semi-mt devices in core Input: adxl34x - use spi_get_drvdata() Input: ad7877 - use spi_get_drvdata() and spi_set_drvdata() Input: ads7846 - use spi_get_drvdata() and spi_set_drvdata() Input: ims-pcu - fix a memory leak on error Input: sysrq - supplement reset sequence with timeout functionality Input: tegra-kbc - support for defining row/columns based on SoC Input: imx_keypad - switch to using managed resources Input: arc_ps2 - add support for device tree Input: mma8450 - fix signed 12bits to 32bits conversion Input: eeti_ts - remove redundant null check Input: edt-ft5x06 - remove redundant null check before kfree Input: ad714x - add CONFIG_PM_SLEEP to suspend/resume functions Input: adxl34x - add CONFIG_PM_SLEEP to suspend/resume functions ...
2013-05-01dma-buf: Add debugfs supportSumit Semwal
Add debugfs support to make it easier to print debug information about the dma-buf buffers. Cc: Dave Airlie <airlied@redhat.com> [minor fixes on init and warning fix] Cc: Dan Carpenter <dan.carpenter@oracle.com> [remove double unlock in fail case] Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2013-05-01dma-buf: replace dma_buf_export() with dma_buf_export_named()Sumit Semwal
For debugging purposes, it is useful to have a name-string added while exporting buffers. Hence, dma_buf_export() is replaced with dma_buf_export_named(), which additionally takes 'exp_name' as a parameter. For backward compatibility, and for lazy exporters who don't wish to name themselves, a #define dma_buf_export() is also made available, which adds a __FILE__ instead of 'exp_name'. Cc: Daniel Vetter <daniel.vetter@ffwll.ch> [Thanks for the idea!] Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2013-04-30Merge tag 'pm+acpi-3.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael J Wysocki: - ARM big.LITTLE cpufreq driver from Viresh Kumar. - exynos5440 cpufreq driver from Amit Daniel Kachhap. - cpufreq core cleanup and code consolidation from Viresh Kumar and Stratos Karafotis. - cpufreq scalability improvement from Nathan Zimmer. - AMD "frequency sensitivity feedback" powersave bias for the ondemand cpufreq governor from Jacob Shin. - cpuidle code consolidation and cleanups from Daniel Lezcano. - ARM OMAP cpuidle fixes from Santosh Shilimkar and Daniel Lezcano. - ACPICA fixes and other improvements from Bob Moore, Jung-uk Kim, Lv Zheng, Yinghai Lu, Tang Chen, Colin Ian King, and Linn Crosetto. - ACPI core updates related to hotplug from Toshi Kani, Paul Bolle, Yasuaki Ishimatsu, and Rafael J Wysocki. - Intel Lynxpoint LPSS (Low-Power Subsystem) support improvements from Rafael J Wysocki and Andy Shevchenko. * tag 'pm+acpi-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (192 commits) cpufreq: Revert incorrect commit 5800043 cpufreq: MAINTAINERS: Add co-maintainer cpuidle: add maintainer entry ACPI / thermal: do not always return THERMAL_TREND_RAISING for active trip points ARM: s3c64xx: cpuidle: use init/exit common routine cpufreq: pxa2xx: initialize variables ACPI: video: correct acpi_video_bus_add error processing SH: cpuidle: use init/exit common routine ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y ACPI: Fix wrong parameter passed to memblock_reserve cpuidle: fix comment format pnp: use %*phC to dump small buffers isapnp: remove debug leftovers ARM: imx: cpuidle: use init/exit common routine ARM: davinci: cpuidle: use init/exit common routine ARM: kirkwood: cpuidle: use init/exit common routine ARM: calxeda: cpuidle: use init/exit common routine ARM: tegra: cpuidle: use init/exit common routine for tegra3 ARM: tegra: cpuidle: use init/exit common routine for tegra2 ARM: OMAP4: cpuidle: use init/exit common routine ...
2013-04-30Merge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: "A lot of activities on workqueue side this time. The changes achieve the followings. - WQ_UNBOUND workqueues - the workqueues which are per-cpu - are updated to be able to interface with multiple backend worker pools. This involved a lot of churning but the end result seems actually neater as unbound workqueues are now a lot closer to per-cpu ones. - The ability to interface with multiple backend worker pools are used to implement unbound workqueues with custom attributes. Currently the supported attributes are the nice level and CPU affinity. It may be expanded to include cgroup association in future. The attributes can be specified either by calling apply_workqueue_attrs() or through /sys/bus/workqueue/WQ_NAME/* if the workqueue in question is exported through sysfs. The backend worker pools are keyed by the actual attributes and shared by any workqueues which share the same attributes. When attributes of a workqueue are changed, the workqueue binds to the worker pool with the specified attributes while leaving the work items which are already executing in its previous worker pools alone. This allows converting custom worker pool implementations which want worker attribute tuning to use workqueues. The writeback pool is already converted in block tree and there are a couple others are likely to follow including btrfs io workers. - WQ_UNBOUND's ability to bind to multiple worker pools is also used to make it NUMA-aware. Because there's no association between work item issuer and the specific worker assigned to execute it, before this change, using unbound workqueue led to unnecessary cross-node bouncing and it couldn't be helped by autonuma as it requires tasks to have implicit node affinity and workers are assigned randomly. After these changes, an unbound workqueue now binds to multiple NUMA-affine worker pools so that queued work items are executed in the same node. This is turned on by default but can be disabled system-wide or for individual workqueues. Crypto was requesting NUMA affinity as encrypting data across different nodes can contribute noticeable overhead and doing it per-cpu was too limiting for certain cases and IO throughput could be bottlenecked by one CPU being fully occupied while others have idle cycles. While the new features required a lot of changes including restructuring locking, it didn't complicate the execution paths much. The unbound workqueue handling is now closer to per-cpu ones and the new features are implemented by simply associating a workqueue with different sets of backend worker pools without changing queue, execution or flush paths. As such, even though the amount of change is very high, I feel relatively safe in that it isn't likely to cause subtle issues with basic correctness of work item execution and handling. If something is wrong, it's likely to show up as being associated with worker pools with the wrong attributes or OOPS while workqueue attributes are being changed or during CPU hotplug. While this creates more backend worker pools, it doesn't add too many more workers unless, of course, there are many workqueues with unique combinations of attributes. Assuming everything else is the same, NUMA awareness costs an extra worker pool per NUMA node with online CPUs. There are also a couple things which are being routed outside the workqueue tree. - block tree pulled in workqueue for-3.10 so that writeback worker pool can be converted to unbound workqueue with sysfs control exposed. This simplifies the code, makes writeback workers NUMA-aware and allows tuning nice level and CPU affinity via sysfs. - The conversion to workqueue means that there's no 1:1 association between a specific worker, which makes writeback folks unhappy as they want to be able to tell which filesystem caused a problem from backtrace on systems with many filesystems mounted. This is resolved by allowing work items to set debug info string which is printed when the task is dumped. As this change involves unifying implementations of dump_stack() and friends in arch codes, it's being routed through Andrew's -mm tree." * 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (84 commits) workqueue: use kmem_cache_free() instead of kfree() workqueue: avoid false negative WARN_ON() in destroy_workqueue() workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity workqueue: implement NUMA affinity for unbound workqueues workqueue: introduce put_pwq_unlocked() workqueue: introduce numa_pwq_tbl_install() workqueue: use NUMA-aware allocation for pool_workqueues workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq() workqueue: map an unbound workqueues to multiple per-node pool_workqueues workqueue: move hot fields of workqueue_struct to the end workqueue: make workqueue->name[] fixed len workqueue: add workqueue->unbound_attrs workqueue: determine NUMA node of workers accourding to the allowed cpumask workqueue: drop 'H' from kworker names of unbound worker pools workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[] workqueue: move pwq_pool_locking outside of get/put_unbound_pool() workqueue: fix memory leak in apply_workqueue_attrs() workqueue: fix unbound workqueue attrs hashing / comparison workqueue: fix race condition in unbound workqueue free path workqueue: remove pwq_lock which is no longer used ...
2013-04-30Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge first batch of fixes from Andrew Morton: - A couple of kthread changes - A few minor audit patches - A number of fbdev patches. Florian remains AWOL so I'm picking up some of these. - A few kbuild things - ocfs2 updates - Almost all of the MM queue (And in the meantime, I already have the second big batch from Andrew pending in my mailbox ;^) * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (149 commits) memcg: take reference before releasing rcu_read_lock mem hotunplug: fix kfree() of bootmem memory mmKconfig: add an option to disable bounce mm, nobootmem: do memset() after memblock_reserve() mm, nobootmem: clean-up of free_low_memory_core_early() fs/buffer.c: remove unnecessary init operation after allocating buffer_head. numa, cpu hotplug: change links of CPU and node when changing node number by onlining CPU mm: fix memory_hotplug.c printk format warning mm: swap: mark swap pages writeback before queueing for direct IO swap: redirty page if page write fails on swap file mm, memcg: give exiting processes access to memory reserves thp: fix huge zero page logic for page with pfn == 0 memcg: avoid accessing memcg after releasing reference fs: fix fsync() error reporting memblock: fix missing comment of memblock_insert_region() mm: Remove unused parameter of pages_correctly_reserved() firmware, memmap: fix firmware_map_entry leak mm/vmstat: add note on safety of drain_zonestat mm: thp: add split tail pages to shrink page list in page reclaim mm: allow for outstanding swap writeback accounting ...
2013-04-29Merge tag 'regmap-v3.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "In user visible terms just a couple of enhancements here, though there was a moderate amount of refactoring required in order to support the register cache sync performance improvements. - Support for block and asynchronous I/O during register cache syncing; this provides a use case dependant performance improvement. - Additional debugfs information on the memory consuption and register set" * tag 'regmap-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (23 commits) regmap: don't corrupt work buffer in _regmap_raw_write() regmap: cache: Fix format specifier in dev_dbg regmap: cache: Make regcache_sync_block_raw static regmap: cache: Write consecutive registers in a single block write regmap: cache: Split raw and non-raw syncs regmap: cache: Factor out block sync regmap: cache: Factor out reg_present support from rbtree cache regmap: cache: Use raw I/O to sync rbtrees if we can regmap: core: Provide regmap_can_raw_write() operation regmap: cache: Provide a get address of value operation regmap: Cut down on the average # of nodes in the rbtree cache regmap: core: Make raw write available to regcache regmap: core: Warn on invalid operation combinations regmap: irq: Clarify error message when we fail to request primary IRQ regmap: rbtree Expose total memory consumption in the rbtree debugfs entry regmap: debugfs: Add a registers `range' file regmap: debugfs: Simplify calculation of `c->max_reg' regmap: cache: Store caches in native register format where possible regmap: core: Split out in place value parsing regmap: cache: Use regcache_get_value() to check if we updated ...
2013-04-29numa, cpu hotplug: change links of CPU and node when changing node number by ↵Yasuaki Ishimatsu
onlining CPU When booting x86 system contains memoryless node, node numbers of CPUs on memoryless node were changed to nearest online node number by init_cpu_to_node() because the node is not online. In my system, node numbers of cpu#30-44 and 75-89 were changed from 2 to 0 as follows: $ numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 node 0 size: 32394 MB node 0 free: 27898 MB node 1 cpus: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 node 1 size: 32768 MB node 1 free: 30335 MB If we hot add memory to memoryless node and offine/online all CPUs on the node, node numbers of these CPUs are changed to correct node numbers by srat_detect_node() because the node become online. In this case, node numbers of cpu#30-44 and 75-89 were changed from 0 to 2 in my system as follows: $ numactl --hardware available: 3 nodes (0-2) node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 node 0 size: 32394 MB node 0 free: 27218 MB node 1 cpus: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 node 1 size: 32768 MB node 1 free: 30014 MB node 2 cpus: 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 node 2 size: 16384 MB node 2 free: 16384 MB But "cpu to node" and "node to cpu" links were not changed as follows: $ ls /sys/devices/system/cpu/cpu30/|grep node node0 $ ls /sys/devices/system/node/node0/|grep cpu30 cpu30 "numactl --hardware" shows that cpu30 belongs to node 2. But sysfs links does not change. This patch changes "cpu to node" and "node to cpu" links when node number changed by onlining CPU. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm: Remove unused parameter of pages_correctly_reserved()Tang Chen
nr_pages is not used in pages_correctly_reserved(). So remove it. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com> Reviewed-by: Wen Congyang <wency@cn.fujitsu.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, hotplug: avoid compiling memory hotremove functions when disabledDavid Rientjes
__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE. PowerPC pseries will return -EOPNOTSUPP if unsupported. Adding an #ifdef causes several other functions it depends on to also become unnecessary, which saves in .text when disabled (it's disabled in most defconfigs besides powerpc, including x86). remove_memory_block() becomes static since it is not referenced outside of drivers/base/memory.c. Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled and disabled. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29drivers/base/node.c: switch to register_hotmemory_notifier()Andrew Morton
Squishes a warning which my change to hotplug_memory_notifier() added. I want to keep that warning, because it is punishment for failnig to check the hotplug_memory_notifier() return value. Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-27Merge branch 'pm-assorted'Rafael J. Wysocki
* pm-assorted: PM / OPP: add documentation to RCU head in struct opp PM / sleep: invalidate TEST_CPUS and TEST_CORE support for freeze state PM / sleep: add TEST_PLATFORM support for freeze state
2013-04-27Merge branch 'pm-runtime'Rafael J. Wysocki
* pm-runtime: PM / Runtime: Improve prepare handling at system suspend for genpd PM / Runtime: Asyncronous idle|suspend parent devices at removal PM / Runtime: Asyncronous idle|suspend devices at system resume
2013-04-16Merge remote-tracking branch 'regmap/topic/range' into regmap-nextMark Brown
2013-04-16Merge remote-tracking branch 'regmap/topic/irq' into regmap-nextMark Brown
2013-04-16Merge remote-tracking branch 'regmap/topic/cache' into regmap-nextMark Brown
2013-04-16Merge remote-tracking branch 'regmap/topic/async' into regmap-nextMark Brown