summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2012-02-06mac80211: support hw scan while idleEliad Peller
Currently, mac80211 goes to idle-off before starting a scan. However, some devices that implement hw scan might not need going idle-off in order to perform a hw scan, and thus saving some energy and simplifying their state machine. (Note that this is also the case for sched scan - it currently doesn't make mac80211 go idle-off) Add a new flag to indicate support for hw scan while idle. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06cfg80211/mac80211: userspace peer authorization in IBSSAntonio Quartulli
If the IBSS network is RSN-protected, let userspace authorize the stations instead of adding them as AUTHORIZED by default. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06bcma: add extra sprom checkHauke Mehrtens
This check is needed on the BCM43224 device as it says in the capabilities it has an sprom but is extra check says it has not. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06bcma: add bus num counterHauke Mehrtens
If we have two bcma buses on one computer the second will not work without this patch. Now each bus gets an own number. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06bcma: add PCIe host controllerHauke Mehrtens
Some SoCs have a PCIe host controller to make it possible to attach some other devices to it, like an other Wifi card. This code was tested with an Netgear WNDR3400 (bcm4716 based), but should work with all bcma based SoCs. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06bcma: make some functions __devinitHauke Mehrtens
bcma_core_pci_hostmode_init() has to be in __devinit as it will call a function in that section and so all functions calling it also have to be in __devinit. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06bcma: add constants for PCI and use themHauke Mehrtens
There are many magic numbers used in the PCIe code. Replace them with some constants from the Broadcom SDK and also use them in the pcie host controller. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06bcma: add the core unit numberHauke Mehrtens
Some SoCs have two pcie or gmac cores and we need to know the number of the specific core on the bus. This is the case for the BCM4706. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06cfg80211: export cfg80211_ref_bssJohannes Berg
This is needed by mac80211 to keep a reference to a BSS alive for the auth process. Remove the old version of cfg80211_ref_bss() since it's not actually used. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06cfg80211: stop tracking authenticated stateJohannes Berg
To track authenticated state seems to have been a design mistake in cfg80211. It is possible to have out of band authentication (FT), tracking multiple authentications caused more problems than it ever helped, and the implementation in mac80211 is too complex. Remove all this complexity, and let userspace do whatever it wants to, mac80211 can deal with that just fine. Association is still tracked of course, but authentication no longer is. Local auth state changes are thus no longer of value, so ignore them completely. This will also help implement SAE -- asking the driver to do an authentication is now almost equivalent to sending an authentication frame, with the exception of shared key authentication which is still handled completely. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-06mac80211: add sta_state callbackJohannes Berg
(based on Eliad's patch) Add a callback to notify the low-level driver whenever the state of a station changes. The driver is only notified when the station is actually in the mac80211 hash table, not for pre-insert state transitions. To allow the driver to replace sta_add/remove calls with this, call extra transitions with the NOTEXIST state. This callback can fail, so we need to be careful in handling it when a station is inserted, particularly in the IBSS case where we still keep the station entry around for mac80211 purposes. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-30mac80211: add support for mcs masksSimon Wunderlich
* Handle MCS masks set by the user. * Match rates provided by the rate control algorithm to the mask set, also in HT mode, and switch back to legacy mode if necessary. * add debugfs files to observate the rate selection Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-30nl80211: add support for mcs masksSimon Wunderlich
Allow to set mcs masks through nl80211. We also allow to set MCS rates but no legacy rates (and vice versa). Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-27kernel-doc: fix new warnings in cfg80211.hRandy Dunlap
Fix new kernel-doc warnings: Warning(include/net/cfg80211.h:1165): No description found for parameter 'channel_type' Warning(include/net/cfg80211.h:2090): No description found for parameter 'probe_resp_offload' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: linux-wireless@vger.kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-27{nl,cfg,mac}80211: Add support of setting non-forwarding entity in MeshChun-Yeow Yeoh
A mesh node that joins the mesh network is by default a forwarding entity. This patch allows the mesh node to set as non-forwarding entity. Whenever dot11MeshForwarding is set to 0, the mesh node can prevent itself from forwarding the traffic which is not destined to him. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-27mac80211: make CQM RSSI support per virtual interfaceJohannes Berg
Similar to the previous beacon filtering patch, make CQM RSSI support depend on the flags that the driver set for virtual interfaces. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Luciano Coelho <coelho@ti.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-27mac80211: make beacon filtering per virtual interfaceJohannes Berg
Due to firmware limitations, we may not be able to support beacon filtering on all virtual interfaces. To allow this in mac80211, introduce per-interface driver capability flags that the driver sets when an interface is added. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Luciano Coelho <coelho@ti.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24NFC: Add NCI multiple targets supportIlan Elias
Add the ability to select between multiple targets in NCI. If only one target is found, it will be auto-activated. If more than one target is found, then DISCOVER_NTF will be generated for each target, and the host should select one by calling DISCOVER_SELECT_CMD. Then, the target will be activated. If the activation fails, GENERIC_ERROR_NTF is generated. Signed-off-by: Ilan Elias <ilane@ti.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24NFC: NFC core layer should not set the target_idxIlan Elias
The NFC core layer should not set the target_idx. Instead, the driver layer (e.g. NCI, PN533) should set the target_idx, so that it will be able to identify the target when its I/F (e.g. activate_target) is called. This is required in order to support multiple targets. Note that currently supported drivers (PN533 and NCI) don't use the target_idx in their implementation. Signed-off-by: Ilan Elias <ilane@ti.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24NFC: Clearly separate NCI states from flagsIlan Elias
Make a clear separation between NCI states and flags. This is required in order to support more NCI states (e.g. for multiple targets support). Signed-off-by: Ilan Elias <ilane@ti.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24NFC: Add NCI data exchange timerIlan Elias
Add NCI data exchange timer to catch timeouts, and call the data exchange callback with an error. Signed-off-by: Ilan Elias <ilane@ti.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24NFC: Export new attributes sensb_res and sensf_resIlan Elias
Export new attributes sensb_res for tech B and sensf_res for tech F in the target info (returned as a response to NFC_CMD_GET_TARGET). The max size of the attributes nfcid1, sensb_res and sensf_res is exported to user space though include/linux/nfc. Signed-off-by: Ilan Elias <ilane@ti.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24wireless: Save original maximum regulatory transmission power for the ↵Hong Wu
calucation of the local maximum transmit power The local maximum transmit power is the maximum power a wireless device allowed to transmit. If Power Constraint is presented, the local maximum power equals to the maximum allowed power defined in regulatory domain minus power constraint. The maximum transmit power is maximum power a wireless device capable of transmitting, and should be used in Power Capability element (7.3.2.16 IEEE802.11 2007). The transmit power from a wireless device should not greater than the local maximum transmit power. The maximum transmit power was not calculated correctly in the current Linux wireless/mac80211 when Power Constraint is presented. Signed-off-by: Hong Wu <hong.wu@dspg.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24NFC: Increase NCI deactivate timeoutIlan Elias
Increase NCI deactivate timeout from 5 sec to 30 sec. NCI deactivate procedure might take a long time, depending on the local and remote parameters. Signed-off-by: Ilan Elias <ilane@ti.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-24ssb: SPROM: extract each core power infoRafał Miłecki
We already extract some basic info but it's incomplete, reads info about the first core only. Used data structure doesn't allow easy adding of more cores. This patch adds new struct and array for storing power info. The plan is to: switch all extractors (including the ones using NVRAM) to new struct, switch drivers, then deprecate and finally drop old SSB fields. Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-17Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
2012-01-17net: fix some sparse errorsEric Dumazet
make C=2 CF="-D__CHECK_ENDIAN__" M=net And fix flowi4_init_output() prototype for sport Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-17bcma: connect the bcma bus suspend/resume to the bcma driver suspend/resumeLinus Torvalds
Now the low-level driver actually gets informed that it is getting suspended and resumed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-16netfilter: revert user-space expectation helper supportPablo Neira Ayuso
This patch partially reverts: 3d058d7 netfilter: rework user-space expectation helper support that was applied during the 3.2 development cycle. After this patch, the tree remains just like before patch bc01bef, that initially added the preliminary infrastructure. I decided to partially revert this patch because the approach that I proposed to resolve this problem is broken in NAT setups. Moreover, a new infrastructure will be submitted for the 3.3.x development cycle that resolve the existing issues while providing a neat solution. Since nobody has been seriously using this infrastructure in user-space, the removal of this feature should affect any know FOSS project (to my knowledge). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-01-13vhost-net: add module alias (v2.1)stephen hemminger
By adding some module aliases, programs (or users) won't have to explicitly call modprobe. Vhost-net will always be available if built into the kernel. It does require assigning a permanent minor number for depmod to work. Also: - use C99 style initialization. - add missing entry in documentation for loop-control Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-By: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-13Merge branch 'akpm' (aka "Andrew's patch-bomb, take two")Linus Torvalds
Andrew explains: - various misc stuff - Most of the rest of MM: memcg, threaded hugepages, others. - cpumask - kexec - kdump - some direct-io performance tweaking - radix-tree optimisations - new selftests code A note on this: often people will develop a new userspace-visible feature and will develop userspace code to exercise/test that feature. Then they merge the patch and the selftest code dies. Sometimes we paste it into the changelog. Sometimes the code gets thrown into Documentation/(!). This saddens me. So this patch creates a bare-bones framework which will henceforth allow me to ask people to include their test apps in the kernel tree so we can keep them alive. Then when people enhance or fix the feature, I can ask them to update the test app too. The infrastruture is terribly trivial at present - let's see how it evolves. - checkpoint/restart feature work. A note on this: this is a project by various mad Russians to perform c/r mainly from userspace, with various oddball helper code added into the kernel where the need is demonstrated. So rather than some large central lump of code, what we have is little bits and pieces popping up in various places which either expose something new or which permit something which is normally kernel-private to be modified. The overall project is an ongoing thing. I've judged that the size and scope of the thing means that we're more likely to be successful with it if we integrate the support into mainline piecemeal rather than allowing it all to develop out-of-tree. However I'm less confident than the developers that it will all eventually work! So what I'm asking them to do is to wrap each piece of new code inside CONFIG_CHECKPOINT_RESTORE. So if it all eventually comes to tears and the project as a whole fails, it should be a simple matter to go through and delete all trace of it. This lot pretty much wraps up the -rc1 merge for me. * akpm: (96 commits) unlzo: fix input buffer free ramoops: update parameters only after successful init ramoops: fix use of rounddown_pow_of_two() c/r: prctl: add PR_SET_MM codes to set up mm_struct entries c/r: procfs: add start_data, end_data, start_brk members to /proc/$pid/stat v4 c/r: introduce CHECKPOINT_RESTORE symbol selftests: new x86 breakpoints selftest selftests: new very basic kernel selftests directory radix_tree: take radix_tree_path off stack radix_tree: remove radix_tree_indirect_to_ptr() dio: optimize cache misses in the submission path vfs: cache request_queue in struct block_device fs/direct-io.c: calculate fs_count correctly in get_more_blocks() drivers/parport/parport_pc.c: fix warnings panic: don't print redundant backtraces on oops sysctl: add the kernel.ns_last_pid control kdump: add udev events for memory online/offline include/linux/crash_dump.h needs elf.h kdump: fix crash_kexec()/smp_send_stop() race in panic() kdump: crashk_res init check for /sys/kernel/kexec_crash_size ...
2012-01-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits) pptp: Accept packet with seq zero RDS: Remove some unused iWARP code net: fsl: fec: handle 10Mbps speed in RMII mode drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c: add missing iounmap drivers/net/ethernet/tundra/tsi108_eth.c: add missing iounmap ksz884x: fix mtu for VLAN net_sched: sfq: add optional RED on top of SFQ dp83640: Fix NOHZ local_softirq_pending 08 warning gianfar: Fix invalid TX frames returned on error queue when time stamping gianfar: Fix missing sock reference when processing TX time stamps phylib: introduce mdiobus_alloc_size() net: decrement memcg jump label when limit, not usage, is changed net: reintroduce missing rcu_assign_pointer() calls inet_diag: Rename inet_diag_req_compat into inet_diag_req inet_diag: Rename inet_diag_req into inet_diag_req_v2 bond_alb: don't disable softirq under bond_alb_xmit mac80211: fix rx->key NULL pointer dereference in promiscuous mode nl80211: fix old station flags compatibility mdio-octeon: use an unique MDIO bus name. mdio-gpio: use an unique MDIO bus name. ...
2012-01-13c/r: prctl: add PR_SET_MM codes to set up mm_struct entriesCyrill Gorcunov
When we restore a task we need to set up text, data and data heap sizes from userspace to the values a task had at checkpoint time. This patch adds auxilary prctl codes for that. While most of them have a statistical nature (their values are involved into calculation of /proc/<pid>/statm output) the start_brk and brk values are used to compute an allowed size of program data segment expansion. Which means an arbitrary changes of this values might be dangerous operation. So to restrict access the following requirements applied to prctl calls: - The process has to have CAP_SYS_ADMIN capability granted. - For all opcodes except start_brk/brk members an appropriate VMA area must exist and should fit certain VMA flags, such as: - code segment must be executable but not writable; - data segment must not be executable. start_brk/brk values must not intersect with data segment and must not exceed RLIMIT_DATA resource limit. Still the main guard is CAP_SYS_ADMIN capability check. Note the kernel should be compiled with CONFIG_CHECKPOINT_RESTORE support otherwise these prctl calls will return -EINVAL. [akpm@linux-foundation.org: cache current->mm in a local, saving 200 bytes text] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13radix_tree: remove radix_tree_indirect_to_ptr()Xiao Guangrong
It is not used anymore, remove it Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13vfs: cache request_queue in struct block_deviceAndi Kleen
This makes it possible to get from the inode to the request_queue with one less cache miss. Used in followon optimization. The livetime of the pointer is the same as the gendisk. This assumes that the queue will always stay the same in the gendisk while it's visible to block_devices. I think that's safe correct? Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13include/linux/crash_dump.h needs elf.hFabio Estevam
Building an ARM target we get the following warnings: CC arch/arm/kernel/setup.o In file included from arch/arm/kernel/setup.c:39: arch/arm/include/asm/elf.h:102:1: warning: "vmcore_elf64_check_arch" redefined In file included from arch/arm/kernel/setup.c:24: include/linux/crash_dump.h:30:1: warning: this is the location of the previous definition Quoting Russell King: "linux/crash_dump.h makes no attempt to include asm/elf.h, but it depends on stuff in asm/elf.h to determine how stuff inside this file is defined at parse time. So, if asm/elf.h is included after linux/crash_dump.h or not at all, you get a different result from the situation where asm/elf.h is included before." So add elf.h header to crash_dump.h to avoid this problem. The original discussion about this can be found at: http://www.spinics.net/lists/arm-kernel/msg154113.html Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: <stable@vger.kernel.org> [3.2.1] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13kexec: remove KMSG_DUMP_KEXECWANG Cong
KMSG_DUMP_KEXEC is useless because we already save kernel messages inside /proc/vmcore, and it is unsafe to allow modules to do other stuffs in a crash dump scenario. [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Reported-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: remove del_page_from_lru, add page_off_lruHugh Dickins
del_page_from_lru() repeats del_page_from_lru_list(), also working out which LRU the page was on, clearing the relevant bits. Decouple those functions: remove del_page_from_lru() and add page_off_lru(). Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: enum lru_list lruHugh Dickins
Mostly we use "enum lru_list lru": change those few "l"s to "lru"s. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: fewer underscores in ____pagevec_lru_addHugh Dickins
What's so special about ____pagevec_lru_add() that it needs four leading underscores? Nothing, it just helped to distinguish from __pagevec_lru_add() in 2.6.28 development. Cut two leading underscores. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: take pagevecs off reclaim stackHugh Dickins
Replace pagevecs in putback_lru_pages() and move_active_pages_to_lru() by lists of pages_to_free: then apply Konstantin Khlebnikov's free_hot_cold_page_list() to them instead of pagevec_release(). Which simplifies the flow (no need to drop and retake lock whenever pagevec fills up) and reduces stale addresses in stack backtraces (which often showed through the pagevecs); but more importantly, removes another 120 bytes from the deepest stacks in page reclaim. Although I've not recently seen an actual stack overflow here with a vanilla kernel, move_active_pages_to_lru() has often featured in deep backtraces. However, free_hot_cold_page_list() does not handle compound pages (nor need it: a Transparent HugePage would have been split by the time it reaches the call in shrink_page_list()), but it is possible for putback_lru_pages() or move_active_pages_to_lru() to be left holding the last reference on a THP, so must exclude the unlikely compound case before putting on pages_to_free. Remove pagevec_strip(), its work now done in move_active_pages_to_lru(). The pagevec in scan_mapping_unevictable_pages() remains in mm/vmscan.c, but that is never on the reclaim path, and cannot be replaced by a list. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: compaction: introduce sync-light migration for use by compactionMel Gorman
This patch adds a lightweight sync migrate operation MIGRATE_SYNC_LIGHT mode that avoids writing back pages to backing storage. Async compaction maps to MIGRATE_ASYNC while sync compaction maps to MIGRATE_SYNC_LIGHT. For other migrate_pages users such as memory hotplug, MIGRATE_SYNC is used. This avoids sync compaction stalling for an excessive length of time, particularly when copying files to a USB stick where there might be a large number of dirty pages backed by a filesystem that does not support ->writepages. [aarcange@redhat.com: This patch is heavily based on Andrea's work] [akpm@linux-foundation.org: fix fs/nfs/write.c build] [akpm@linux-foundation.org: fix fs/btrfs/disk-io.c build] Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: compaction: make isolate_lru_page() filter-aware againMel Gorman
Commit 39deaf85 ("mm: compaction: make isolate_lru_page() filter-aware") noted that compaction does not migrate dirty or writeback pages and that is was meaningless to pick the page and re-add it to the LRU list. This had to be partially reverted because some dirty pages can be migrated by compaction without blocking. This patch updates "mm: compaction: make isolate_lru_page" by skipping over pages that migration has no possibility of migrating to minimise LRU disruption. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel<riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: compaction: determine if dirty pages can be migrated without blocking ↵Mel Gorman
within ->migratepage Asynchronous compaction is used when allocating transparent hugepages to avoid blocking for long periods of time. Due to reports of stalling, there was a debate on disabling synchronous compaction but this severely impacted allocation success rates. Part of the reason was that many dirty pages are skipped in asynchronous compaction by the following check; if (PageDirty(page) && !sync && mapping->a_ops->migratepage != migrate_page) rc = -EBUSY; This skips over all mapping aops using buffer_migrate_page() even though it is possible to migrate some of these pages without blocking. This patch updates the ->migratepage callback with a "sync" parameter. It is the responsibility of the callback to fail gracefully if migration would block. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13vmscan/trace: Add 'file' info to trace_mm_vmscan_lru_isolate()Tao Ma
In trace_mm_vmscan_lru_isolate(), we don't output 'file' information to the trace event and it is a bit inconvenient for the user to get the real information(like pasted below). mm_vmscan_lru_isolate: isolate_mode=2 order=0 nr_requested=32 nr_scanned=32 nr_taken=32 contig_taken=0 contig_dirty=0 contig_failed=0 'active' can be obtained by analyzing mode(Thanks go to Minchan and Mel), So this patch adds 'file' to the trace event and it now looks like: mm_vmscan_lru_isolate: isolate_mode=2 order=0 nr_requested=32 nr_scanned=32 nr_taken=32 contig_taken=0 contig_dirty=0 contig_failed=0 file=0 Signed-off-by: Tao Ma <boyu.mt@taobao.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13thp: add tlb_remove_pmd_tlb_entryShaohua Li
We have tlb_remove_tlb_entry to indicate a pte tlb flush entry should be flushed, but not a corresponding API for pmd entry. This isn't a problem so far because THP is only for x86 currently and tlb_flush() under x86 will flush entire TLB. But this is confusion and could be missed if thp is ported to other arch. Also convert tlb->need_flush = 1 to a VM_BUG_ON(!tlb->need_flush) in __tlb_remove_page() as suggested by Andrea Arcangeli. The __tlb_remove_page() function is supposed to be called after tlb_remove_xxx_tlb_entry() and we can catch any misuse. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13memcg: simplify LRU handling by new ruleKAMEZAWA Hiroyuki
Now, at LRU handling, memory cgroup needs to do complicated works to see valid pc->mem_cgroup, which may be overwritten. This patch is for relaxing the protocol. This patch guarantees - when pc->mem_cgroup is overwritten, page must not be on LRU. By this, LRU routine can believe pc->mem_cgroup and don't need to check bits on pc->flags. This new rule may adds small overheads to swapin. But in most case, lru handling gets faster. After this patch, PCG_ACCT_LRU bit is obsolete and removed. [akpm@linux-foundation.org: remove unneeded VM_BUG_ON(), restore hannes's christmas tree] [akpm@linux-foundation.org: clean up code comment] [hughd@google.com: fix NULL mem_cgroup_try_charge] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Ying Han <yinghan@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13memcg: clear pc->mem_cgroup if necessary.KAMEZAWA Hiroyuki
This is a preparation before removing a flag PCG_ACCT_LRU in page_cgroup and reducing atomic ops/complexity in memcg LRU handling. In some cases, pages are added to lru before charge to memcg and pages are not classfied to memory cgroup at lru addtion. Now, the lru where the page should be added is determined a bit in page_cgroup->flags and pc->mem_cgroup. I'd like to remove the check of flag. To handle the case pc->mem_cgroup may contain stale pointers if pages are added to LRU before classification. This patch resets pc->mem_cgroup to root_mem_cgroup before lru additions. [akpm@linux-foundation.org: fix CONFIG_CGROUP_MEM_CONT=n build] [hughd@google.com: fix CONFIG_CGROUP_MEM_RES_CTLR=y CONFIG_CGROUP_MEM_RES_CTLR_SWAP=n build] [akpm@linux-foundation.org: ksm.c needs memcontrol.h, per Michal] [hughd@google.com: stop oops in mem_cgroup_reset_owner()] [hughd@google.com: fix page migration to reset_owner] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Ying Han <yinghan@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13page_cgroup: add helper function to get swap_cgroupBob Liu
There are multiple places which need to get the swap_cgroup address, so add a helper function: static struct swap_cgroup *swap_cgroup_getsc(swp_entry_t ent, struct swap_cgroup_ctrl **ctrl); to simplify the code. Signed-off-by: Bob Liu <lliubbo@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13mm: unify remaining mem_cont, mem, etc. variable names to memcgJohannes Weiner
Signed-off-by: Johannes Weiner <jweiner@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>