summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-04-09Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Highlights: - server-side nfs/rdma fixes from Jeff Layton and Tom Tucker - xdr fixes (a larger xdr rewrite has been posted but I decided it would be better to queue it up for 3.16). - miscellaneous fixes and cleanup from all over (thanks especially to Kinglong Mee)" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits) nfsd4: don't create unnecessary mask acl nfsd: revert v2 half of "nfsd: don't return high mode bits" nfsd4: fix memory leak in nfsd4_encode_fattr() nfsd: check passed socket's net matches NFSd superblock's one SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp SUNRPC: New helper for creating client with rpc_xprt NFSD: Free backchannel xprt in bc_destroy NFSD: Clear wcc data between compound ops nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+ nfsd4: fix nfs4err_resource in 4.1 case nfsd4: fix setclientid encode size nfsd4: remove redundant check from nfsd4_check_resp_size nfsd4: use more generous NFS4_ACL_MAX nfsd4: minor nfsd4_replay_cache_entry cleanup nfsd4: nfsd4_replay_cache_entry should be static nfsd4: update comments with obsolete function name rpc: Allow xdr_buf_subsegment to operate in-place NFSD: Using free_conn free connection SUNRPC: fix memory leak of peer addresses in XPRT ...
2014-04-08Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge a few more patches from Andrew Morton: "A few leftovers" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: fs/ncpfs/dir.c: fix indenting in ncp_lookup() ncpfs/inode.c: fix mismatch printk formats and arguments ncpfs: remove now unused PRINTK macro ncpfs: convert PPRINTK to ncp_vdbg ncpfs: convert DPRINTK/DDPRINTK to ncp_dbg ncpfs: Add pr_fmt and convert printks to pr_<level> arch/x86/mm/kmemcheck/kmemcheck.c: use kstrtoint() instead of sscanf() lib/percpu_counter.c: fix bad percpu counter state during suspend autofs4: check dev ioctl size before allocating mm: vmscan: do not swap anon pages just because free+file is low
2014-04-08fs/ncpfs/dir.c: fix indenting in ncp_lookup()Dan Carpenter
My static checker suggests adding curly braces here. Probably that was the intent, but actually the code works the same either way. I've just changed the indenting and left the code as-is. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Petr Vandrovec <petr@vandrovec.name> Acked-by: Dave Chiluk <chiluk@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08ncpfs/inode.c: fix mismatch printk formats and argumentsJoe Perches
Conversions to ncp_dbg showed some format/argument mismatches so fix them. Signed-off-by: Joe Perches <joe@perches.com> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08ncpfs: remove now unused PRINTK macroJoe Perches
Uses are gone, remove the macro. Signed-off-by: Joe Perches <joe@perches.com> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08ncpfs: convert PPRINTK to ncp_vdbgJoe Perches
Use a more current logging style. Convert the paranoia debug statement to vdbg. Remove the embedded function names as dynamic_debug can do that. Signed-off-by: Joe Perches <joe@perches.com> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08ncpfs: convert DPRINTK/DDPRINTK to ncp_dbgJoe Perches
Use a more current logging style and enable use of dynamic debugging. Remove embedded function names, dynamic debug can add this instead. Signed-off-by: Joe Perches <joe@perches.com> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08ncpfs: Add pr_fmt and convert printks to pr_<level>Joe Perches
Convert to a more current logging style. Add pr_fmt to prefix with "ncpfs: ". Remove the embedded function names and use "%s: ", __func__ Some previously unprefixed messages now have "ncpfs: " Signed-off-by: Joe Perches <joe@perches.com> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08arch/x86/mm/kmemcheck/kmemcheck.c: use kstrtoint() instead of sscanf()David Rientjes
Kmemcheck should use the preferred interface for parsing command line arguments, kstrto*(), rather than sscanf() itself. Use it appropriately. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Acked-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08lib/percpu_counter.c: fix bad percpu counter state during suspendJens Axboe
I got a bug report yesterday from Laszlo Ersek in which he states that his kvm instance fails to suspend. Laszlo bisected it down to this commit 1cf7e9c68fe8 ("virtio_blk: blk-mq support") where virtio-blk is converted to use the blk-mq infrastructure. After digging a bit, it became clear that the issue was with the queue drain. blk-mq tracks queue usage in a percpu counter, which is incremented on request alloc and decremented when the request is freed. The initial hunt was for an inconsistency in blk-mq, but everything seemed fine. In fact, the counter only returned crazy values when suspend was in progress. When a CPU is unplugged, the percpu counters merges that CPU state with the general state. blk-mq takes care to register a hotcpu notifier with the appropriate priority, so we know it runs after the percpu counter notifier. However, the percpu counter notifier only merges the state when the CPU is fully gone. This leaves a state transition where the CPU going away is no longer in the online mask, yet it still holds private values. This means that in this state, percpu_counter_sum() returns invalid results, and the suspend then hangs waiting for abs(dead-cpu-value) requests to complete which of course will never happen. Fix this by clearing the state earlier, so we never have a case where the CPU isn't in online mask but still holds private state. This bug has been there since forever, I guess we don't have a lot of users where percpu counters needs to be reliable during the suspend cycle. Signed-off-by: Jens Axboe <axboe@fb.com> Reported-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08autofs4: check dev ioctl size before allocatingSasha Levin
There wasn't any check of the size passed from userspace before trying to allocate the memory required. This meant that userspace might request more space than allowed, triggering an OOM. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08mm: vmscan: do not swap anon pages just because free+file is lowJohannes Weiner
Page reclaim force-scans / swaps anonymous pages when file cache drops below the high watermark of a zone in order to prevent what little cache remains from thrashing. However, on bigger machines the high watermark value can be quite large and when the workload is dominated by a static anonymous/shmem set, the file set might just be a small window of used-once cache. In such situations, the VM starts swapping heavily when instead it should be recycling the no longer used cache. This is a longer-standing problem, but it's more likely to trigger after commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy") because file pages can no longer accumulate in a single zone and are dispersed into smaller fractions among the available zones. To resolve this, do not force scan anon when file pages are low but instead rely on the scan/rotation ratios to make the right prediction. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Rafael Aquini <aquini@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: <stable@kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull more networking updates from David Miller: 1) If a VXLAN interface is created with no groups, we can crash on reception of packets. Fix from Mike Rapoport. 2) Missing includes in CPTS driver, from Alexei Starovoitov. 3) Fix string validations in isdnloop driver, from YOSHIFUJI Hideaki and Dan Carpenter. 4) Missing irq.h include in bnxw2x, enic, and qlcnic drivers. From Josh Boyer. 5) AF_PACKET transmit doesn't statistically count TX drops, from Daniel Borkmann. 6) Byte-Queue-Limit enabled drivers aren't handled properly in AF_PACKET transmit path, also from Daniel Borkmann. Same problem exists in pktgen, and Daniel fixed it there too. 7) Fix resource leaks in driver probe error paths of new sxgbe driver, from Francois Romieu. 8) Truesize of SKBs can gradually get more and more corrupted in NAPI packet recycling path, fix from Eric Dumazet. 9) Fix uniprocessor netfilter build, from Florian Westphal. In the longer term we should perhaps try to find a way for ARRAY_SIZE() to work even with zero sized array elements. 10) Fix crash in netfilter conntrack extensions due to mis-estimation of required extension space. From Andrey Vagin. 11) Since we commit table rule updates before trying to copy the counters back to userspace (it's the last action we perform), we really can't signal the user copy with an error as we are beyond the point from which we can unwind everything. This causes all kinds of use after free crashes and other mysterious behavior. From Thomas Graf. 12) Restore previous behvaior of div/mod by zero in BPF filter processing. From Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits) net: sctp: wake up all assocs if sndbuf policy is per socket isdnloop: several buffer overflows netdev: remove potentially harmful checks pktgen: fix xmit test for BQL enabled devices net/at91_ether: avoid NULL pointer dereference tipc: Let tipc_release() return 0 at86rf230: fix MAX_CSMA_RETRIES parameter mac802154: fix duplicate #include headers sxgbe: fix duplicate #include headers net: filter: be more defensive on div/mod by X==0 netfilter: Can't fail and free after table replacement xen-netback: Trivial format string fix net: bcmgenet: Remove unnecessary version.h inclusion net: smc911x: Remove unused local variable bonding: Inactive slaves should keep inactive flag's value netfilter: nf_tables: fix wrong format in request_module() netfilter: nf_tables: set names cannot be larger than 15 bytes netfilter: nf_conntrack: reserve two bytes for nf_ct_ext->len netfilter: Add {ipt,ip6t}_osf aliases for xt_osf netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks ...
2014-04-08Merge tag 'staging-3.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull more staging patches from Greg KH: "Here are some more staging patches for 3.15-rc1. They include a late-submission of a wireless driver that a bunch of people seem to have the hardware for now. As it's stand-alone, it should be fine (now passes the 0-day random build bot tests). There are also some fixes for the unisys drivers, as they were causing havoc on a number of different machines. To resolve all of those issues, we just mark the driver as BROKEN now, and we can fix it up "properly" over time" * tag 'staging-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: rtl8723au: The 8723 only has two paths Staging: unisys: mark drivers as BROKEN Staging: unisys: verify that a control channel exists staging: unisys: Add missing close parentheses in filexfer.c staging: r8723au: Fix build problem when RFKILL is not selected staging: r8723au: Fix randconfig build errors staging: r8723au: Turn on build of new driver staging: r8723au: Additional source patches staging: r8723au: Add source files for new driver - part 4 staging: r8723au: Add source files for new driver - part 3 staging: r8723au: Add source files for new driver - part 2 staging: r8723au: Add source files for new driver - part 1
2014-04-08Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull second set of arm64 updates from Catalin Marinas: "A second pull request for this merging window, mainly with fixes and docs clarification: - Documentation clarification on CPU topology and booting requirements - Additional cache flushing during boot (needed in the presence of external caches or under virtualisation) - DMA range invalidation fix for non cache line aligned buffers - Build failure fix with !COMPAT - Kconfig update for STRICT_DEVMEM" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Fix DMA range invalidation for cache line unaligned buffers arm64: Add missing Kconfig for CONFIG_STRICT_DEVMEM arm64: fix !CONFIG_COMPAT build failures Revert "arm64: virt: ensure visibility of __boot_cpu_mode" arm64: Relax the kernel cache requirements for boot arm64: Update the TCR_EL1 translation granule definitions for 16K pages ARM: topology: Make it clear that all CPUs need to be described
2014-04-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull second set of s390 patches from Martin Schwidefsky: "The second part of Heikos uaccess rework, the page table walker for uaccess is now a thing of the past (yay!) The code change to fix the theoretical TLB flush problem allows us to add a TLB flush optimization for zEC12, this machine has new instructions that allow to do CPU local TLB flushes for single pages and for all pages of a specific address space. Plus the usual bug fixing and some more cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/uaccess: rework uaccess code - fix locking issues s390/mm,tlb: optimize TLB flushing for zEC12 s390/mm,tlb: safeguard against speculative TLB creation s390/irq: Use defines for external interruption codes s390/irq: Add defines for external interruption codes s390/sclp: add timeout for queued requests kvm/s390: also set guest pages back to stable on kexec/kdump lcs: Add missing destroy_timer_on_stack() s390/tape: Add missing destroy_timer_on_stack() s390/tape: Use del_timer_sync() s390/3270: fix crash with multiple reset device requests s390/bitops,atomic: add missing memory barriers s390/zcrypt: add length check for aligned data to avoid overflow in msg-type 6
2014-04-08net: sctp: wake up all assocs if sndbuf policy is per socketDaniel Borkmann
SCTP charges chunks for wmem accounting via skb->truesize in sctp_set_owner_w(), and sctp_wfree() respectively as the reverse operation. If a sender runs out of wmem, it needs to wait via sctp_wait_for_sndbuf(), and gets woken up by a call to __sctp_write_space() mostly via sctp_wfree(). __sctp_write_space() is being called per association. Although we assign sk->sk_write_space() to sctp_write_space(), which is then being done per socket, it is only used if send space is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE is set and therefore not invoked in sock_wfree(). Commit 4c3a5bdae293 ("sctp: Don't charge for data in sndbuf again when transmitting packet") fixed an issue where in case sctp_packet_transmit() manages to queue up more than sndbuf bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is interrupted by a signal. However, a still remaining issue is that if net.sctp.sndbuf_policy=0, that is accounting per socket, and one-to-many sockets are in use, the reclaimed write space from sctp_wfree() is 'unfairly' handed back on the server to the association that is the lucky one to be woken up again via __sctp_write_space(), while the remaining associations are never be woken up again (unless by a signal). The effect disappears with net.sctp.sndbuf_policy=1, that is wmem accounting per association, as it guarantees a fair share of wmem among associations. Therefore, if we have reclaimed memory in case of per socket accounting, wake all related associations to a socket in a fair manner, that is, traverse the socket association list starting from the current neighbour of the association and issue a __sctp_write_space() to everyone until we end up waking ourselves. This guarantees that no association is preferred over another and even if more associations are taken into the one-to-many session, all receivers will get messages from the server and are not stalled forever on high load. This setting still leaves the advantage of per socket accounting in touch as an association can still use up global limits if unused by others. Fixes: 4eb701dfc618 ("[SCTP] Fix SCTP sendbuffer accouting.") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Thomas Graf <tgraf@suug.ch> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-08Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "Highlights: - drm: Generic display port aux features, primary plane support, drm master management fixes, logging cleanups, enforced locking checks (instead of docs), documentation improvements, minor number handling cleanup, pseudofs for shared inodes. - ttm: add ability to allocate from both ends - i915: broadwell features, power domain and runtime pm, per-process address space infrastructure (not enabled) - msm: power management, hdmi audio support - nouveau: ongoing GPU fault recovery, initial maxwell support, random fixes - exynos: refactored driver to clean up a lot of abstraction, DP support moved into drm, LVDS bridge support added, parallel panel support - gma500: SGX MMU support, SGX irq handling, asle irq work fixes - radeon: video engine bringup, ring handling fixes, use dp aux helpers - vmwgfx: add rendernode support" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (849 commits) DRM: armada: fix corruption while loading cursors drm/dp_helper: don't return EPROTO for defers (v2) drm/bridge: export ptn3460_init function drm/exynos: remove MODULE_DEVICE_TABLE definitions ARM: dts: exynos4412-trats2: enable exynos/fimd node ARM: dts: exynos4210-trats: enable exynos/fimd node ARM: dts: exynos4412-trats2: add panel node ARM: dts: exynos4210-trats: add panel node ARM: dts: exynos4: add MIPI DSI Master node drm/panel: add S6E8AA0 driver ARM: dts: exynos4210-universal_c210: add proper panel node drm/panel: add ld9040 driver panel/ld9040: add DT bindings panel/s6e8aa0: add DT bindings drm/exynos: add DSIM driver exynos/dsim: add DT bindings drm/exynos: disallow fbdev initialization if no device is connected drm/mipi_dsi: create dsi devices only for nodes with reg property drm/mipi_dsi: add flags to DSI messages Skip intel_crt_init for Dell XPS 8700 ...
2014-04-08isdnloop: several buffer overflowsDan Carpenter
There are three buffer overflows addressed in this patch. 1) In isdnloop_fake_err() we add an 'E' to a 60 character string and then copy it into a 60 character buffer. I have made the destination buffer 64 characters and I'm changed the sprintf() to a snprintf(). 2) In isdnloop_parse_cmd(), p points to a 6 characters into a 60 character buffer so we have 54 characters. The ->eazlist[] is 11 characters long. I have modified the code to return if the source buffer is too long. 3) In isdnloop_command() the cbuf[] array was 60 characters long but the max length of the string then can be up to 79 characters. I made the cbuf array 80 characters long and changed the sprintf() to snprintf(). I also removed the temporary "dial" buffer and changed it to use "p" directly. Unfortunately, we pass the "cbuf" string from isdnloop_command() to isdnloop_writecmd() which truncates anything over 60 characters to make it fit in card->omsg[]. (It can accept values up to 255 characters so long as there is a '\n' character every 60 characters). For now I have just fixed the memory corruption bug and left the other problems in this driver alone. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-08include/linux/syscalls.h: add sys_renameat2() prototypeHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08arm64: Fix DMA range invalidation for cache line unaligned buffersCatalin Marinas
If the buffer needing cache invalidation for inbound DMA does start or end on a cache line aligned address, we need to use the non-destructive clean&invalidate operation. This issue was introduced by commit 7363590d2c46 (arm64: Implement coherent DMA API based on swiotlb). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Jon Medhurst (Tixy) <tixy@linaro.org>
2014-04-08Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3 improvements, cleanups, reiserfs fix from Jan Kara: "various cleanups for ext2, ext3, udf, isofs, a documentation update for quota, and a fix of a race in reiserfs readdir implementation" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: reiserfs: fix race in readdir ext2: acl: remove unneeded include of linux/capability.h ext3: explicitly remove inode from orphan list after failed direct io fs/isofs/inode.c add __init to init_inodecache() ext3: Speedup WB_SYNC_ALL pass fs/quota/Kconfig: Update filesystems ext3: Update outdated comment before ext3_ordered_writepage() ext3: Update PF_MEMALLOC handling in ext3_write_inode() ext2/3: use prandom_u32() instead of get_random_bytes() ext3: remove an unneeded check in ext3_new_blocks() ext3: remove unneeded check in ext3_ordered_writepage() fs: Mark function as static in ext3/xattr_security.c fs: Mark function as static in ext3/dir.c fs: Mark function as static in ext2/xattr_security.c ext3: Add __init macro to init_inodecache ext2: Add __init macro to init_inodecache udf: Add __init macro to init_inodecache fs: udf: parse_options: blocksize check
2014-04-08Merge branch 'kbuild' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild changes from Michal Marek: - cleanups in the main Makefiles and Documentation/DocBook/Makefile - make O=... directory is automatically created if needed - mrproper/distclean removes the old include/linux/version.h to make life easier when bisecting across the commit that moved the version.h file * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kbuild: docbook: fix the include error when executing "make help" kbuild: create a build directory automatically for out-of-tree build kbuild: remove redundant '.*.cmd' pattern from make distclean kbuild: move "quote" to Kbuild.include to be consistent kbuild: docbook: use $(obj) and $(src) rather than specific path kbuild: unconditionally clobber include/linux/version.h on distclean kbuild: docbook: specify KERNELDOC dependency correctly kbuild: docbook: include cmd files more simply kbuild: specify build_docproc as a phony target
2014-04-08Merge tag 'arc-v3.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC changes from Vineet Gupta: - Support for external initrd from Noam - Fix broken serial console in nsimosci Virtual Platform - Reuse of ENTRY/END assembler macros across hand asm code - Other minor fixes here and there * tag 'arc-v3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: [nsimosci] Unbork console ARC: [nsimosci] Change .dts to use generic 8250 UART ARC: [SMP] General Fixes ARC: Remove unused DT template file ARC: [clockevent] simplify timer ISR ARC: [clockevent] can't be SoC specific ARC: Remove ARC_HAS_COH_RTSC ARC: switch to generic ENTRY/END assembler annotations ARC: support external initrd ARC: add uImage to .gitignore ARC: [arcfpga] Fix __initconst data const-correctness
2014-04-08DRM: armada: fix corruption while loading cursorsRussell King
Loading cursors to the LCD controller's SRAM can be corrupted when the configured pixel clock is relatively slow. This seems to be caused when we write back-to-back to the SRAM registers. There doesn't appear to be any status register we can read to check when an access has completed. Inserting a dummy read between the writes appears to fix the problem. Cc: <stable@vger.kernel.org> # 3.13 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-04-08Merge tag 'stable/for-linus-3.15-tag2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen build fix from David Vrabel: "Fix arm build of drivers/xen/events/ The merge of irq-core-for-linus branch broke it" * tag 'stable/for-linus-3.15-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Xen: do hv callback accounting only on x86
2014-04-07Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge second patch-bomb from Andrew Morton: - the rest of MM - zram updates - zswap updates - exit - procfs - exec - wait - crash dump - lib/idr - rapidio - adfs, affs, bfs, ufs - cris - Kconfig things - initramfs - small amount of IPC material - percpu enhancements - early ioremap support - various other misc things * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (156 commits) MAINTAINERS: update Intel C600 SAS driver maintainers fs/ufs: remove unused ufs_super_block_third pointer fs/ufs: remove unused ufs_super_block_second pointer fs/ufs: remove unused ufs_super_block_first pointer fs/ufs/super.c: add __init to init_inodecache() doc/kernel-parameters.txt: add early_ioremap_debug arm64: add early_ioremap support arm64: initialize pgprot info earlier in boot x86: use generic early_ioremap mm: create generic early_ioremap() support x86/mm: sparse warning fix for early_memremap lglock: map to spinlock when !CONFIG_SMP percpu: add preemption checks to __this_cpu ops vmstat: use raw_cpu_ops to avoid false positives on preemption checks slub: use raw_cpu_inc for incrementing statistics net: replace __this_cpu_inc in route.c with raw_cpu_inc modules: use raw_cpu_write for initialization of per cpu refcount. mm: use raw_cpu ops for determining current NUMA node percpu: add raw_cpu_ops slub: fix leak of 'name' in sysfs_slab_add ...
2014-04-07MAINTAINERS: update Intel C600 SAS driver maintainersLukasz Dorau
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07fs/ufs: remove unused ufs_super_block_third pointerChristian Engelmayer
Pointer 'usb3' to struct ufs_super_block_third acquired via ubh_get_usb_third() is never used in function ufs_read_cylinder_structures(). Thus remove it. Detected by Coverity: CID 139939. Signed-off-by: Christian Engelmayer <cengelma@gmx.at> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07fs/ufs: remove unused ufs_super_block_second pointerChristian Engelmayer
Pointer 'usb2' to struct ufs_super_block_second acquired via ubh_get_usb_second() is never used in function ufs_statfs(). Thus remove it. Detected by Coverity: CID 139940. Signed-off-by: Christian Engelmayer <cengelma@gmx.at> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07fs/ufs: remove unused ufs_super_block_first pointerChristian Engelmayer
Remove occurences of unused pointers to struct ufs_super_block_first that were acquired via ubh_get_usb_first(). Detected by Coverity: CID 139929 - CID 139936, CID 139940. Signed-off-by: Christian Engelmayer <cengelma@gmx.at> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07fs/ufs/super.c: add __init to init_inodecache()Fabian Frederick
init_inodecache is only called by __init init_ufs_fs. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07doc/kernel-parameters.txt: add early_ioremap_debugMark Salter
Add description of early_ioremap_debug kernel parameter. Signed-off-by: Mark Salter <msalter@redhat.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Young <dyoung@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07arm64: add early_ioremap supportMark Salter
Add support for early IO or memory mappings which are needed before the normal ioremap() is usable. This also adds fixmap support for permanent fixed mappings such as that used by the earlyprintk device register region. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07arm64: initialize pgprot info earlier in bootMark Salter
Presently, paging_init() calls init_mem_pgprot() to initialize pgprot values used by macros such as PAGE_KERNEL, PAGE_KERNEL_EXEC, etc. The new fixmap and early_ioremap support also needs to use these macros before paging_init() is called. This patch moves the init_mem_pgprot() call out of paging_init() and into setup_arch() so that pgprot_default gets initialized in time for fixmap and early_ioremap. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07x86: use generic early_ioremapMark Salter
Move x86 over to the generic early ioremap implementation. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Young <dyoung@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07mm: create generic early_ioremap() supportMark Salter
This patch creates a generic implementation of early_ioremap() support based on the existing x86 implementation. early_ioremp() is useful for early boot code which needs to temporarily map I/O or memory regions before normal mapping functions such as ioremap() are available. Some architectures have optional MMU. In the no-MMU case, the remap functions simply return the passed in physical address and the unmap functions do nothing. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07x86/mm: sparse warning fix for early_memremapDave Young
This patch series takes the common bits from the x86 early ioremap implementation and creates a generic implementation which may be used by other architectures. The early ioremap interfaces are intended for situations where boot code needs to make temporary virtual mappings before the normal ioremap interfaces are available. Typically, this means before paging_init() has run. This patch (of 6): There's a lot of sparse warnings for code like below: void *a = early_memremap(phys_addr, size); early_memremap intend to map kernel memory with ioremap facility, the return pointer should be a kernel ram pointer instead of iomem one. For making the function clearer and supressing sparse warnings this patch do below two things: 1. cast to (__force void *) for the return value of early_memremap 2. add early_memunmap function and pass (__force void __iomem *) to iounmap From Boris: "Ingo told me yesterday, it makes sense too. I'd guess we can try it. FWIW, all callers of early_memremap use the memory they get remapped as normal memory so we should be safe" Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07lglock: map to spinlock when !CONFIG_SMPJosh Triplett
When the system has only one CPU, lglock is effectively a spinlock; map it directly to spinlock to eliminate the indirection and duplicate code. In addition to removing overhead, this drops 1.6k of code with a defconfig modified to have !CONFIG_SMP, and 1.1k with a minimal config. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michal Marek <mmarek@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Howells <dhowells@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07percpu: add preemption checks to __this_cpu opsChristoph Lameter
We define a check function in order to avoid trouble with the include files. Then the higher level __this_cpu macros are modified to invoke the preemption check. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Tejun Heo <tj@kernel.org> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07vmstat: use raw_cpu_ops to avoid false positives on preemption checksChristoph Lameter
vm counters are allowed to be racy. Use raw_cpu_ops to avoid the local_irq_disable overhead and to avoid preemption checks which will be added to the __this_cpu operations. [akpm@linux-foundation.org: Add comment. Again.] Signed-off-by: Christoph Lameter <cl@linux.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Dave Chinner <dchinner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07slub: use raw_cpu_inc for incrementing statisticsChristoph Lameter
Statistics are not critical to the operation of the allocation but should also not cause too much overhead. When __this_cpu_inc is altered to check if preemption is disabled this triggers. Use raw_cpu_inc to avoid the checks. Using this_cpu_ops may cause interrupt disable/enable sequences on various arches which may significantly impact allocator performance. [akpm@linux-foundation.org: add comment] Signed-off-by: Christoph Lameter <cl@linux.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07net: replace __this_cpu_inc in route.c with raw_cpu_incChristoph Lameter
The RT_CACHE_STAT_INC macro triggers the new preemption checks for __this_cpu ops. I do not see any other synchronization that would allow the use of a __this_cpu operation here however in commit dbd2915ce87e ("[IPV4]: RT_CACHE_STAT_INC() warning fix") Andrew justifies the use of raw_smp_processor_id() here because "we do not care" about races. In the past we agreed that the price of disabling interrupts here to get consistent counters would be too high. These counters may be inaccurate due to race conditions. The use of __this_cpu op improves the situation already from what commit dbd2915ce87e did since the single instruction emitted on x86 does not allow the race to occur anymore. However, non x86 platforms could still experience a race here. Trace: __this_cpu_add operation in preemptible [00000000] code: avahi-daemon/1193 caller is __this_cpu_preempt_check+0x38/0x60 CPU: 1 PID: 1193 Comm: avahi-daemon Tainted: GF 3.12.0-rc4+ #187 Call Trace: check_preemption_disabled+0xec/0x110 __this_cpu_preempt_check+0x38/0x60 __ip_route_output_key+0x575/0x8c0 ip_route_output_flow+0x27/0x70 udp_sendmsg+0x825/0xa20 inet_sendmsg+0x85/0xc0 sock_sendmsg+0x9c/0xd0 ___sys_sendmsg+0x37c/0x390 __sys_sendmsg+0x49/0x90 SyS_sendmsg+0x12/0x20 tracesys+0xe1/0xe6 Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07modules: use raw_cpu_write for initialization of per cpu refcount.Christoph Lameter
The initialization of a structure is not subject to synchronization. The use of __this_cpu would trigger a false positive with the additional preemption checks for __this_cpu ops. So simply disable the check through the use of raw_cpu ops. Trace: __this_cpu_write operation in preemptible [00000000] code: modprobe/286 caller is __this_cpu_preempt_check+0x38/0x60 CPU: 3 PID: 286 Comm: modprobe Tainted: GF 3.12.0-rc4+ #187 Call Trace: dump_stack+0x4e/0x82 check_preemption_disabled+0xec/0x110 __this_cpu_preempt_check+0x38/0x60 load_module+0xcfd/0x2650 SyS_init_module+0xa6/0xd0 tracesys+0xe1/0xe6 Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07mm: use raw_cpu ops for determining current NUMA nodeChristoph Lameter
With the preempt checking logic for __this_cpu_ops we will get false positives from locations in the code that use numa_node_id. Before the __this_cpu ops where introduced there were no checks for preemption present either. smp_raw_processor_id() was used. See http://www.spinics.net/lists/linux-numa/msg00641.html Therefore we need to use raw_cpu_read here to avoid false postives. Note that this issue has been discussed in prior years. If the process changes nodes after retrieving the current numa node then that is acceptable since most uses of numa_node etc are for optimization and not for correctness. There were suggestions to implement a raw_numa_node_id in order to do preempt checks for numa_node_id as well. But I think we better defer that to another patch since that would mean investigating how numa_node_id() is used throughout the kernel which would increase the scope of this patchset significantly. After all preemption was never checked before when numa_node_id() was used. Some sample traces: __this_cpu_read operation in preemptible [00000000] code: login/1456 caller is __this_cpu_preempt_check+0x2b/0x2d CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3-dirty #185 Call Trace: dump_stack+0x4e/0x82 check_preemption_disabled+0xc5/0xe0 __this_cpu_preempt_check+0x2b/0x2d get_task_policy+0x1d/0x49 get_vma_policy+0x14/0x76 alloc_pages_vma+0x35/0xff handle_mm_fault+0x290/0x73b __do_page_fault+0x3fe/0x44d do_page_fault+0x9/0xc page_fault+0x22/0x30 generic_file_aio_read+0x38e/0x624 do_sync_read+0x54/0x73 vfs_read+0x9d/0x12a SyS_read+0x47/0x7e cstar_dispatch+0x7/0x23 caller is __this_cpu_preempt_check+0x2b/0x2d CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3-dirty #185 Call Trace: dump_stack+0x4e/0x82 check_preemption_disabled+0xc5/0xe0 __this_cpu_preempt_check+0x2b/0x2d alloc_pages_current+0x8f/0xbc __page_cache_alloc+0xb/0xd __do_page_cache_readahead+0xf4/0x219 ra_submit+0x1c/0x20 ondemand_readahead+0x28c/0x2b4 page_cache_sync_readahead+0x38/0x3a generic_file_aio_read+0x261/0x624 do_sync_read+0x54/0x73 vfs_read+0x9d/0x12a SyS_read+0x47/0x7e cstar_dispatch+0x7/0x23 Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Alex Shi <alex.shi@intel.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07percpu: add raw_cpu_opsChristoph Lameter
The kernel has never been audited to ensure that this_cpu operations are consistently used throughout the kernel. The code generated in many places can be improved through the use of this_cpu operations (which uses a segment register for relocation of per cpu offsets instead of performing address calculations). The patch set also addresses various consistency issues in general with the per cpu macros. A. The semantics of __this_cpu_ptr() differs from this_cpu_ptr only because checks are skipped. This is typically shown through a raw_ prefix. So this patch set changes the places where __this_cpu_ptr() is used to raw_cpu_ptr(). B. There has been the long term wish by some that __this_cpu operations would check for preemption. However, there are cases where preemption checks need to be skipped. This patch set adds raw_cpu operations that do not check for preemption and then adds preemption checks to the __this_cpu operations. C. The use of __get_cpu_var is always a reference to a percpu variable that can also be handled via a this_cpu operation. This patch set replaces all uses of __get_cpu_var with this_cpu operations. D. We can then use this_cpu RMW operations in various places replacing sequences of instructions by a single one. E. The use of this_cpu operations throughout will allow other arches than x86 to implement optimized references and RMV operations to work with per cpu local data. F. The use of this_cpu operations opens up the possibility to further optimize code that relies on synchronization through per cpu data. The patch set works in a couple of stages: I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr(). Also converts the existing __this_cpu_xx_# primitive in the x86 code to raw_cpu_xx_#. II. Patch 2-4 use the raw_cpu operations in places that would give us false positives once they are enabled. III. Patch 5 adds preemption checks to __this_cpu operations to allow checking if preemption is properly disabled when these functions are used. IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var with this_cpu_ptr. They do not depend on any changes to the percpu code. No preemption tests are skipped if they are applied. V. Patches 21-46 are conversion patches that use this_cpu operations in various kernel subsystems/drivers or arch code. VI. Patches 47/48 (not included in this series) remove no longer used functions (__this_cpu_ptr and __get_cpu_var). These should only be applied after all the conversion patches have made it and after we have done additional passes through the kernel to ensure that none of the uses of these functions remain. This patch (of 46): The patches following this one will add preemption checks to __this_cpu ops so we need to have an alternative way to use this_cpu operations without preemption checks. raw_cpu_ops will be the basis for all other ops since these will be the operations that do not implement any checks. Primitive operations are renamed by this patch from __this_cpu_xxx to raw_cpu_xxxx. Also change the uses of the x86 percpu primitives in preempt.h. These depend directly on asm/percpu.h (header #include nesting issue). Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Alex Shi <alex.shi@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bryan Wu <cooloney@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: David Daney <david.daney@cavium.com> Cc: David Miller <davem@davemloft.net> Cc: David S. Miller <davem@davemloft.net> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Hedi Berriche <hedi@sgi.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: John Stultz <john.stultz@linaro.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mike Travis <travis@sgi.com> Cc: Neil Brown <neilb@suse.de> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Robert Richter <rric@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07slub: fix leak of 'name' in sysfs_slab_addDave Jones
The failure paths of sysfs_slab_add don't release the allocation of 'name' made by create_unique_id() a few lines above the context of the diff below. Create a common exit path to make it more obvious what needs freeing. [vdavydov@parallels.com: free the name only if !unmergeable] Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07slub: rework sysfs layout for memcg cachesVladimir Davydov
Currently, we try to arrange sysfs entries for memcg caches in the same manner as for global caches. Apart from turning /sys/kernel/slab into a mess when there are a lot of kmem-active memcgs created, it actually does not work properly - we won't create more than one link to a memcg cache in case its parent is merged with another cache. For instance, if A is a root cache merged with another root cache B, we will have the following sysfs setup: X A -> X B -> X where X is some unique id (see create_unique_id()). Now if memcgs M and N start to allocate from cache A (or B, which is the same), we will get: X X:M X:N A -> X B -> X A:M -> X:M A:N -> X:N Since B is an alias for A, we won't get entries B:M and B:N, which is confusing. It is more logical to have entries for memcg caches under the corresponding root cache's sysfs directory. This would allow us to keep sysfs layout clean, and avoid such inconsistencies like one described above. This patch does the trick. It creates a "cgroup" kset in each root cache kobject to keep its children caches there. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Glauber Costa <glommer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07slub: adjust memcg caches when creating cache aliasVladimir Davydov
Otherwise, kzalloc() called from a memcg won't clear the whole object. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Glauber Costa <glommer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07memcg, slab: do not destroy children caches if parent has aliasesVladimir Davydov
Currently we destroy children caches at the very beginning of kmem_cache_destroy(). This is wrong, because the root cache will not necessarily be destroyed in the end - if it has aliases (refcount > 0), kmem_cache_destroy() will simply decrement its refcount and return. In this case, at best we will get a bunch of warnings in dmesg, like this one: kmem_cache_destroy kmalloc-32:0: Slab cache still has objects CPU: 1 PID: 7139 Comm: modprobe Tainted: G B W 3.13.0+ #117 Call Trace: dump_stack+0x49/0x5b kmem_cache_destroy+0xdf/0xf0 kmem_cache_destroy_memcg_children+0x97/0xc0 kmem_cache_destroy+0xf/0xf0 xfs_mru_cache_uninit+0x21/0x30 [xfs] exit_xfs_fs+0x2e/0xc44 [xfs] SyS_delete_module+0x198/0x1f0 system_call_fastpath+0x16/0x1b At worst - if kmem_cache_destroy() will race with an allocation from a memcg cache - the kernel will panic. This patch fixes this by moving children caches destruction after the check if the cache has aliases. Plus, it forbids destroying a root cache if it still has children caches, because each children cache keeps a reference to its parent. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Glauber Costa <glommer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>