summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2012-07-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking changes from David S Miller: 1) Remove the ipv4 routing cache. Now lookups go directly into the FIB trie and use prebuilt routes cached there. No more garbage collection, no more rDOS attacks on the routing cache. Instead we now get predictable and consistent performance, no matter what the pattern of traffic we service. This has been almost 2 years in the making. Special thanks to Julian Anastasov, Eric Dumazet, Steffen Klassert, and others who have helped along the way. I'm sure that with a change of this magnitude there will be some kind of fallout, but such things ought the be simple to fix at this point. Luckily I'm not European so I'll be around all of August to fix things :-) The major stages of this work here are each fronted by a forced merge commit whose commit message contains a top-level description of the motivations and implementation issues. 2) Pre-demux of established ipv4 TCP sockets, saves a route demux on input. 3) TCP SYN/ACK performance tweaks from Eric Dumazet. 4) Add namespace support for netfilter L4 conntrack helpers, from Gao Feng. 5) Add config mechanism for Energy Efficient Ethernet to ethtool, from Yuval Mintz. 6) Remove quadratic behavior from /proc/net/unix, from Eric Dumazet. 7) Support for connection tracker helpers in userspace, from Pablo Neira Ayuso. 8) Allow userspace driven TX load balancing functions in TEAM driver, from Jiri Pirko. 9) Kill off NLMSG_PUT and RTA_PUT macros, more gross stuff with embedded gotos. 10) TCP Small Queues, essentially minimize the amount of TCP data queued up in the packet scheduler layer. Whereas the existing BQL (Byte Queue Limits) limits the pkt_sched --> netdevice queuing levels, this controls the TCP --> pkt_sched queueing levels. From Eric Dumazet. 11) Reduce the number of get_page/put_page ops done on SKB fragments, from Alexander Duyck. 12) Implement protection against blind resets in TCP (RFC 5961), from Eric Dumazet. 13) Support the client side of TCP Fast Open, basically the ability to send data in the SYN exchange, from Yuchung Cheng. Basically, the sender queues up data with a sendmsg() call using MSG_FASTOPEN, then they do the connect() which emits the queued up fastopen data. 14) Avoid all the problems we get into in TCP when timers or PMTU events hit a locked socket. The TCP Small Queues changes added a tcp_release_cb() that allows us to queue work up to the release_sock() caller, and that's what we use here too. From Eric Dumazet. 15) Zero copy on TX support for TUN driver, from Michael S. Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1870 commits) genetlink: define lockdep_genl_is_held() when CONFIG_LOCKDEP r8169: revert "add byte queue limit support". ipv4: Change rt->rt_iif encoding. net: Make skb->skb_iif always track skb->dev ipv4: Prepare for change of rt->rt_iif encoding. ipv4: Remove all RTCF_DIRECTSRC handliing. ipv4: Really ignore ICMP address requests/replies. decnet: Don't set RTCF_DIRECTSRC. net/ipv4/ip_vti.c: Fix __rcu warnings detected by sparse. ipv4: Remove redundant assignment rds: set correct msg_namelen openvswitch: potential NULL deref in sample() tcp: dont drop MTU reduction indications bnx2x: Add new 57840 device IDs tcp: avoid oops in tcp_metrics and reset tcpm_stamp niu: Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return value niu: Fix to check for dma mapping errors. net: Fix references to out-of-scope variables in put_cmsg_compat() net: ethernet: davinci_emac: add pm_runtime support net: ethernet: davinci_emac: Remove unnecessary #include ...
2012-07-24Merge tag 'please-pull-misc-3.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull misc Itanium fixes from Tony Luck. * tag 'please-pull-misc-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: debug: Do not permit CONFIG_DEBUG_STACK_USAGE=y on IA64 or PARISC [IA64] Port OOM changes to ia64_do_page_fault
2012-07-24Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Nothing groundbreaking for this kernel, just cleanups and fixes, and a couple of Smack enhancements." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits) Smack: Maintainer Record Smack: don't show empty rules when /smack/load or /smack/load2 is read Smack: user access check bounds Smack: onlycap limits on CAP_MAC_ADMIN Smack: fix smack_new_inode bogosities ima: audit is compiled only when enabled ima: ima_initialized is set only if successful ima: add policy for pseudo fs ima: remove unused cleanup functions ima: free securityfs violations file ima: use full pathnames in measurement list security: Fix nommu build. samples: seccomp: add .gitignore for untracked executables tpm: check the chip reference before using it TPM: fix memleak when register hardware fails TPM: chip disabled state erronously being reported as error MAINTAINERS: TPM maintainers' contacts update Merge branches 'next-queue' and 'next' into next Remove unused code from MPI library Revert "crypto: GnuPG based MPI lib - additional sources (part 4)" ...
2012-07-22Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: "Quoting from Paul, the major features of this series are: 1. Preventing latency spikes of more than 200 microseconds for kernels built with NR_CPUS=4096, which is reportedly becoming the default for some distros. This is a first step, as it does not help with systems that actually -have- 4096 CPUs (work on this case is in progress, but is not yet ready for mainline). This category also includes improving concurrency of rcu_barrier(), placed here due to conflicts. Posted to LKML at: https://lkml.org/lkml/2012/6/22/381 Note that patches 18-22 of that series have been defered to 3.7, as they have not yet proven themselves to be mainline-ready (and yes, these are the ones intended to get rid of RCU's latency spikes for systems that actually have 4096 CPUs). 2. Updates to documentation and rcutorture fixes, the latter category including improvements to rcu_barrier() testing. Posted to LKML at http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/04094.html. 3. Miscellaneous fixes posted to LKML at: https://lkml.org/lkml/2012/6/22/500 with the exception of the last commit, which was posted here: http://www.gossamer-threads.com/lists/linux/kernel/1561830 4. RCU_FAST_NO_HZ fixes and improvements. Posted to LKML at: http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/00006.html http://www.gossamer-threads.com/lists/linux/kernel/1561833 The first four patches of the first series went into 3.5 to fix a regression. 5. Code-style fixes. These were posted to LKML at http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01180.html http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01181.html" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits) rcu: Fix broken strings in RCU's source code. rcu: Fix code-style issues involving "else" rcu: Introduce check for callback list/count mismatch rcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter rcu: Fix qlen_lazy breakage rcu: Round FAST_NO_HZ lazy timeout to nearest second rcu: The rcu_needs_cpu() function is not a quiescent state rcu: Dump only the current CPU's buffers for idle-entry/exit warnings rcu: Add check for CPUs going offline with callbacks queued rcu: Disable preemption in rcu_blocking_is_gp() rcu: Prevent uninitialized string in RCU CPU stall info rcu: Fix rcu_is_cpu_idle() #ifdef in TINY_RCU rcu: Split RCU core processing out of __call_rcu() rcu: Prevent __call_rcu() from invoking RCU core on offline CPUs rcu: Make __call_rcu() handle invocation from idle rcu: Remove function versions of __kfree_rcu and __is_kfree_rcu_offset rcu: Consolidate tree/tiny __rcu_read_{,un}lock() implementations rcu: Remove return value from rcu_assign_pointer() key: Remove extraneous parentheses from rcu_assign_keypointer() rcu: Remove return value from RCU_INIT_POINTER() ...
2012-07-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
2012-07-19debug: Do not permit CONFIG_DEBUG_STACK_USAGE=y on IA64 or PARISCTony Luck
The stack_not_used() function in <linux/sched.h> assumes that stacks grow downwards. This is not true on IA64 or PARISC, so this function would walk off in the wrong direction and into the weeds. Found on IA64 because of a compilation failure with recursive dependencies on IA64_TASKSIZE and IA64_THREAD_INFO_SIZE. Fixing the code is possible, but should be combined with other infrastructure additions to set up the "canary" at the end of the stack. Reported-by: Fengguang Wu <fengguang.wu@intel.com> (failed allmodconfig build) Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-07-09Merge tag 'iommu-fixes-v3.5-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "The patches fix several issues in the AMD IOMMU driver, the NVidia SMMU driver, and the DMA debug code. The most important fix for the AMD IOMMU solves a problem with SR-IOV devices where virtual functions did not work with IOMMU enabled. The NVidia SMMU patch fixes a possible sleep while spin-lock situation (queued the small fix for v3.5, a better but more intrusive fix is coming for v3.6). The DMA debug patches fix a possible data corruption issue due to bool vs u32 usage." * tag 'iommu-fixes-v3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: fix type bug in flush code dma-debug: debugfs_create_bool() takes a u32 pointer iommu/tegra: smmu: Fix unsleepable memory allocation iommu/amd: Initialize dma_ops for hotplug and sriov devices iommu/amd: Fix missing iommu_shutdown initialization in passthrough mode
2012-07-06rcu: Fix broken strings in RCU's source code.Paul E. McKenney
Although the C language allows you to break strings across lines, doing this makes it hard for people to find the Linux kernel code corresponding to a given console message. This commit therefore fixes broken strings throughout RCU's source code. Suggested-by: Josh Triplett <josh@joshtriplett.org> Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02dma-debug: debugfs_create_bool() takes a u32 pointerDan Carpenter
Even though it has "bool" in the name, you have pass a u32 pointer to debugfs_create_bool(). Otherwise you get memory corruption in write_file_bool(). Fortunately in this case the corruption happens in an alignment hole between variables so it doesn't cause any problems. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-29netlink: add netlink_kernel_cfg parameter to netlink_kernel_createPablo Neira Ayuso
This patch adds the following structure: struct netlink_kernel_cfg { unsigned int groups; void (*input)(struct sk_buff *skb); struct mutex *cb_mutex; }; That can be passed to netlink_kernel_create to set optional configurations for netlink kernel sockets. I've populated this structure by looking for NULL and zero parameters at the existing code. The remaining parameters that always need to be set are still left in the original interface. That includes optional parameters for the netlink socket creation. This allows easy extensibility of this interface in the future. This patch also adapts all callers to use this new interface. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-20fault-inject: avoid call to random32() if fault injection is disabledAnton Blanchard
After enabling CONFIG_FAILSLAB I noticed random32 in profiles even if slub fault injection wasn't enabled at runtime. should_fail forces a comparison against random32() even if probability is 0: if (attr->probability <= random32() % 100) return false; Add a check up front for probability == 0 and avoid all of the more complicated checks. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-15Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core updates (RCU and locking) from Ingo Molnar: "Most of the diffstat comes from the RCU slow boot regression fixes, but there's also a debuggability improvements/fixes." * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: memblock: Document memblock_is_region_{memory,reserved}() rcu: Precompute RCU_FAST_NO_HZ timer offsets rcu: Move RCU_FAST_NO_HZ per-CPU variables to rcu_dynticks structure rcu: Update RCU_FAST_NO_HZ tracing for lazy callbacks rcu: RCU_FAST_NO_HZ detection of callback adoption spinlock: Indicate that a lockup is only suspected kdump: Execute kmsg_dump(KMSG_DUMP_PANIC) after smp_send_stop() panic: Make panic_on_oops configurable
2012-06-10Merge commit 'v3.5-rc2' into nextJames Morris
2012-06-07btree: catch NULL value before it does harmJoern Engel
Storing NULL values in the btree is illegal and can lead to memory corruption and possible other fun as well. Catch it on insert, instead of waiting for the inevitable. Signed-off-by: Joern Engel <joern@logfs.org> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07btree: fix tree corruption in btree_get_prev()Roland Dreier
The memory the parameter __key points to is used as an iterator in btree_get_prev(), so if we save off a bkey() pointer in retry_key and then assign that to __key, we'll end up corrupting the btree internals when we do eg longcpy(__key, bkey(geo, node, i), geo->keylen); to return the key value. What we should do instead is use longcpy() to copy the key value that retry_key points to __key. This can cause a btree to get corrupted by seemingly read-only operations such as btree_for_each_safe. [akpm@linux-foundation.org: avoid the double longcpy()] Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-by: Joern Engel <joern@logfs.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-06Merge tag 'md-3.5-fixes' of git://neil.brown.name/mdLinus Torvalds
Pull two md fixes from NeilBrown: "One sparse-warning fix, one bugfix for 3.4-stable" * tag 'md-3.5-fixes' of git://neil.brown.name/md: md: raid1/raid10: fix problem with merge_bvec_fn lib/raid6: fix sparse warnings in recovery functions
2012-06-06spinlock: Indicate that a lockup is only suspectedChristian Borntraeger
On an over-committed KVM system we got a: "BUG: spinlock lockup on CPU#2, swapper/2/0" message on the heavily contended virtio blk spinlock. While we might want to reconsider the locking of virtio-blk (lock is held while switching to the host) this patch tries to make the message clearer: the lockup is only suspected. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338283124-7063-1-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06Merge branch 'core/debug' into core/urgentIngo Molnar
Merge two debugging patchlets that were waiting for preparatory commits to hit upstream. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-05radix-tree: fix contiguous iteratorKonstantin Khlebnikov
This patch fixes bug in macro radix_tree_for_each_contig(). If radix_tree_next_slot() sees NULL in next slot it returns NULL, but following radix_tree_next_chunk() switches iterating into next chunk. As result iterating becomes non-contiguous and breaks vfs "splice" and all its users. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reported-and-bisected-by: Hans de Bruin <jmdebruin@xmsnet.nl> Reported-and-bisected-by: Ondrej Zary <linux@rainbow-software.org> Reported-bisected-and-tested-by: Toralf Förster <toralf.foerster@gmx.de> Link: https://lkml.org/lkml/2012/6/5/64 Cc: stable <stable@vger.kernel.org> # 3.4.x Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking updates from David Miller: 1) Make syn floods consume significantly less resources by a) Not pre-COW'ing routing metrics for SYN/ACKs b) Mirroring the device queue mapping of the SYN for the SYN/ACK reply. Both from Eric Dumazet. 2) Fix calculation errors in Byte Queue Limiting, from Hiroaki SHIMODA. 3) Validate the length requested when building a paged SKB for a socket, so we don't overrun the page vector accidently. From Jason Wang. 4) When netlabel is disabled, we abort all IP option processing when we see a CIPSO option. This isn't the right thing to do, we should simply skip over it and continue processing the remaining options (if any). Fix from Paul Moore. 5) SRIOV fixes for the mellanox driver from Jack orgenstein and Marcel Apfelbaum. 6) 8139cp enables the receiver before the ring address is properly programmed, which potentially lets the device crap over random memory. Fix from Jason Wang. 7) e1000/e1000e fixes for i217 RST handling, and an improper buffer address reference in jumbo RX frame processing from Bruce Allan and Sebastian Andrzej Siewior, respectively. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: fec_mpc52xx: fix timestamp filtering mcs7830: Implement link state detection e1000e: fix Rapid Start Technology support for i217 e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats() r8169: call netif_napi_del at errpaths and at driver unload tcp: reflect SYN queue_mapping into SYNACK packets tcp: do not create inetpeer on SYNACK message 8139cp/8139too: terminate the eeprom access with the right opmode 8139cp: set ring address before enabling receiver cipso: handle CIPSO options correctly when NetLabel is disabled net: sock: validate data_len before allocating skb in sock_alloc_send_pskb() bql: Avoid possible inconsistent calculation. bql: Avoid unneeded limit decrement. bql: Fix POSDIFF() to integer overflow aware. net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap net/mlx4_core: Fixes for VF / Guest startup flow net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event net/mlx4_core: Fix number of EQs used in ICM initialisation net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int
2012-06-01Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge misc patches from Andrew Morton: - the "misc" tree - stuff from all over the map - checkpatch updates - fatfs - kmod changes - procfs - cpumask - UML - kexec - mqueue - rapidio - pidns - some checkpoint-restore feature work. Reluctantly. Most of it delayed a release. I'm still rather worried that we don't have a clear roadmap to completion for this work. * emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches) kconfig: update compression algorithm info c/r: prctl: add ability to set new mm_struct::exe_file c/r: prctl: extend PR_SET_MM to set up more mm_struct entries c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat syscalls, x86: add __NR_kcmp syscall fs, proc: introduce /proc/<pid>/task/<tid>/children entry sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() eventfd: change int to __u64 in eventfd_signal() fs/nls: add Apple NLS pidns: make killed children autoreap pidns: use task_active_pid_ns in do_notify_parent rapidio/tsi721: add DMA engine support rapidio: add DMA engine support for RIO data transfers ipc/mqueue: add rbtree node caching support tools/selftests: add mq_perf_tests ipc/mqueue: strengthen checks on mqueue creation ipc/mqueue: correct mq_attr_ok test ipc/mqueue: improve performance of send/recv selftests: add mq_open_tests ...
2012-06-01vsprintf: further optimize decimal conversionDenys Vlasenko
Previous code was using optimizations which were developed to work well even on narrow-word CPUs (by today's standards). But Linux runs only on 32-bit and wider CPUs. We can use that. First: using 32x32->64 multiply and trivial 32-bit shift, we can correctly divide by 10 much larger numbers, and thus we can print groups of 9 digits instead of groups of 5 digits. Next: there are two algorithms to print larger numbers. One is generic: divide by 1000000000 and repeatedly print groups of (up to) 9 digits. It's conceptually simple, but requires an (unsigned long long) / 1000000000 division. Second algorithm splits 64-bit unsigned long long into 16-bit chunks, manipulates them cleverly and generates groups of 4 decimal digits. It so happens that it does NOT require long long division. If long is > 32 bits, division of 64-bit values is relatively easy, and we will use the first algorithm. If long long is > 64 bits (strange architecture with VERY large long long), second algorithm can't be used, and we again use the first one. Else (if long is 32 bits and long long is 64 bits) we use second one. And third: there is a simple optimization which takes fast path not only for zero as was done before, but for all one-digit numbers. In all tested cases new code is faster than old one, in many cases by 30%, in few cases by more than 50% (for example, on x86-32, conversion of 12345678). Code growth is ~0 in 32-bit case and ~130 bytes in 64-bit case. This patch is based upon an original from Michal Nazarewicz. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Cc: Douglas W Jones <jones@cs.uiowa.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-01vsprintf: correctly handle width when '#' flag used in %#p formatGrant Likely
The '%p' output of the kernel's vsprintf() uses spec.field_width to determine how many digits to output based on 2 * sizeof(void*) so that all digits of a pointer are shown. ie. a pointer will be output as "001A2B3C" instead of "1A2B3C". However, if the '#' flag is used in the format (%#p), then the code doesn't take into account the width of the '0x' prefix and will end up outputing "0x1A2B3C" instead of "0x001A2B3C". This patch reworks the "pointer()" format hook to include 2 characters for the '0x' prefix if the '#' flag is included. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31bql: Avoid possible inconsistent calculation.Hiroaki SHIMODA
dql->num_queued could change while processing dql_completed(). To provide consistent calculation, added an on stack variable. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-31bql: Avoid unneeded limit decrement.Hiroaki SHIMODA
When below pattern is observed, TIME dql_queued() dql_completed() | a) initial state | | b) X bytes queued V c) Y bytes queued d) X bytes completed e) Z bytes queued f) Y bytes completed a) dql->limit has already some value and there is no in-flight packet. b) X bytes queued. c) Y bytes queued and excess limit. d) X bytes completed and dql->prev_ovlimit is set and also dql->prev_num_queued is set Y. e) Z bytes queued. f) Y bytes completed. inprogress and prev_inprogress are true. At f), according to the comment, all_prev_completed becomes true and limit should be increased. But POSDIFF() ignores (completed == dql->prev_num_queued) case, so limit is decreased. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-31bql: Fix POSDIFF() to integer overflow aware.Hiroaki SHIMODA
POSDIFF() fails to take into account integer overflow case. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-30Merge tag 'iommu-updates-v3.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "Not much stuff this time. The only change to the IOMMU core code is the addition of a handle to the fault handling code. A few updates to the AMD IOMMU driver to work around new errata. The other patches are mostly fixes and enhancements to the existing ARM IOMMU drivers and documentation updates. A new IOMMU driver for the Exynos platform was also underway but got merged via the Samsung tree and is not part of this tree." * tag 'iommu-updates-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: Documentation: kernel-parameters.txt Add amd_iommu_dump iommu/core: pass a user-provided token to fault handlers iommu/tegra: gart: Fix register offset correctly iommu: OMAP: device detach on domain destroy iommu: tegra/gart: Add device tree support iommu: tegra/gart: use correct gart_device iommu/tegra: smmu: Print device name correctly iommu/amd: Add workaround for event log erratum iommu/amd: Check for the right TLP prefix bit dma-debug: release free_entries_lock before saving stack trace
2012-05-30Merge branches 'iommu/fixes', 'dma-debug', 'arm/omap', 'arm/tegra', 'core' ↵Joerg Roedel
and 'x86/amd' into next
2012-05-29lib/vsprintf.c: "%#o",0 becomes '0' instead of '00'Pierre Carrier
number()'s behaviour is slighly changed: 0 becomes "0" instead of "00" when using the flag SPECIAL and base 8. Before: Number\Format %o %#o %x %#x 0 0 00 0 0x0 1 1 01 1 0x1 16 20 020 10 0x10 After: Number\Format %o %#o %x %#x 0 0 0 0 0x0 1 1 01 1 0x1 16 20 020 10 0x10 Signed-off-by: Pierre Carrier <pierre@spotify.com> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29radix-tree: fix preload vector sizeNick Piggin
We are not preallocating a sufficient number of nodes. Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29spinlock_debug: print kallsyms name for lockStephen Boyd
When a spinlock warning is printed we usually get BUG: spinlock bad magic on CPU#0, modprobe/111 lock: 0xdff09f38, .magic: 00000000, .owner: /0, .owner_cpu: 0 but it's nicer to print the symbol for the lock if we have it so that we can avoid 'grep dff09f38 /proc/kallsyms' to find out which lock it was. Use kallsyms to print the symbol name so we get something a bit easier to read BUG: spinlock bad magic on CPU#0, modprobe/112 lock: test_lock, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 If the lock is not in kallsyms %ps will fall back to printing the address directly. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29vsprintf: fix %ps on non symbols when using kallsymsStephen Boyd
Using %ps in a printk format will sometimes fail silently and print the empty string if the address passed in does not match a symbol that kallsyms knows about. But using %pS will fall back to printing the full address if kallsyms can't find the symbol. Make %ps act the same as %pS by falling back to printing the address. While we're here also make %ps print the module that a symbol comes from so that it matches what %pS already does. Take this simple function for example (in a module): static void test_printk(void) { int test; pr_info("with pS: %pS\n", &test); pr_info("with ps: %ps\n", &test); } Before this patch: with pS: 0xdff7df44 with ps: After this patch: with pS: 0xdff7df44 with ps: 0xdff7df44 Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29lib/bitmap.c: fix documentation for scnprintf() functionsAndrew Morton
The code comments for bscnl_emit() and bitmap_scnlistprintf() are describing snprintf() return semantics, but these functions use scnprintf() return semantics. Fix that, and document the bitmap_scnprintf() return value as well. Cc: Ryota Ozaki <ozaki.ryota@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29lib/string_helpers.c: make arrays staticAndrew Morton
Moving these arrays into static storage shrinks the kernel a bit: text data bss dec hex filename 723 112 64 899 383 lib/string_helpers.o 516 272 64 852 354 lib/string_helpers.o Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29lib/test-kstrtox.c: mark const init data with __initconst instead of __initdataUwe Kleine-König
As long as there is no other non-const variable marked __initdata in the same compilation unit it doesn't hurt. If there were one however compilation would fail with error: $variablename causes a section type conflict because a section containing const variables is marked read only and so cannot contain non-const variables. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29list_debug: WARN for adding something already in the listChris Metcalf
We were bitten by this at one point and added an additional sanity test for DEBUG_LIST. You can't validly add a list_head to a list where either prev or next is the same as the thing you're adding. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29swiotlb: print physical addresses consistently with other parts of kernelBjorn Helgaas
Print swiotlb info in a style consistent with the %pR style used elsewhere in the kernel. For example: -Placing 64MB software IO TLB between ffff88007a662000 - ffff88007e662000 -software IO TLB at phys 0x7a662000 - 0x7e662000 +software IO TLB [mem 0x7a662000-0x7e661fff] (64MB) mapped at [ffff88007a662000-ffff88007e661fff] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-28lib/raid6: fix sparse warnings in recovery functionsJim Kukunas
Make the recovery functions static to fix the following sparse warnings: lib/raid6/recov.c:25:6: warning: symbol 'raid6_2data_recov_intx1' was not declared. Should it be static? lib/raid6/recov.c:69:6: warning: symbol 'raid6_datap_recov_intx1' was not declared. Should it be static? lib/raid6/recov_ssse3.c:22:6: warning: symbol 'raid6_2data_recov_ssse3' was not declared. Should it be static? lib/raid6/recov_ssse3.c:197:6: warning: symbol 'raid6_datap_recov_ssse3' was not declared. Should it be static? Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-28lib: Fix generic strnlen_user for 32-bit big-endian machinesPaul Mackerras
The aligned_byte_mask() definition is wrong for 32-bit big-endian machines: the "7-(n)" part of the definition assumes a long is 8 bytes. This fixes it by using BITS_PER_LONG - 8 instead of 8*7. Tested on 32-bit and 64-bit PowerPC. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-26Merge branch 'generic-string-functions'Linus Torvalds
This makes <asm/word-at-a-time.h> actually live up to its promise of allowing architectures to help tune the string functions that do their work a word at a time. David had already taken the x86 strncpy_from_user() function, modified it to work on sparc, and then done the extra work to make it generically useful. This then expands on that work by making x86 use that generic version, completing the circle. But more importantly, it fixes up the word-at-a-time interfaces so that it's now easy to also support things like strnlen_user(), and pretty much most random string functions. David reports that it all works fine on sparc, and Jonas Bonn reported that an earlier version of this worked on OpenRISC too. It's pretty easy for architectures to add support for this and just replace their private versions with the generic code. * generic-string-functions: sparc: use the new generic strnlen_user() function x86: use the new generic strnlen_user() function lib: add generic strnlen_user() function word-at-a-time: make the interfaces truly generic x86: use generic strncpy_from_user routine
2012-05-26Merge tag 'stmp-dev' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull arm-soc stmp-dev library code from Olof Johansson: "A number of devices are using a common register layout, this adds support code for it in lib/stmp_device.c so we do not need to duplicate it in each driver." Fix up trivial conflicts in drivers/i2c/busses/i2c-mxs.c and lib/Makefile * tag 'stmp-dev' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: i2c: mxs: use global reset function lib: add support for stmp-style devices
2012-05-26lib: add generic strnlen_user() functionLinus Torvalds
This adds a new generic optimized strnlen_user() function that uses the <asm/word-at-a-time.h> infrastructure to portably do efficient string handling. In many ways, strnlen is much simpler than strncpy, and in particular we can always pre-align the words we load from memory. That means that all the worries about alignment etc are a non-issue, so this one can easily be used on any architecture. You obviously do have to do the appropriate word-at-a-time.h macros. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-26word-at-a-time: make the interfaces truly genericLinus Torvalds
This changes the interfaces in <asm/word-at-a-time.h> to be a bit more complicated, but a lot more generic. In particular, it allows us to really do the operations efficiently on both little-endian and big-endian machines, pretty much regardless of machine details. For example, if you can rely on a fast population count instruction on your architecture, this will allow you to make your optimized <asm/word-at-a-time.h> file with that. NOTE! The "generic" version in include/asm-generic/word-at-a-time.h is not truly generic, it actually only works on big-endian. Why? Because on little-endian the generic algorithms are wasteful, since you can inevitably do better. The x86 implementation is an example of that. (The only truly non-generic part of the asm-generic implementation is the "find_zero()" function, and you could make a little-endian version of it. And if the Kbuild infrastructure allowed us to pick a particular header file, that would be lovely) The <asm/word-at-a-time.h> functions are as follows: - WORD_AT_A_TIME_CONSTANTS: specific constants that the algorithm uses. - has_zero(): take a word, and determine if it has a zero byte in it. It gets the word, the pointer to the constant pool, and a pointer to an intermediate "data" field it can set. This is the "quick-and-dirty" zero tester: it's what is run inside the hot loops. - "prep_zero_mask()": take the word, the data that has_zero() produced, and the constant pool, and generate an *exact* mask of which byte had the first zero. This is run directly *outside* the loop, and allows the "has_zero()" function to answer the "is there a zero byte" question without necessarily getting exactly *which* byte is the first one to contain a zero. If you do multiple byte lookups concurrently (eg "hash_name()", which looks for both NUL and '/' bytes), after you've done the prep_zero_mask() phase, the result of those can be or'ed together to get the "either or" case. - The result from "prep_zero_mask()" can then be fed into "find_zero()" (to find the byte offset of the first byte that was zero) or into "zero_bytemask()" (to find the bytemask of the bytes preceding the zero byte). The existence of zero_bytemask() is optional, and is not necessary for the normal string routines. But dentry name hashing needs it, so if you enable DENTRY_WORD_AT_A_TIME you need to expose it. This changes the generic strncpy_from_user() function and the dentry hashing functions to use these modified word-at-a-time interfaces. This gets us back to the optimized state of the x86 strncpy that we lost in the previous commit when moving over to the generic version. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-26Remove unused code from MPI libraryDmitry Kasatkin
MPI library is used by RSA verification implementation. Few files contains functions which are never called. James Morris has asked to remove all of them. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Requested-by: James Morris <james.l.morris@oracle.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-26Revert "crypto: GnuPG based MPI lib - additional sources (part 4)"Dmitry Kasatkin
This reverts commit 7e8dec918ef8e0f68b4937c3c50fa57002077a4d. RSA verification implementation does not use this code. James Morris has asked to remove that. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Requested-by: James Morris <james.l.morris@oracle.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc changes from David S. Miller: "This has the generic strncpy_from_user() implementation architectures can now use, which we've been developing on linux-arch over the past few days. For good measure I ran both a 32-bit and a 64-bit glibc testsuite run, and the latter of which pointed out an adjustment I needed to make to sparc's user_addr_max() definition. Linus, you were right, STACK_TOP was not the right thing to use, even on sparc itself :-) From Sam Ravnborg, we have a conversion of sparc32 over to the common alloc_thread_info_node(), since the aspect which originally blocked our doing so (sun4c) has been removed." Fix up trivial arch/sparc/Kconfig and lib/Makefile conflicts. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Fix user_addr_max() definition. lib: Sparc's strncpy_from_user is generic enough, move under lib/ kernel: Move REPEAT_BYTE definition into linux/kernel.h sparc: Increase portability of strncpy_from_user() implementation. sparc: Optimize strncpy_from_user() zero byte search. sparc: Add full proper error handling to strncpy_from_user(). sparc32: use the common implementation of alloc_thread_info_node()
2012-05-24lib: Sparc's strncpy_from_user is generic enough, move under lib/David S. Miller
To use this, an architecture simply needs to: 1) Provide a user_addr_max() implementation via asm/uaccess.h 2) Add "select GENERIC_STRNCPY_FROM_USER" to their arch Kcnfig 3) Remove the existing strncpy_from_user() implementation and symbol exports their architecture had. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: David Howells <dhowells@redhat.com>
2012-05-24Merge tag 'md-3.5' of git://neil.brown.name/mdLinus Torvalds
Pull md updates from NeilBrown: "It's been a busy cycle for md - lots of fun stuff here.. if you like this kind of thing :-) Main features: - RAID10 arrays can be reshaped - adding and removing devices and changing chunks (not 'far' array though) - allow RAID5 arrays to be reshaped with a backup file (not tested yet, but the priciple works fine for RAID10). - arrays can be reshaped while a bitmap is present - you no longer need to remove it first - SSSE3 support for RAID6 syndrome calculations and of course a number of minor fixes etc." * tag 'md-3.5' of git://neil.brown.name/md: (56 commits) md/bitmap: record the space available for the bitmap in the superblock. md/raid10: Remove extras after reshape to smaller number of devices. md/raid5: improve removal of extra devices after reshape. md: check the return of mddev_find() MD RAID1: Further conditionalize 'fullsync' DM RAID: Use md_error() in place of simply setting Faulty bit DM RAID: Record and handle missing devices DM RAID: Set recovery flags on resume md/raid5: Allow reshape while a bitmap is present. md/raid10: resize bitmap when required during reshape. md: allow array to be resized while bitmap is present. md/bitmap: make sure reshape request are reflected in superblock. md/bitmap: add bitmap_resize function to allow bitmap resizing. md/bitmap: use DIV_ROUND_UP instead of open-code md/bitmap: create a 'struct bitmap_counts' substructure of 'struct bitmap' md/bitmap: make bitmap bitops atomic. md/bitmap: make _page_attr bitops atomic. md/bitmap: merge bitmap_file_unmap and bitmap_file_put. md/bitmap: remove async freeing of bitmap file. md/bitmap: convert some spin_lock_irqsave to spin_lock_irq ...
2012-05-23Merge branches 'x86-asm-for-linus', 'x86-cleanups-for-linus', ↵Linus Torvalds
'x86-cpu-for-linus', 'x86-debug-for-linus' and 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull initial trivial x86 stuff from Ingo Molnar. Various random cleanups and trivial fixes. * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86-64: Eliminate dead ia32 syscall handlers * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pci-calgary_64.c: Remove obsoleted simple_strtoul() usage x86: Don't continue booting if we can't load the specified initrd x86: kernel/dumpstack.c simple_strtoul cleanup x86: kernel/check.c simple_strtoul cleanup debug: Add CONFIG_READABLE_ASM x86: spinlock.h: Remove REG_PTR_MODE * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cache_info: Fix setup of l2/l3 ids * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Avoid double stack traces with show_regs() * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: microcode_core.c simple_strtoul cleanup
2012-05-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "As usual, it's mostly typo fixes, redundant code elimination and some documentation updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (57 commits) edac, mips: don't change code that has been removed in edac/mips tree xtensa: Change mail addresses of Hannes Weiner and Oskar Schirmer lib: Change mail address of Oskar Schirmer net: Change mail address of Oskar Schirmer arm/m68k: Change mail address of Sebastian Hess i2c: Change mail address of Oskar Schirmer net: Fix tcp_build_and_update_options comment in struct tcp_sock atomic64_32.h: fix parameter naming mismatch Kconfig: replace "--- help ---" with "---help---" c2port: fix bogus Kconfig "default no" edac: Fix spelling errors. qla1280: Remove redundant NULL check before release_firmware() call remoteproc: remove redundant NULL check before release_firmware() qla2xxx: Remove redundant NULL check before release_firmware() call. aic94xx: Get rid of redundant NULL check before release_firmware() call tehuti: delete redundant NULL check before release_firmware() qlogic: get rid of a redundant test for NULL before call to release_firmware() bna: remove redundant NULL test before release_firmware() tg3: remove redundant NULL test before release_firmware() call typhoon: get rid of redundant conditional before all to release_firmware() ...