summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2006-06-23[PATCH] mm: introduce remap_vmalloc_range()Nick Piggin
Add remap_vmalloc_range, vmalloc_user, and vmalloc_32_user so that drivers can have a nice interface for remapping vmalloc memory. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] Unify pxm_to_node() and node_to_pxm()Yasunori Goto
Consolidate the various arch-specific implementations of pxm_to_node() and node_to_pxm() into a single generic version. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] tightening hugetlb strict accountingChen, Kenneth W
Current hugetlb strict accounting for shared mapping always assume mapping starts at zero file offset and reserves pages between zero and size of the file. This assumption often reserves (or lock down) a lot more pages then necessary if application maps at none zero file offset. libhugetlbfs is one example that requires proper reservation on shared mapping starts at none zero offset. This patch extends the reservation and hugetlb strict accounting to support any arbitrary pair of (offset, len), resulting a much more robust and accurate scheme. More importantly, it won't lock down any hugetlb pages outside file mapping. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] reserve space for swap labelAndreas Dilger
Reserve space in the swap disk header for a LABEL and UUID to be specified. This has been possible with util-linux-2.12b (via e2fsprogs 1.36 libblkid), and is used by at least FC3 and later. The kernel doesn't really care about this, but the space shouldn't accidentally be used by something else either. Also make the on-disk structures be fixed-size types, instead of "int", though I don't know of any architecture in use where an "int" isn't the same size as a "__u32" (all current kernel arches have it as "unsigned int"). Signed-off-by: Andreas Dilger <adilger@shaw.ca> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] support for panic at OOMKAMEZAWA Hiroyuki
This patch adds panic_on_oom sysctl under sys.vm. When sysctl vm.panic_on_oom = 1, the kernel panics intead of killing rogue processes. And if vm.panic_on_oom is 0 the kernel will do oom_kill() in the same way as it does today. Of course, the default value is 0 and only root can modifies it. In general, oom_killer works well and kill rogue processes. So the whole system can survive. But there are environments where panic is preferable rather than kill some processes. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: add return ↵Yasunori Goto
code for init_current_empty_zone When add_zone() is called against empty zone (not populated zone), we have to initialize the zone which didn't initialize at boot time. But, init_currently_empty_zone() may fail due to allocation of wait table. So, this patch is to catch its error code. Changes against wait_table is in the next patch. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: change to ↵Yasunori Goto
meminit for build_zonelist Change definitions of some functions and data from __init to __meminit. These functions and data can be used after bootup by this patch to be used for hot-add codes. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: change name ↵Yasunori Goto
of wait_table_size() This is just to rename from wait_table_size() to wait_table_hash_nr_entries(). Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] PG_uncached is ia64 onlyAndrew Morton
As Nick points out, only ia64 uses PG_uncached. So we can push it up into the higher bits of the lower half of page->flags and make room for another flag on 32-bit machines. Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@sgi.com> Cc: Jes Sorensen <jes@trained-monkey.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] zone handle unaligned zone boundariesAndy Whitcroft
The buddy allocator has a requirement that boundaries between contigious zones occur aligned with the the MAX_ORDER ranges. Where they do not we will incorrectly merge pages cross zone boundaries. This can lead to pages from the wrong zone being handed out. Originally the buddy allocator would check that buddies were in the same zone by referencing the zone start and end page frame numbers. This was removed as it became very expensive and the buddy allocator already made the assumption that zones boundaries were aligned. It is clear that not all configurations and architectures are honouring this alignment requirement. Therefore it seems safest to reintroduce support for non-aligned zone boundaries. This patch introduces a new check when considering a page a buddy it compares the zone_table index for the two pages and refuses to merge the pages where they do not match. The zone_table index is unique for each node/zone combination when FLATMEM/DISCONTIGMEM is enabled and for each section/zone combination when SPARSEMEM is enabled (a SPARSEMEM section is at least a MAX_ORDER size). Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] VFS: Permit filesystem to perform statfs with a known root dentryDavid Howells
Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] VFS: Permit filesystem to override root dentry on mountDavid Howells
Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[NET]: fix net-core kernel-docRandy Dunlap
Warning(/var/linsrc/linux-2617-g4//include/linux/skbuff.h:304): No description found for parameter 'dma_cookie' Warning(/var/linsrc/linux-2617-g4//include/net/sock.h:1274): No description found for parameter 'copied_early' Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'chan' Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'event' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[TCP]: Move inclusion of <linux/dmaengine.h> to correct place in <linux/tcp.h>David Woodhouse
The new <linux/dmaengine.h> header shouldn't be included from the !__KERNEL__ portion of tcp.h Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Added GSO toggleHerbert Xu
This patch adds a generic segmentation offload toggle that can be turned on/off for each net device. For now it only supports in TCPv4. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Add software TSOv4Herbert Xu
This patch adds the GSO implementation for IPv4 TCP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Add generic segmentation offloadHerbert Xu
This patch adds the infrastructure for generic segmentation offload. The idea is to tap into the potential savings of TSO without hardware support by postponing the allocation of segmented skb's until just before the entry point into the NIC driver. The same structure can be used to support software IPv6 TSO, as well as UFO and segmentation offload for other relevant protocols, e.g., DCCP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Merge TSO/UFO fields in sk_buffHerbert Xu
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not going to scale if we add any more segmentation methods (e.g., DCCP). So let's merge them. They were used to tell the protocol of a packet. This function has been subsumed by the new gso_type field. This is essentially a set of netdev feature bits (shifted by 16 bits) that are required to process a specific skb. As such it's easy to tell whether a given device can process a GSO skb: you just have to and the gso_type field and the netdev's features field. I've made gso_type a conjunction. The idea is that you have a base type (e.g., SKB_GSO_TCPV4) that can be modified further to support new features. For example, if we add a hardware TSO type that supports ECN, they would declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO packets would be SKB_GSO_TCPV4. This means that only the CWR packets need to be emulated in software. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Avoid allocating skb in skb_padHerbert Xu
First of all it is unnecessary to allocate a new skb in skb_pad since the existing one is not shared. More importantly, our hard_start_xmit interface does not allow a new skb to be allocated since that breaks requeueing. This patch uses pskb_expand_head to expand the existing skb and linearize it if needed. Actually, someone should sift through every instance of skb_pad on a non-linear skb as they do not fit the reasons why this was originally created. Incidentally, this fixes a minor bug when the skb is cloned (tcpdump, TCP, etc.). As it is skb_pad will simply write over a cloned skb. Because of the position of the write it is unlikely to cause problems but still it's best if we don't do it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (33 commits) [PATCH] myri10ge - drop workaround pci_save_state() disabling MSI [PATCH] myri10ge - drop workaround for the missing AER ext cap on nVidia CK804 via-velocity: the link is not correctly detected when the device starts [PATCH] add b44 to maintainers [PATCH] WAN: ioremap() failure checks in drivers [PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name() [PATCH] skb_padto()-area fixes in 8390, wavelan [PATCH] make drivers/net/forcedeth.c:nv_update_pause() static [PATCH] network driver for Hilscher netx [PATCH] Dereference in tokenring/olympic.c [PATCH] Array overrun in drivers/net/wireless/wavelan.c [PATCH] Remove useless check in drivers/net/pcmcia/xirc2ps_cs.c [PATCH] 8139cp: add ethtool eeprom support [PATCH] 8139cp: fix eeprom read command length [PATCH] b44: update b44 Kconfig entry [PATCH] b44: update version to 1.01 [PATCH] b44: add wol for old nic [PATCH] b44: add parameter [PATCH] b44: add wol [PATCH] b44: fix manual speed/duplex/autoneg settings ...
2006-06-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits) [POWERPC] re-enable OProfile for iSeries, using timer interrupt [POWERPC] support ibm,extended-*-frequency properties [POWERPC] Extra sanity check in EEH code [POWERPC] Dont look for class-code in pci children [POWERPC] Fix mdelay badness on shared processor partitions [POWERPC] disable floating point exceptions for init [POWERPC] Unify ppc syscall tables [POWERPC] mpic: add support for serial mode interrupts [POWERPC] pseries: Print PCI slot location code on failure [POWERPC] spufs: one more fix for 64k pages [POWERPC] spufs: fail spu_create with invalid flags [POWERPC] spufs: clear class2 interrupt status before wakeup [POWERPC] spufs: fix Makefile for "make clean" [POWERPC] spufs: remove stop_code from struct spu [POWERPC] spufs: fix spu irq affinity setting [POWERPC] spufs: further abstract priv1 register access [POWERPC] spufs: split the Cell BE support into generic and platform dependant parts [POWERPC] spufs: dont try to access SPE channel 1 count [POWERPC] spufs: use kzalloc in create_spu [POWERPC] spufs: fix initial state of wbox file ... Manually resolved conflicts in: drivers/net/phy/Makefile include/asm-powerpc/spu.h
2006-06-23[libata] Add host lock to struct ata_portJeff Garzik
Prepare for changes required to support SATA devices attached to SAS HBAs. For these devices we don't want to use host_set at all, since libata will not be the owner of struct scsi_host. Signed-off-by: Brian King <brking@us.ibm.com> (with slight merge modifications made by...) Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-23[PATCH] libata: implement per-dev EH action mask eh_info->dev_action[]Tejun Heo
Currently, the only per-dev EH action is REVALIDATE. EH used to exploit ehi->dev to do selective revalidation on a ATA bus. However, this is a bit hacky and makes it impossible to request selective revalidation from outside of EH or add another per-dev EH action. This patch adds per-dev EH action mask eh_info->dev_action[] and update EH to use this field for REVALIDATE. Note that per-dev actions can still be specified at port-level and it has the same effect of specifying the action for all devices on the port. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-23[PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()Krzysztof Halasa
David Boggs noticed that register_hdlc_device() no longer needs to call dev_alloc_name() as it's called by register_netdev(). register_hdlc_device() is currently equivalent to register_netdev(). hdlc_setup() is now EXPORTed as per David's request. Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-23Merge branch 'master' into upstreamJeff Garzik
Conflicts: drivers/scsi/libata-core.c drivers/scsi/libata-scsi.c include/linux/pci_ids.h
2006-06-22Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6: (44 commits) [PATCH] I2C: I2C controllers go into right place on sysfs [PATCH] hwmon-vid: Add support for Intel Core and Conroe [PATCH] lm70: New hardware monitoring driver [PATCH] hwmon: Fix the Kconfig header [PATCH] i2c-i801: Merge setup function [PATCH] i2c-i801: Better pci subsystem integration [PATCH] i2c-i801: Cleanups [PATCH] i2c-i801: Remove PCI function check [PATCH] i2c-i801: Remove force_addr parameter [PATCH] i2c-i801: Fix block transaction poll loops [PATCH] scx200_acb: Documentation update [PATCH] scx200_acb: Mark scx200_acb_probe __init [PATCH] scx200_acb: Use PCI I/O resource when appropriate [PATCH] i2c: Mark block write buffers as const [PATCH] i2c-ocores: Minor cleanups [PATCH] abituguru: Fix fan detection [PATCH] abituguru: Review fixes [PATCH] abituguru: New hardware monitoring driver [PATCH] w83792d: Add missing data access locks [PATCH] w83792d: Fix setting the PWM value ...
2006-06-22Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6: [PATCH] w1: warning fix [PATCH] w1: clean up W1_CON dependency. [PATCH] drivers/w1/w1.c: fix a compile error [PATCH] W1: fix dependencies of W1_SLAVE_DS2433_CRC [PATCH] W1: possible cleanups [PATCH] W1: cleanups [PATCH] w1 exports [PATCH] w1: Use mutexes instead of semaphores. [PATCH] w1: Make w1 connector notifications depend on connector. [PATCH] w1: netlink: Mark netlink group 1 as unused. [PATCH] w1: Move w1-connector definitions into linux/include/connector.h [PATCH] w1: Userspace communication protocol over connector. [PATCH] w1: Replace dscore and ds_w1_bridge with ds2490 driver. [PATCH] w1: Added default generic read/write operations.
2006-06-22Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (27 commits) [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible [PATCH] PCI: fix issues with extended conf space when MMCONFIG disabled because of e820 [PATCH] PCI: Bus Parity Status sysfs interface [PATCH] PCI: fix memory leak in MMCONFIG error path [PATCH] PCI: fix error with pci_get_device() call in the mpc85xx driver [PATCH] PCI: MSI-K8T-Neo2-Fir: run only where needed [PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev [PATCH] PCI: clean up pci documentation to be more specific [PATCH] PCI: remove unneeded msi code [PATCH] PCI: don't move ioapics below PCI bridge [PATCH] PCI: cleanup unused variable about msi driver [PATCH] PCI: disable msi mode in pci_disable_device [PATCH] PCI: Allow MSI to work on kexec kernel [PATCH] PCI: AMD 8131 MSI quirk called too late, bus_flags not inherited ? [PATCH] PCI: Move various PCI IDs to header file [PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation [PATCH] PCI: i386/x86_84: disable PCI resource decode on device disable [PATCH] PCI ACPI: Rename the functions to avoid multiple instances. [PATCH] PCI: don't enable device if already enabled [PATCH] PCI: Add a "enable" sysfs attribute to the pci devices to allow userspace (Xorg) to enable devices without doing foul direct access ...
2006-06-22[PATCH] zlib_inflate: Upgrade library code to a recent versionRichard Purdie
Upgrade the zlib_inflate implementation in the kernel from a patched version 1.1.3/4 to a patched 1.2.3. The code in the kernel is about seven years old and I noticed that the external zlib library's inflate performance was significantly faster (~50%) than the code in the kernel on ARM (and faster again on x86_32). For comparison the newer deflate code is 20% slower on ARM and 50% slower on x86_32 but gives an approx 1% compression ratio improvement. I don't consider this to be an improvement for kernel use so have no plans to change the zlib_deflate code. Various changes have been made to the zlib code in the kernel, the most significant being the extra functions/flush option used by ppp_deflate. This update reimplements the features PPP needs to ensure it continues to work. This code has been tested on ARM under both JFFS2 (with zlib compression enabled) and ppp_deflate and on x86_32. JFFS2 sees an approx. 10% real world file read speed improvement. This patch also removes ZLIB_VERSION as it no longer has a correct value. We don't need version checks anyway as the kernel's module handling will take care of that for us. This removal is also more in keeping with the zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've added something to the zlib.h header to note its a modified version. Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Acked-by: Joern Engel <joern@wh.fh-wedel.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] Fix dcache race during umountNeilBrown
The race is that the shrink_dcache_memory shrinker could get called while a filesystem is being unmounted, and could try to prune a dentry belonging to that filesystem. If it does, then it will call in to iput on the inode while the dentry is no longer able to be found by the umounting process. If iput takes a while, generic_shutdown_super could get all the way though shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without ever waiting on this particular inode. Eventually the superblock gets freed anyway and if the iput tried to touch it (which some filesystems certainly do), it will lose. The promised "Self-destruct in 5 seconds" doesn't lead to a nice day. The race is closed by holding s_umount while calling prune_one_dentry on someone else's dentry. As a down_read_trylock is used, shrink_dcache_memory will no longer try to prune the dentry of a filesystem that is being unmounted, and unmount will not be able to start until any such active prune_one_dentry completes. This requires that prune_dcache *knows* which filesystem (if any) it is doing the prune on behalf of so that it can be careful of other filesystems. shrink_dcache_memory isn't called it on behalf of any filesystem, and so is careful of everything. shrink_dcache_anon is now passed a super_block rather than the s_anon list out of the superblock, so it can get the s_anon list itself, and can pass the superblock down to prune_dcache. If prune_dcache finds a dentry that it cannot free, it leaves it where it is (at the tail of the list) and exits, on the assumption that some other thread will be removing that dentry soon. To try to make sure that some work gets done, a limited number of dnetries which are untouchable are skipped over while choosing the dentry to work on. I believe this race was first found by Kirill Korotaev. Cc: Jan Blunck <jblunck@suse.de> Acked-by: Kirill Korotaev <dev@openvz.org> Cc: Olaf Hering <olh@suse.de> Acked-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] remove steal_locks()Miklos Szeredi
This patch removes the steal_locks() function. steal_locks() doesn't work correctly with any filesystem that does it's own lock management, including NFS, CIFS, etc. In addition it has weird semantics on local filesystems in case tasks sharing file-descriptor tables are doing POSIX locking operations in parallel to execve(). The steal_locks() function has an effect on applications doing: clone(CLONE_FILES) /* in child */ lock execve lock POSIX locks acquired before execve (by "child", "parent" or any further task sharing files_struct) will after the execve be owned exclusively by "child". According to Chris Wright some LSB/LTP kind of suite triggers without the stealing behavior, but there's no known real-world application that would also fail. Apps using NPTL are not affected, since all other threads are killed before execve. Apps using LinuxThreads are only affected if they - have multiple threads during exec (LinuxThreads doesn't kill other threads, the app may do it with pthread_kill_other_threads_np()) - rely on POSIX locks being inherited across exec Both conditions are documented, but not their interaction. Apps using clone() natively are affected if they - use clone(CLONE_FILES) - rely on POSIX locks being inherited across exec The above scenarios are unlikely, but possible. If the patch is vetoed, there's a plan B, that involves mostly keeping the weird stealing semantics, but changing the way lock ownership is handled so that network and local filesystems work consistently. That would add more complexity though, so this solution seems to be preferred by most people. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <willy@debian.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] PCI: Add PCI_CAP_ID_VNDRBrice Goglin
Add the vendor-specific extended capability PCI_CAP_ID_VNDR. It is required by the Myri-10G Ethernet driver. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] Keys: Fix race between two instantiators of a keyDavid Howells
Add a revocation notification method to the key type and calls it whilst the key's semaphore is still write-locked after setting the revocation flag. The patch then uses this to maintain a reference on the task_struct of the process that calls request_key() for as long as the authorisation key remains unrevoked. This fixes a potential race between two processes both of which have assumed the authority to instantiate a key (one may have forked the other for example). The problem is that there's no locking around the check for revocation of the auth key and the use of the task_struct it points to, nor does the auth key keep a reference on the task_struct. Access to the "context" pointer in the auth key must thenceforth be done with the auth key semaphore held. The revocation method is called with the target key semaphore held write-locked and the search of the context process's keyrings is done with the auth key semaphore read-locked. The check for the revocation state of the auth key just prior to searching it is done after the auth key is read-locked for the search. This ensures that the auth key can't be revoked between the check and the search. The revocation notification method is added so that the context task_struct can be released as soon as instantiation happens rather than waiting for the auth key to be destroyed, thus avoiding the unnecessary pinning of the requesting process. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] selinux: add hooks for key subsystemMichael LeMay
Introduce SELinux hooks to support the access key retention subsystem within the kernel. Incorporate new flask headers from a modified version of the SELinux reference policy, with support for the new security class representing retained keys. Extend the "key_alloc" security hook with a task parameter representing the intended ownership context for the key being allocated. Attach security information to root's default keyrings within the SELinux initialization routine. Has passed David's testsuite. Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] w1: netlink: Mark netlink group 1 as unused.Evgeniy Polyakov
netlink_w1 was moved to connector. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] w1: Move w1-connector definitions into linux/include/connector.hEvgeniy Polyakov
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] i2c: Mark block write buffers as constKrzysztof Halasa
The attached patch marks i2c_smbus_write_block_data() and i2c_smbus_write_i2c_block_data() buffers as const. Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] i2c: New bus driver for the OpenCores I2C controllerPeter Korsgaard
The following patch adds support for the OpenCores I2C controller IP core (See http://www.opencores.org/projects.cgi/web/i2c/overview). Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] I2C: i2c-nforce2: Add support for the nForce4 MCP51 and MCP55Jean Delvare
Add support for the new nForce4 MCP51 (also known as nForce 410 or 430) and nForce4 MCP55 to the i2c-nforce2 driver. Some code changes were required because the base I/O address registers have changed in these versions. Standard BARs are now being used, while the original nForce2 chips used non-standard ones. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] I2C: m41t00: Add support for the ST M41T81 and M41T85Mark A. Greer
This patch adds support for the ST m41t81 and m41t85 i2c rtc chips to the existing m41t00 driver. Since there is no way to reliably determine what type of rtc chip is in use, the chip type is passed in via platform_data. The i2c address and square wave frequency are passed in via platform_data as well. To accommodate the use of platform_data, a new header file include/linux/m41t00.h has been added. The m41t81 and m41t85 chips halt the updating of their time registers while they are being accessed. They resume when a stop condition exists on the i2c bus or when non-time related regs are accessed. To make the best use of that facility and to make more efficient use of the i2c bus, this patch replaces multiple i2c_smbus_xxx calls with a single i2c_transfer call. Signed-off-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] i2c-piix4: Add ATI IXP200/300/400 supportRudolf Marek
This patch adds the ATI IXP southbridges support to i2c-piix4, as it turned out those chips are compatible with it. Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] USB: convert usb class devices to real devicesGreg Kroah-Hartman
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] USB: move usb_device_class class devices to be real devicesGreg Kroah-Hartman
This moves the usb class devices that control the usbfs nodes to show up in the proper place in the larger device tree. No userspace changes is needed, this is compatible due to the symlinks generated by the driver core. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] USB: make endpoints real struct devicesGreg Kroah-Hartman
This will allow for us to give endpoints a major/minor to create a "usbfs2-like" way to access endpoints directly from userspace in an easier manner than the current usbfs provides us. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] USB: move <linux/usb_input.h> to <linux/usb/input.h>David Brownell
Move <linux/usb_input.h> to <linux/usb/input.h> and remove some redundant includes. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] USB: move hardware-specific <linux/usb_*.h> to <linux/usb/*.h>David Brownell
This moves header files for controller-specific platform data from <linux/usb_XXX.h> to <linux/usb/XXX.h> to start reducing some clutter. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] USB: move <linux/usb_cdc.h> to <linux/usb/cdc.h>David Brownell
This moves <linux/usb_cdc.h> to <linux/usb/cdc.h> to reduce some of the clutter of usb header files. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] usbcore: port reset for composite devicesAlan Stern
This patch (as699) adds usb_reset_composite_device(), a routine for sending a USB port reset to a device with multiple interfaces owned by different drivers. Drivers are notified about impending and completed resets through two new methods in the usb_driver structure. The patch modifieds the usbfs ioctl code to make it use the new routine instead of usb_reset_device(). Follow-up patches will modify the hub, usb-storage, and usbhid drivers so they can utilize this new API. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21[PATCH] USB: add usb_interrupt_msg() function for api completeness.Greg Kroah-Hartman
Really just a wrapper around usb_bulk_msg() but now it's documented much better. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21Pull rework-memory-attribute-aliasing into release branchTony Luck