summaryrefslogtreecommitdiff
path: root/drivers/block
AgeCommit message (Collapse)Author
2013-01-21Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module fixes and a virtio block fix from Rusty Russell: "Various minor fixes, but a slightly more complex one to fix the per-cpu overload problem introduced recently by kvm id changes." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: module: put modules in list much earlier. module: add new state MODULE_STATE_UNFORMED. module: prevent warning when finit_module a 0 sized file virtio-blk: Don't free ida when disk is in use
2013-01-03Drivers: block: remove __dev* attributes.Greg Kroah-Hartman
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitdata, __devinitconst, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Mike Miller <mike.miller@hp.com> Cc: Chirag Kantharia <chirag.kantharia@hp.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Jim Paris <jim@jtan.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: NeilBrown <neilb@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Tao Guo <Tao.Guo@emc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-02virtio-blk: Don't free ida when disk is in useAlexander Graf
When a file system is mounted on a virtio-blk disk, we then remove it and then reattach it, the reattached disk gets the same disk name and ids as the hot removed one. This leads to very nasty effects - mostly rendering the newly attached device completely unusable. Trying what happens when I do the same thing with a USB device, I saw that the sd node simply doesn't get free'd when a device gets forcefully removed. Imitate the same behavior for vd devices. This way broken vd devices simply are never free'd and newly attached ones keep working just fine. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
2012-12-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph update from Sage Weil: "There are a few different groups of commits here. The largest is Alex's ongoing work to enable the coming RBD features (cloning, striping). There is some cleanup in libceph that goes along with it. Cyril and David have fixed some problems with NFS reexport (leaking dentries and page locks), and there is a batch of patches from Yan fixing problems with the fs client when running against a clustered MDS. There are a few bug fixes mixed in for good measure, many of which will be going to the stable trees once they're upstream. My apologies for the late pull. There is still a gremlin in the rbd map/unmap code and I was hoping to include the fix for that as well, but we haven't been able to confirm the fix is correct yet; I'll send that in a separate pull once it's nailed down." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits) rbd: get rid of rbd_{get,put}_dev() libceph: register request before unregister linger libceph: don't use rb_init_node() in ceph_osdc_alloc_request() libceph: init event->node in ceph_osdc_create_event() libceph: init osd->o_node in create_osd() libceph: report connection fault with warning libceph: socket can close in any connection state rbd: don't use ENOTSUPP rbd: remove linger unconditionally rbd: get rid of RBD_MAX_SEG_NAME_LEN libceph: avoid using freed osd in __kick_osd_requests() ceph: don't reference req after put rbd: do not allow remove of mounted-on image libceph: Unlock unprocessed pages in start_read() error path ceph: call handle_cap_grant() for cap import message ceph: Fix __ceph_do_pending_vmtruncate ceph: Don't add dirty inode to dirty list if caps is in migration ceph: Fix infinite loop in __wake_requests ceph: Don't update i_max_size when handling non-auth cap bdi_register: add __printf verification, fix arg mismatch ...
2012-12-20rbd: get rid of rbd_{get,put}_dev()Alex Elder
The functions rbd_get_dev() and rbd_put_dev() are trivial wrappers that add no value, and their existence suggests they may do more than what they do. Get rid of them. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
2012-12-18Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge misc patches from Andrew Morton: "Incoming: - lots of misc stuff - backlight tree updates - lib/ updates - Oleg's percpu-rwsem changes - checkpatch - rtc - aoe - more checkpoint/restart support I still have a pile of MM stuff pending - Pekka should be merging later today after which that is good to go. A number of other things are twiddling thumbs awaiting maintainer merges." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (180 commits) scatterlist: don't BUG when we can trivially return a proper error. docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output fs, fanotify: add @mflags field to fanotify output docs: add documentation about /proc/<pid>/fdinfo/<fd> output fs, notify: add procfs fdinfo helper fs, exportfs: add exportfs_encode_inode_fh() helper fs, exportfs: escape nil dereference if no s_export_op present fs, epoll: add procfs fdinfo helper fs, eventfd: add procfs fdinfo helper procfs: add ability to plug in auxiliary fdinfo providers tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test breakpoint selftests: print failure status instead of cause make error kcmp selftests: print fail status instead of cause make error kcmp selftests: make run_tests fix mem-hotplug selftests: print failure status instead of cause make error cpu-hotplug selftests: print failure status instead of cause make error mqueue selftests: print failure status instead of cause make error vm selftests: print failure status instead of cause make error ubifs: use prandom_bytes mtd: nandsim: use prandom_bytes ...
2012-12-18aoe: fix use after free in aoedev_by_aoeaddr()Dan Carpenter
We should return NULL on failure instead of returning a freed pointer. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: update internal version number to 81Ed Cashin
This version number is printed to the console on module initialization and is available in sysfs, which is where the userland aoe-version tool looks for it. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: identify source of runt AoE packetsEd Cashin
This change only affects experimental AoE storage networks. It modifies the console message about runt packets detected so that the AoE major and minor addresses of the AoE target that generated the runt are mentioned. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: allow comma separator in aoe_iflist valueEd Cashin
By default, the aoe driver uses any ethernet interface for AoE, but the aoe_iflist module parameter provides a convenient way to limit AoE traffic to a specific list of local network interfaces. This change allows a list to be specified using the comma character as a separator. For example, modprobe aoe aoe_iflist=eth2,eth3 Before, it was inconvenient to get the quoting right in shell scripts when setting aoe_iflist to have more than one network interface. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: allow user to disable target failure timeoutEd Cashin
With this change, the aoe driver treats the value zero as special for the aoe_deadsecs module parameter. Normally, this value specifies the number of seconds during which the driver will continue to attempt retransmits to an unresponsive AoE target. After aoe_deadsecs has elapsed, the aoe driver marks the aoe device as "down" and fails all I/O. The new meaning of an aoe_deadsecs of zero is for the driver to retransmit commands indefinitely. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: use dynamic number of remote ports for AoE storage targetEd Cashin
Many AoE targets have four or fewer network ports, but some existing storage devices have many, and the AoE protocol sets no limit. This patch allows the use of more than eight remote MAC addresses per AoE target, while reducing the amount of memory used by the aoe driver in cases where there are many AoE targets with fewer than eight MAC addresses each. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: avoid races between device destruction and discoveryEd Cashin
This change avoids a race that could result in a NULL pointer derference following a WARNing from kobject_add_internal, "don't try to register things with the same name in the same directory." The problem was found with a test that forgets and discovers an aoe device in a loop: while test ! -r /tmp/stop; do aoe-flush -a aoe-discover done The race was between aoedev_flush taking aoedevs out of the devlist, allowing a new discovery of the same AoE target to take place before the driver gets around to calling sysfs_remove_group. Fixing that one revealed another race between do_open and add_disk, and this patch avoids that, too. The fix required some care, because for flushing (forgetting) an aoedev, some of the steps must be performed under lock and some must be able to sleep. Also, for discovering a new aoedev, some steps might sleep. The check for a bad aoedev pointer remains from a time when about half of this patch was done, and it was possible for the bdev->bd_disk->private_data to become corrupted. The check should be removed eventually, but it is not expected to add significant overhead, occurring in the aoeblk_open routine. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: improve handling of misbehaving network pathsEd Cashin
An AoE target can have multiple network ports used for AoE, and in the aoe driver, those are tracked by the aoetgt struct. These changes allow the aoe driver to handle network paths, or aoetgts, that are not working well, compared to the others. Paths that do not get responses despite the retransmission of AoE commands are marked as "tainted", and non-tainted paths are preferred. Meanwhile, the aoe driver attempts to "probe" the tainted path in the background by issuing reads of LBA 0 that are padded out to full (possibly jumbo-frame) size. If the probes get responses, then the path is "redeemed", and its taint is removed. This mechanism has been shown to be helpful in transparently handling and recovering from real-world network "brown outs" in ways that the earlier "shoot the help-needing target in the head" mechanism could not. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: return real minor number for static minorsEd Cashin
The value returned by the static minor device number number allocator is the real minor number, so it must be multiplied by the supported number of partitions per aoedev. Without this fix the support for systems without udev is incomplete, and the few users of aoe on such systems will have surprising results when device nodes names do not match the AoE target. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: initialize sysminor to avoid compiler warningEd Cashin
Because the minor_get and related functions use the return values for errors, the compiler doesn't know that sysminor will always either 1) be initialized in aoedev_by_aoeaddr by the call to minor_get, or 2) be unused as the "goto out" is executed. This patch avoids the compiler warning. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: make error messages more specific in static minor allocationEd Cashin
For some special-purpose systems where udev isn't present, static allocation of minor numbers is desirable. This update distinguishes different failure scenarios, to help the user understand what went wrong. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: remove call to request handler from I/O completionEd Cashin
There is no need to call the request handler function in the I/O completion routine. The user impact of not doing it is a more "nice" aoe driver that is less susceptible to causing soft lockups. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: cleanup: correct comment for aoetgt noutEd Cashin
A misplaced comment was attached to the nout member of the aoetgt. This change corrects the comment. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: increase default cap on outstanding AoE commands in the networkEd Cashin
The aoe driver will never be waiting for more than aoe_maxout AoE commands from a given remote network port on an AoE target. Increasing the cap increases performance. Users can tighten the setting to reduce the amount of memory used for handling AoE traffic or the network bandwidth used for AoE. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: remove vestigial request queue allocationEd Cashin
Before the aoe driver was an I/O request handler, it was a make_request-style block driver. Even so, there was a problem where sysfs expected a request queue to exist, so one was provided in commit 7135a71b19be ("aoe: allocate unused request_queue for sysfs"). During the transition to the request-handler style, a patch was merged that was based on a driver without the noop queue, and the noop queue remained in place after the patch was merged, even though a new functional queue was introduced by the patch, allocated through blk_init_queue. The user impact is a memory leak proportional to the number of AoE targets discovered. This patch removes the memory leak and cleans up vestiges of the old do-nothing queue from the aoeblk_gdalloc function. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: copy fallback timing information on destination failoverEd Cashin
Commit f3b8e07af774 ("aoe: commands in retransmit queue use new destination on failure") omits the copying of the coarse-grained time when an AoE command was sent during the failover from one destination MAC address on the AoE target to another. The coarse-grained timing is only used when the system time changes or an unlikely length of time has passed since the sending of the AoE command. Users will not be impacted unless their system clock is very inaccurate or something unusual (e.g., 10 GbE link reset) happens during the period when the aoe driver is handling the failure of a port on the AoE target. Being effected will mean that an AoE target could be considered "down" too eagerly. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: update driver-internal version to 64+Ed Cashin
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: commands in retransmit queue use new destination on failureEd Cashin
When one remote MAC address isn't working as a destination for AoE commands, the frames used to track information associated with the AoE commands are moved to a new aoetgt (defined by the tuple of {AoE major, AoE minor, target MAC address}). This patch makes sure that the frames on the queue for retransmits that need to be done are updated to use the new destination, so that retransmits will be sent through a working network path. Without this change, packets on the retransmit queue will be needlessly retransmitted to the unresponsive destination MAC, possibly causing premature target failure before there's time for the retransmit timer to run again, decide to retransmit again, and finally update the destination to a working MAC address on the AoE target. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: use high-resolution RTTs with fallback to low-resEd Cashin
These changes improve the accuracy of the decision about whether it's time to retransmit an AoE command by using the microsecond-resolution gettimeofday instead of jiffies. Because the system time can jump suddenly, the decision reverts to using jiffies if the high-resolution time difference is relatively large. Otherwise the AoE targets could be considered failed inappropriately. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: manipulate aoedev network stats under lockEd Cashin
With this bugfix in place the calculation of the criterion for "lateness" is performed under lock. Without the lock, there is a chance that one of the non-atomic operations performed on the round trip time statistics could be incomplete, such that an incorrect lateness criterion would be calculated. Without this change, the effect of the bug would be rare unecessary but benign retransmissions. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: err device: include MAC addresses for unexpected responsesEd Cashin
The /dev/etherd/err character device provides low-level information about normal but sometimes interesting AoE command retransmits and "unexpected responses", i.e., responses for packets that have already been retransmitted. This change adds MAC addresses to the messages about unexpected responses, so that when they occur, it's more easy to determine the network paths to which they belong. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: improve network congestion handlingEd Cashin
The aoe driver already had some congestion handling, but it was limited in its ability to cope with the kind of congestion that can arise on more complex networks such as those involving paths through multiple ethernet switches. Some of the lessons from TCP's history of development can be applied to improving the congestion control and avoidance on AoE storage networks. These changes use familar concepts from Van Jacobson's "Congestion Avoidance and Control" paper from '88, without adding significant overhead. This patch depends on an upcoming patch that covers the failover case when AoE commands being retransmitted are transferred from one retransmit queue to another. Another upcoming patch increases the timing accuracy. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: provide ATA identify device content to user on requestEd Cashin
Make the aoe driver follow expected behavior when the user uses ioctl to get the ATA device identify information, allowing access to model, serial number, etc. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: update driver-internal version number to 60Ed Cashin
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: whitespace cleanupEd Cashin
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: cleanup: remove unused ata_scnt functionEd Cashin
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: "payload" sysfs file exports per-AoE-command data transfer sizeEd Cashin
The userland aoetools package includes an "aoe-stat" command that can display a "payload size" column when the aoe driver exports this information. Users can quickly see what amount of user data is transferred inside each AoE command on the network, network headers excluded. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: support larger I/O requests via aoe_maxsectors module paramEd Cashin
The GPFS filesystem is an example of an aoe user that requires the aoe driver to support I/O request sizes larger than the default. Most users will not need large I/O request sizes, because they would need to be split up into multiple AoE commands anyway. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: support the forgetting (flushing) of a user-specified AoE targetEd Cashin
Users sometimes want to cause the aoe driver to forget a particular previously discovered device when it is no longer online. The aoetools provide an "aoe-flush" command that users run to perform this administrative task. The changes below provide the support needed in the driver. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: update cap on outstanding commands based on config query responseEd Cashin
The ATA over Ethernet config query response contains a "buffer count" field reflecting the AoE target's capacity to buffer incoming AoE commands. By taking the current value of this field into accound, we increase performance throughput or avoid network congestion, when the value has increased or decreased, respectively. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: print warning regarding a common reason for dropped transmitsEd Cashin
Dropped transmits are not common, but when they do occur, increasing the transmit queue length often helps. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18aoe: describe the behavior of the "err" character deviceEd Cashin
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17Merge branch 'for-3.8/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver update from Jens Axboe: "Now that the core bits are in, here are the driver bits for 3.8. The branch contains: - A huge pile of drbd bits that were dumped from the 3.7 merge window. Following that, it was both made perfectly clear that there is going to be no more over-the-wall pulls and how the situation on individual pulls can be improved. - A few cleanups from Akinobu Mita for drbd and cciss. - Queue improvement for loop from Lukas. This grew into adding a generic interface for waiting/checking an even with a specific lock, allowing this to be pulled out of md and now loop and drbd is also using it. - A few fixes for xen back/front block driver from Roger Pau Monne. - Partition improvements from Stephen Warren, allowing partiion UUID to be used as an identifier." * 'for-3.8/drivers' of git://git.kernel.dk/linux-block: (609 commits) drbd: update Kconfig to match current dependencies drbd: Fix drbdsetup wait-connect, wait-sync etc... commands drbd: close race between drbd_set_role and drbd_connect drbd: respect no-md-barriers setting also when changed online via disk-options drbd: Remove obsolete check drbd: fixup after wait_even_lock_irq() addition to generic code loop: Limit the number of requests in the bio list wait: add wait_event_lock_irq() interface xen-blkfront: free allocated page xen-blkback: move free persistent grants code block: partition: msdos: provide UUIDs for partitions init: reduce PARTUUID min length to 1 from 36 block: store partition_meta_info.uuid as a string cciss: use check_signature() cciss: cleanup bitops usage drbd: use copy_highpage drbd: if the replication link breaks during handshake, keep retrying drbd: check return of kmalloc in receive_uuids drbd: Broadcast sync progress no more often than once per second drbd: don't try to clear bits once the disk has failed ...
2012-12-17rbd: don't use ENOTSUPPAlex Elder
ENOTSUPP is not a standard errno (it shows up as "Unknown error 524" in an error message). This is what was getting produced when the the local rbd code does not implement features required by a discovered rbd image. Change the error code returned in this case to ENXIO. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-17rbd: get rid of RBD_MAX_SEG_NAME_LENAlex Elder
RBD_MAX_SEG_NAME_LEN represents the maximum length of an rbd object name (i.e., one of the objects providing storage backing an rbd image). Another symbol, MAX_OBJ_NAME_SIZE, is used in the osd client code to define the maximum length of any object name in an osd request. Right now they disagree, with RBD_MAX_SEG_NAME_LEN being too big. There's no real benefit at this point to defining the rbd object name length limit separate from any other object name, so just get rid of RBD_MAX_SEG_NAME_LEN and use MAX_OBJ_NAME_SIZE in its place. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-17rbd: do not allow remove of mounted-on imageAlex Elder
There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and don't allow an rbd image to be removed if the count is non-zero. Protect the updates of the open count value with ctl_mutex to ensure the underlying rbd device doesn't get removed while concurrently being opened. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2012-12-06drbd: update Kconfig to match current dependenciesLars Ellenberg
We no longer need the connector. But we need libcrc32c. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-06drbd: Fix drbdsetup wait-connect, wait-sync etc... commandsPhilipp Reisner
This was introduces when moving the code over from the 8.3 codebase with commit 328e0f125bf41f4f33f684db22015f92cb44fe56 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-06drbd: close race between drbd_set_role and drbd_connectPhilipp Reisner
drbd_set_role(, R_PRIMARY, ) does the state change to Primary, some more housekeeping, and possibly generates a new UUID set. All of this holding the "state_mutex". The connection handshake involves sending of various state information, including the current data generation UUID set, and two connection state changes from C_WF_CONNECTION to C_WF_REPORT_PARAMS further to a number of different outcomes, resync being one of them. If the connection handshake happens between the state change to Primary and the generation of the new UUIDs, the resync decision based on the old UUID set may be confused, depending on circumstances. Make sure that, before we do the handshake, any promotion to Primary role will either be complete (including the housekeeping stuff), or can see, and serialize with, the ongoing handshake, based on the "STATE_SENT" bit, which is set when we start the handshake, and cleared only when we leave C_WF_REPORT_PARAMS again. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-06drbd: respect no-md-barriers setting also when changed online via disk-optionsLars Ellenberg
We need to propagate the configuration into the flag bits, or it won't be effective. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-06drbd: Remove obsolete checkPhilipp Reisner
Smatch complained about it this redundanct check. The check was introduced in 2006-09-13. On 2007-07-24 the body of the function was enclosed by get_ldev()/put_ldev() reference counting. Since then the check is useless and miss leading. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-01Merge branch 'stable/for-jens-3.8' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.8/drivers
2012-11-30drbd: fixup after wait_even_lock_irq() addition to generic codeJens Axboe
Compiling drbd yields: drivers/block/drbd/drbd_state.c: In function ‘_conn_request_state’: drivers/block/drbd/drbd_state.c:1804:5: error: macro "wait_event_lock_irq" passed 4 arguments, but takes just 3 drivers/block/drbd/drbd_state.c:1801:3: error: ‘wait_event_lock_irq’ undeclared (first use in this function) drivers/block/drbd/drbd_state.c:1801:3: note: each undeclared identifier is reported only once for each function it appears in drivers/block/drbd/drbd_state.c: At top level: drivers/block/drbd/drbd_state.c:1734:1: warning: ‘_conn_rq_cond’ defined but not used [-Wunused-function] Due to drbd having copied the MD definition for wait_event_lock_irq() as well. Kill them. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-11-30loop: Limit the number of requests in the bio listLukas Czerner
Currently there is not limitation of number of requests in the loop bio list. This can lead into some nasty situations when the caller spawns tons of bio requests taking huge amount of memory. This is even more obvious with discard where blkdev_issue_discard() will submit all bios for the range and wait for them to finish afterwards. On really big loop devices and slow backing file system this can lead to OOM situation as reported by Dave Chinner. With this patch we will wait in loop_make_request() if the number of bios in the loop bio list would exceed 'nr_congestion_on'. We'll wake up the process as we process the bios form the list. Some threshold hysteresis is in place to avoid high frequency oscillation. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>