summaryrefslogtreecommitdiff
path: root/drivers/block/drbd/drbd_worker.c
AgeCommit message (Collapse)Author
2014-02-17drbd: Rename w_prev_work_done -> w_completeAndreas Gruenbacher
Also move it to drbd_receiver.c and make it static. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Pass a peer device to a number of fuctionsAndreas Gruenbacher
These functions actually operate on a peer device, or need a peer device. drbd_prepare_command(), drbd_send_command(), drbd_send_sync_param() drbd_send_uuids(), drbd_gen_and_send_sync_uuid(), drbd_send_sizes() drbd_send_state(), drbd_send_current_state(), and drbd_send_state_req() drbd_send_sr_reply(), drbd_send_ack(), drbd_send_drequest(), drbd_send_drequest_csum(), drbd_send_ov_request(), drbd_send_dblock() drbd_send_block(), drbd_send_out_of_sync(), recv_dless_read() drbd_drain_block(), receive_bitmap_plain(), recv_resync_read() read_in_block(), read_for_csum(), drbd_alloc_pages(), drbd_alloc_peer_req() need_peer_seq(), update_peer_seq(), wait_for_and_update_peer_seq() drbd_sync_handshake(), drbd_asb_recover_{0,1,2}p(), drbd_connected() drbd_disconnected(), decode_bitmap_c() and recv_bm_rle_bits() Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: drbd_csum_bio(), drbd_csum_ee(): Remove unused device argumentAndreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Move conf_mutex from connection to resourceAndreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Add explicit device parameter to D_ASSERTAndreas Gruenbacher
The implicit dependency on a variable inside the macro is problematic. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Replace and remove the obsolete conn_() macrosAndreas Gruenbacher
With the polymorphic drbd_() macros, we no longer need the connection specific variants. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Remove the terrible DEV hackAndreas Gruenbacher
DRBD was using dev_err() and similar all over the code; instead of having to write dev_err(disk_to_dev(device->vdisk), ...) to convert a drbd_device into a kernel device, a DEV macro was used which implicitly references the device variable. This is terrible; introduce separate drbd_err() and similar macros with an explicit device parameter instead. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Turn connection->volumes into connection->peer_devicesAndreas Gruenbacher
Let connection->peer_devices point to peer devices; connection->volumes was pointing to devices. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Improve some function and variable namingAndreas Gruenbacher
Rename functions conn_destroy() -> drbd_destroy_connection(), drbd_minor_destroy() -> drbd_destroy_device() drbd_adm_add_minor() -> drbd_adm_add_minor() drbd_adm_delete_minor() -> drbd_adm_del_minor() Rename global variable minors to drbd_devices Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Introduce "peer_device" object between "device" and "connection"Andreas Gruenbacher
In a setup where a device (aka volume) can replicate to multiple peers and one connection can be shared between multiple devices, we need separate objects to represent devices on peer nodes and network connections. As a first step to introduce multiple connections per device, give each drbd_device object a single drbd_peer_device object which connects it to a drbd_connection object. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Rename drbd_tconn -> drbd_connectionAndreas Gruenbacher
sed -i -e 's:all_tconn:connections:g' -e 's:tconn:connection:g' Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Rename "mdev" to "device"Andreas Gruenbacher
sed -i -e 's:mdev:device:g' Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Rename struct drbd_conf -> struct drbd_deviceAndreas Gruenbacher
sed -i -e 's:\<drbd_conf\>:drbd_device:g' Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drbd: Split off on-the-wire protocol definitionsAndreas Gruenbacher
Keep the protocol definitions separate from the kernel code; they are useful in their own right. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17drivers: block: Mark the function as static in drbd_worker.cRashika Kheria
Mark functions drbd_endio_read_sec_final(), drbd_send_barrier(), need_to_send_barrier(), dequeue_work_batch(), dequeue_work_item() and wait_for_work() as static in drbd/drbd_worker.c because they are not used outside this file. This eliminates the following warnings in drbd/drbd_worker.c: drivers/block/drbd/drbd_worker.c:99:6: warning: no previous prototype for ‘drbd_endio_read_sec_final’ [-Wmissing-prototypes] drivers/block/drbd/drbd_worker.c:1276:5: warning: no previous prototype for ‘drbd_send_barrier’ [-Wmissing-prototypes] drivers/block/drbd/drbd_worker.c:1774:6: warning: no previous prototype for ‘need_to_send_barrier’ [-Wmissing-prototypes] drivers/block/drbd/drbd_worker.c:1798:6: warning: no previous prototype for ‘dequeue_work_batch’ [-Wmissing-prototypes] drivers/block/drbd/drbd_worker.c:1806:6: warning: no previous prototype for ‘dequeue_work_item’ [-Wmissing-prototypes] drivers/block/drbd/drbd_worker.c:1815:6: warning: no previous prototype for ‘wait_for_work’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2013-11-24block: Convert bio_for_each_segment() to bvec_iterKent Overstreet
More prep work for immutable biovecs - with immutable bvecs drivers won't be able to use the biovec directly, they'll need to use helpers that take into account bio->bi_iter.bi_bvec_done. This updates callers for the new usage without changing the implementation yet. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: Jim Paris <jim@jtan.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com> Cc: support@lsi.com Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Cc: linux-m68k@lists.linux-m68k.org Cc: linuxppc-dev@lists.ozlabs.org Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net Cc: cbe-oss-dev@lists.ozlabs.org Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Cc: linux-raid@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: DL-MPTFusionLinux@lsi.com Cc: linux-scsi@vger.kernel.org Cc: devel@driverdev.osuosl.org Cc: linux-fsdevel@vger.kernel.org Cc: cluster-devel@redhat.com Cc: linux-mm@kvack.org Acked-by: Geoff Levand <geoff@infradead.org>
2013-03-28drbd: validate resync_after dependency on attach alreadyLars Ellenberg
We validated resync_after dependencies, if changed via disk-options. But we did not validate them when first created via attach. We also did not check or cleanup dependencies that used to be correct, but now point to meanwhile removed minor devices. If the drbd_resync_after_valid() validation in disk-options tried to follow a dependency chain in this way, this could lead to NULL pointer dereference. Validate resync_after settings in drbd_adm_attach() already, as well as in drbd_adm_disk_opts(), and and only reject dependency loops. Depending on non-existing disks is allowed and equivalent to no dependency. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: abort start of resync early, if it raced with connection breakagePhilipp Reisner
We've seen a spurious full resync, because a connection breakage raced with drbd_start_resync(, C_SYNC_TARGET), and the resulting state change request intended to start the resync ended up looking like a local invalidate. Fix: Double check the state inside the lock, and don't even request that state change, if we had connection or IO problems. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-23drbd: Clarify when activity log I/O is delegated to the worker threadLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-23drbd: read meta data early, base on-disk offsets on super blockLars Ellenberg
We used to calculate all on-disk meta data offsets, and then compare the stored offsets, basically treating them as magic numbers. Now with the activity log striping, the activity log size is no longer fixed. We need to first read the super block, then base the activity log and bitmap offsets on the stored offsets/al stripe settings. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-11-09drbd: Broadcast sync progress no more often than once per secondPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09drbd: always write bitmap on detachLars Ellenberg
If we detach due to local read-error (which sets a bit in the bitmap), stay Primary, and then re-attach (which re-reads the bitmap from disk), we potentially lost the "out-of-sync" (or, "bad block") information in the bitmap. Always (try to) write out the changed bitmap pages before going diskless. That way, we don't lose the bit for the bad block, the next resync will fetch it from the peer, and rewrite it locally, which may result in block reallocation in some lower layer (or the hardware), and thereby "heal" the bad blocks. If the bitmap writeout errors out as well, we will (again: try to) mark the "we need a full sync" bit in our super block, if it was a READ error; writes are covered by the activity log already. If that superblock does not make it to disk either, we are sorry. Maybe we just lost an entire disk or controller (or iSCSI connection), and there actually are no bad blocks at all, so we don't need to re-fetch from the peer, there is no "auto-healing" necessary. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09drbd: a few more GFP_KERNEL -> GFP_NOIOLars Ellenberg
This has not yet been observed, but conceivably, when using GFP_KERNEL allocations from drbd_md_sync(), drbd_flush_after_epoch() or receive_SyncParam(), we could trigger additional IO to our own device, or an other device in a criss-cross setup, and end up in a local deadlock, or potentially a distributed deadlock in a criss-cross setup involving the peer blocked in a similar way waiting for us to make progress. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09drbd: use list_move_tail instead of list_del/list_add_tailLars Ellenberg
Using list_move_tail() instead of list_del() + list_add_tail(). spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09drbd: panic on delayed completion of aborted requestsPhilipp Reisner
"aborting" requests, or force-detaching the disk, is intended for completely blocked/hung local backing devices which do no longer complete requests at all, not even do error completions. In this situation, usually a hard-reset and failover is the only way out. By "aborting", basically faking a local error-completion, we allow for a more graceful swichover by cleanly migrating services. Still the affected node has to be rebooted "soon". By completing these requests, we allow the upper layers to re-use the associated data pages. If later the local backing device "recovers", and now DMAs some data from disk into the original request pages, in the best case it will just put random data into unused pages; but typically it will corrupt meanwhile completely unrelated data, causing all sorts of damage. Which means delayed successful completion, especially for READ requests, is a reason to panic(). We assume that a delayed *error* completion is OK, though we still will complain noisily about it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09drbd: dequeue single work items in wait_for_work()Lars Ellenberg
As long as we still use drbd_queue_work_front(), we must only dequeue the single first item during normal operation. The comment in drbd_worker() even says so, but bc8a5a1 drbd: remove struct drbd_tl_epoch objects (barrier works) introduced the batch dequeueing again via list_splice_init() in wait_for_work(). Change back to list_move() of the first item, if any. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09drbd: don't send out P_BARRIER with stale informationLars Ellenberg
We must only send P_BARRIER for epochs we actually sent P_DATA in. If we (re-)establish a connection, we reinitialized the send.current_epoch_nr, but forgot to reset send.current_epoch_writes. This could result in a spurious P_BARRIER with stale epoch information, and a disconnect/reconnect cycle once the then "unexpected" P_BARRIER_ACK is received: BAD! BarrierAck #28823 received, expected #28829! Introduce re_init_if_first_write() and maybe_send_barrier() helpers, and call them appropriately for read/write/set-out-of-sync requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09drbd: introduce stop-sector to online verifyLars Ellenberg
We now can schedule only a specific range of sectors for online verify, or interrupt a running verify without interrupting the connection. Had to bump the protocol version differently, we are now 101. Added verify_can_do_stop_sector() { protocol >= 97 && protocol != 100; } Also, the return value convention for worker callbacks has changed, we returned "true/false" for "keep the connection up" in 8.3, we return 0 for success and <= for failure in 8.4. Affected: receive_state() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: do not reset rs_pending_cnt too earlyLars Ellenberg
Fix asserts like block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 ! We reset the resync lru cache and related information (rs_pending_cnt), once we successfully finished a resync or online verify, or if the replication connection is lost. We also need to reset it if a resync or online verify is aborted because a lower level disk failed. In that case the replication link is still established, and we may still have packets queued in the network buffers which want to touch rs_pending_cnt. We do not have any synchronization mechanism to know for sure when all such pending resync related packets have been drained. To avoid this counter to go negative (and violate the ASSERT that it will always be >= 0), just do not reset it when we lose a disk. It is good enough to make sure it is re-initialized before the next resync can start: reset it when we re-attach a disk. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: differentiate between normal and forced detachLars Ellenberg
Aborting local requests (not waiting for completion from the lower level disk) is dangerous: if the master bio has been completed to upper layers, data pages may be re-used for other things already. If local IO is still pending and later completes, this may cause crashes or corrupt unrelated data. Only abort local IO if explicitly requested. Intended use case is a lower level device that turned into a tarpit, not completing io requests, not even doing error completion. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: remove struct drbd_tl_epoch objects (barrier works)Lars Ellenberg
cherry-picked and adapted from drbd 9 devel branch DRBD requests (struct drbd_request) are already on the per resource transfer log list, and carry their epoch number. We do not need to additionally link them on other ring lists in other structs. The drbd sender thread can recognize itself when to send a P_BARRIER, by tracking the currently processed epoch, and how many writes have been processed for that epoch. If the epoch of the request to be processed does not match the currently processed epoch, any writes have been processed in it, a P_BARRIER for this last processed epoch is send out first. The new epoch then becomes the currently processed epoch. To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK, the sender thread also needs to handle the case when the current epoch was closed already, but no new requests are queued yet, and send out P_BARRIER as soon as possible. This is done by comparing the per resource "current transfer log epoch" (tconn->current_tle_nr) with the per connection "currently processed epoch number" (tconn->send.current_epoch_nr), while waiting for new requests to be processed in wait_for_work(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: move the drbd_work_queue from drbd_socket to drbd_connectionLars Ellenberg
cherry-picked and adapted from drbd 9 devel branch In 8.4, we don't distinguish between "resource work" and "connection work" yet, we have one worker for both, as we still have only one connection. We only ever used the "data.work", no need to keep the "meta.work" around. Move tconn->data.work to tconn->sender_work. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: allow to dequeue batches of work at a timeLars Ellenberg
cherry-picked and adapted from drbd 9 devel branch In 8.4, we still use drbd_queue_work_front(), so in normal operation, we can not dequeue batches, but only single items. Still, followup commits will wake the worker without explicitly queueing a work item, so up() is replaced by a simple wake_up(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: simplify retry path of failed READ requestsLars Ellenberg
If a local or remote READ request fails, just push it back to the retry workqueue. It will re-enter __drbd_make_request, and be re-assigned to a suitable local or remote path, or failed, if we do not have access to good data anymore. This obsoletes w_read_retry_remote(), and eliminates two goto...retry blocks in __req_mod() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: move put_ldev from __req_mod() to the endio callbackLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: fix potential data corruption and protocol errorLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Fixed an obvious copy-n-paste mistakePhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Fixes from the drbd-8.3 branchPhilipp Reisner
* drbd-8.3: drbd: fix spurious meta data IO "error" drbd: Fixed a race condition between detach and start of resync drbd: fix harmless race to not trigger an ASSERT drbd: Derive sync-UUIDs only from the bitmap-uuid if it is non-zero drbd: Fixed current UUID generation (regression introduced recently, after 8.3.11) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Update some outdated comments to match the codeAndreas Gruenbacher
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Fixed w_restart_disk_io() to handle non active AL-extentsPhilipp Reisner
Since we now apply the AL in user space onto the bitmap, the AL is not active for the requests we want to reply. For that a al_write_transaction() that might be called from worker context became necessary. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Missing assignment of mdev before drbd_queue_work()Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: detach from frozen backing devicePhilipp Reisner
* drbd-8.3: documentation: Documented detach's --force and disk's --disk-timeout drbd: Implemented the disk-timeout option drbd: Force flag for the detach operation drbd: Allow new IOs while the local disk in in FAILED state drbd: Bitmap IO functions can not return prematurely if the disk breaks drbd: Added a kref to bm_aio_ctx drbd: Hold a reference to ldev while doing meta-data IO drbd: Keep a reference to the bio until the completion handler finished drbd: Implemented wait_until_done_or_disk_failure() drbd: Replaced md_io_mutex by an atomic: md_io_in_use drbd: moved md_io into mdev drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk drbd: Keep a reference to barrier acked requests Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Fix the upper limit of resync-afterAndreas Gruenbacher
The 32-bit resync_after netlink field takes a device minor number as parameter, which is no longer limited to 255. We cannot statically verify which device numbers are valid, so set the ummer limit to the highest possible signed 32-bit integer. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Removing drbd_cfg_rwsemPhilipp Reisner
* Updates to all configuration items is done under genl_lock(). Including removal of mdevs or tconns. * All read non sleeping read sides are protected by rcu * All sleeping read sides keep reference counts to keep the objects alive Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Turn no-tcp-cork into tcp-cork={yes|no}Andreas Gruenbacher
Change the --no-tcp-cork drbdsetup command line option as well as the no_cork netlink packet. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: RCU for rs_plan_sPhilipp Reisner
This removes the issue with using peer_seq_lock out of different contexts. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Made the fifo object a self contained object (preparing for RCU)Philipp Reisner
* Moved rs_planed into it, named total * When having a pointer to the object the values can be embedded into the fifo object. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: RCU for disk_confPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: Split drbd_alter_sa() into drbd_sync_after_valid() and ↵Philipp Reisner
drbd_sync_after_changed() Preparing RCU for disk_conf Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08drbd: fix thread stop deadlockLars Ellenberg
There are races where the receiver may be exiting, but still need the worker to process some stuff. Do not wait for the receiver to die from an exiting worker. The receiver must already be dead in case the worker decides to exit. If the receiver was still alive, it may still want to queue work, and do drbd_flush_workqueue() from it's disconnect cleanup code, which would no longer be processed by an exiting worker. This also would deadlock, if the worker was to synchornously wait for the receiver to die. Do not implicitly stop the worker. The worker will only be stopped from configuration context, from conn_reconfig_done(), drbd_adm_down() or drbd_adm_delete_connection(), after making sure the receiver is already stopped. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>