summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2011-03-10drbd: --force option for disconnectPhilipp Reisner
As the network connection can be lost at any time, a --force option for disconnect is just a matter of completeness. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: add packet_type 27 (return_code_only) to netlink apiLars Ellenberg
In case we ever should add an other packet type, we must not reuse 27, as that currently used for "empty" return code only replies. Document it as such. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: use kzalloc and memset(,0,) to start with clean buffers in drbd_nlLars Ellenberg
Make sure we start with clean buffers to not accidentally send garbage back to userspace. Note: has not been observed; but just in case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: remove /proc/drbd before unregistering from netlinkLars Ellenberg
There still exists a (theoretical) race on module unload, where /proc/drbd may still exist, but the netlink callback has been unregistered already, allowing drbdsetup to shout without listeners, and get no reply. Reorder remove_proc_entry and unregister of netlink callback. drbdsetup first checks for existence of the proc entry, and if that is missing, won't even try to contact the module. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: increase module count on /proc/drbd accessLars Ellenberg
If someone holds /proc/drbd open, previously rmmod would "succeed" in starting the unload, but then block on remove_proc_entry, leading to a situation where the lsmod does not show drbd anymore, but /proc/drbd being still there (but no longer accessible). I'd rather have rmmod fail up front in this case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Removed 20 seconds upper bound for side-steppingPhilipp Reisner
Given low-enough network bandwidth combined with a IO pattern that hammers onto a single RS-extent, side-stepping might be necessary for much longer times. Changed the code to print a single informal message after 20 seconds, but it keeps on stepping aside forever. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Becoming sync target may not happen out of < C_WF_REPORT_PARAMSPhilipp Reisner
This patch is acutally a necessary addendum to the patch "fix for spurious full sync (becoming sync target looked like invalidate)" Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Starting with protocol 96 we can allow app-IO while receiving the bitmapPhilipp Reisner
* C_STARTING_SYNC_S, C_STARTING_SYNC_T In these states the bitmap gets written to disk. Locking out of app-IO is done by using the drbd_queue_bitmap_io() and drbd_bitmap_io() functions these days. It is no longer necessary to lock out app-IO based on the connection state. App-IO that may come in after the BITMAP_IO flag got cleared before the state transition to C_SYNC_(SOURCE|TARGET) does not get mirrored, sets a bit in the local bitmap, that is already set, therefore changes nothing. * C_WF_BITMAP_S In this state we send updates (P_OUT_OF_SYNC packets). With that we make sure they have the same number of bits when going into the C_SYNC_(SOURCE|TARGET) connection state. * C_UNCONNECTED: The receiver starts, no need to lock out IO. * C_DISCONNECTING: in drbd_disconnect() we had a wait_event() to wait until ap_bio_cnt reaches 0. Removed that. * C_TIMEOUT, C_BROKEN_PIPE, C_NETWORK_FAILURE C_PROTOCOL_ERROR, C_TEAR_DOWN: Same as C_DISCONNECTING * C_WF_REPORT_PARAMS: IO still possible since that is still like C_WF_CONNECTION. And we do not need to send barriers in C_WF_BITMAP_S connection state. Allow concurrent accesses to the bitmap when receiving the bitmap. Everything gets ORed anyways. A drbd_free_tl_hash() is in after_state_chg_work(). At that point all the work items of the last connections must have been processed. Introduced a call to drbd_free_tl_hash() into drbd_free_mdev() for paranoia reasons. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Improvements in sanitize_state()Philipp Reisner
The relevant change is that the state change to C_FW_BITMAP_S should implicitly change pdsk to C_CONSISTENT. (Think of it as C_OUTDATED, only without the guarantee that the peer has the outdated written to its meta data) At that opportunity I restructured the switch statement so that it gets evaluated every time. (Has declarative character) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Fixed race condition in drbd_queue_bitmap_ioPhilipp Reisner
May only test for ap_bio_cnt == 0 under req_lock. It can increase only under req_lock. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Fixed inc_ap_bio()Philipp Reisner
The condition must be checked after perpare_to_wait(). The old implementaion could loose wakeup events. Never observed in real life. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: use test_and_set_bit() to decide if bm_io_work should be queuedPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Begin to account BIO processing time before inc_ap_bio()Philipp Reisner
Since inc_ap_bio() might sleep already Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Implemented side-stepping in drbd_res_begin_io()Philipp Reisner
Before: drbd_rs_begin_io() locked app-IO out of an RS extent, and waited then until all previous app-IO in that area finished. (But not only until the disk-IO was finished but until the barrier/epoch ack came in for that == round trip time latency ++) After: As soon as a new app-IO waits wants to start new IO on that RS extent, drbd_rs_begin_io() steps aside (clearing the BME_NO_WRITES flag again). It retries after 100ms. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Make some functions staticPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Implemented priority inheritance for resync requestsPhilipp Reisner
We only issue resync requests if there is no significant application IO going on. = Application IO has higher priority than resnyc IO. If application IO can not be started because the resync process locked an resync_lru entry, start the IO operations necessary to release the lock ASAP. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Do not cleanup resync LRU for the Ahead/Behind SyncSource/SyncTarget ↵Philipp Reisner
transitions This one should be replaced with moving this cleanup to the 'right' position. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: When proxy's buffer drained off go into regular resync modePhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNCPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Implemented two new connection states Ahead/BehindPhilipp Reisner
In this connection mode, the ahead node no longer replicates application IO. The behind's disk becomes out dated. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: New configuration parameters for dealing with network congestionPhilipp Reisner
net { on_congestion {block|pull-ahead|disconnect}; congestion-fill {sectors}; congestion-extents {al-extents}; } Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Track the numbers of sectors in flightPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Renamed write_flags_to_bio() to wire_flags_to_bio()Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: restore compatibility with 32bit kernelsLars Ellenberg
With commit drbd: further converge progress display of resync and online-verify accidentally an u64/u64 div was introduced, causing an unresolvable symbol __udivdi3 to be reference. Actually for that division, 32bit are still suficient for now, so we can revert to unsigned long instead. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: properly use max_hw_sectors to limit the our bio sizeLars Ellenberg
To ease tracking of bios in some hash tables, we want it to not cross certain boundaries (128k, used to be 32k). We limit the maximum bio size using queue parameters. Historically some defines and variables we use there have been named max_segment_size, which was misguided. Rename them to max_bio_size, and use [blk_]queue_max_hw_sectors where appropriate. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: debug: limit nelink-broadcast of request on digest mismatch to 32kLars Ellenberg
We used to be limited to 32k requests, but have increased that limit to 128k now. This part of the code can only deal with 32k, it would scramble arbitrary pages for larger requests. As it is used for debugging only anyways, it is ok to simply truncate the dumped data here. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: detect modification of in-flight buffersLars Ellenberg
With data-integrity digest enabled, double-check on the sending side for modifications by upper layers of buffers under write back, so we can tell it appart from corruption on the "wire". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: further converge progress display of resync and online-verifyLars Ellenberg
Show progressbar and ETA always, with proc_details >= 1 also show the current sector position for both resync and online-verify on both nodes. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: fix potential wrap of 32bit oos:%lu display in /proc/drbdLars Ellenberg
When converting bits (4k resolution, still) to kB, we shift left. If it was a large number of bits on a 32bit box (>= 4 TiB storage), we may wrap the 32bit unsigned long base type, resulting in incorrect display. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: use the resync controller for online-verify requests as wellLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: factor out drbd_rs_number_requestsLars Ellenberg
Preparation patch to be able to use the auto-throttling resync controller for online-verify requests as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: factor out drbd_rs_controller_resetLars Ellenberg
Preparation patch to be able to use the auto-throttling resync controller for online-verify requests as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: show progress bar and ETA for online-verifyLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: advance progress step marks for online-verifyLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: factor out advancement of resync marks for progress reportingLars Ellenberg
This is in preparation to unify progress reporting of online-verify and resync requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: initialize online-verify progress tracking on verify targetLars Ellenberg
For partial (resumed) online verify, initialize the resync step marks once we know what the online verify start sector is. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: improve online-verify progress trackingLars Ellenberg
For a partial (resumed) online-verify, initialize rs_total not to total bits, but to number of bits to check in this run, to match the meaning rs_total has for actual resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: only reset online-verify start sector if verify completedLars Ellenberg
For network hickups during online-verify, on the next verify triggered, we by default want to resume where it left off. After any replication link interruption, there will be a (possibly empty) resync. Do not reset online-verify start sector if some resync completed, that would defeats the purpose. Only reset the start sector once a verify run is completed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/coreJens Axboe
Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10blk-throttle: Use blk_plug in throttle dispatchVivek Goyal
Use plug in throttle dispatch also as we are dispatching a bunch of bios in throttle context and some of them might merge. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: kill off REQ_UNPLUGJens Axboe
With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10aio: remove request submission batchingJens Axboe
This should be useless now that we have on-stack plugging. So lets just kill it. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10fs: make aio plugShaohua Li
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10fs: make mpage read/write_pages() plugJens Axboe
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10read-ahead: use pluggingJens Axboe
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10fs: make generic file read/write functions plugJens Axboe
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: remove per-queue pluggingJens Axboe
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: initial patch for on-stack per-task pluggingJens Axboe
This patch adds support for creating a queuing context outside of the queue itself. This enables us to batch up pieces of IO before grabbing the block device queue lock and submitting them to the IO scheduler. The context is created on the stack of the process and assigned in the task structure, so that we can auto-unplug it if we hit a schedule event. The current queue plugging happens implicitly if IO is submitted to an empty device, yet callers have to remember to unplug that IO when they are going to wait for it. This is an ugly API and has caused bugs in the past. Additionally, it requires hacks in the vm (->sync_page() callback) to handle that logic. By switching to an explicit plugging scheme we make the API a lot nicer and can get rid of the ->sync_page() hack in the vm. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10scsi: convert to blk_delay_queue()Jens Axboe
It was always abuse to reuse the plugging infrastructure for this, convert it to the (new) real API for delaying queueing a bit. A default delay of 3 msec is defined, to match the previous behaviour. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10ide-cd: convert to blk_delay_queue() for a short pauseJens Axboe
It was always abuse to reuse the plugging infrastructure for this, convert it to the (new) real API for delaying queueing a bit. Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Acked-by: David S. Miller <davem@davemloft.net>