From 588377d6199034c36d335e7df5818b731fea072c Mon Sep 17 00:00:00 2001
From: Alex Elder
Date: Mon, 8 Oct 2012 20:37:30 -0700
Subject: rbd: reset BACKOFF if unable to re-queue

If ceph_fault() is unable to queue work after a delay, it sets the
BACKOFF connection flag so con_work() will attempt to do so.

In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
result in newly-queued work, it simply ignores this condition and
proceeds as if no backoff delay were desired.  There are two
problems with this--one of which is a bug.

The first problem is simply that the intended behavior is to back
off, and if we aren't able to queue the work item to run after a
delay we're not doing that.

The only reason queue_delayed_work() won't queue work is if the
provided work item is already queued.  In the messenger, this
means that con_work() is already scheduled to be run again.  So
if we simply set the BACKOFF flag again when this occurs, we know
the next con_work() call will again attempt to hold off activity
on the connection until after the delay.

The second problem--the bug--is a leak of a reference count.  If
queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
the connection reference held on entry to con_work().  However,
processing is (was) allowed to continue, and at the end of the
function a second con->ops->put() is called.

This patch fixes both problems.

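The two problems described above can be modelled in a small userspace
sketch. This is not the kernel code: every name below (toy_con, toy_put,
toy_work_old, toy_work_new, toy_refs_after, toy_self_check) is
hypothetical, and the model assumes the connection starts with two
references, one held by the already-queued work item and one held by the
current con_work() run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connection; NOT struct ceph_connection. */
struct toy_con {
	int refs;        /* reference count */
	bool backoff;    /* models CON_FLAG_BACKOFF */
	bool processed;  /* set when normal processing ran */
};

/* Models con->ops->put(): drop one reference. */
static void toy_put(struct toy_con *con)
{
	con->refs--;
}

/* Pre-patch flow: a failed re-queue drops a reference, then falls through. */
static void toy_work_old(struct toy_con *con, bool requeue_ok)
{
	if (con->backoff) {
		con->backoff = false;   /* models test_and_clear_bit() */
		if (requeue_ok)
			return;         /* newly queued work keeps our reference */
		toy_put(con);           /* first put */
	}
	con->processed = true;          /* continues with no back-off at all */
	toy_put(con);                   /* second put: one reference too many */
}

/* Patched flow: set BACKOFF again and leave through the single put. */
static void toy_work_new(struct toy_con *con, bool requeue_ok)
{
	if (con->backoff) {
		con->backoff = false;
		if (requeue_ok)
			return;
		con->backoff = true;    /* next run will try to back off again */
		goto done;
	}
	con->processed = true;
done:
	toy_put(con);                   /* exactly one put on this path */
}

/* Reference count left after one run whose re-queue attempt fails. */
int toy_refs_after(void (*work)(struct toy_con *, bool))
{
	struct toy_con con = { .refs = 2, .backoff = true, .processed = false };

	work(&con, false);
	return con.refs;
}

/* Self-check of both flows under the assumptions stated above. */
void toy_self_check(void)
{
	assert(toy_refs_after(toy_work_old) == 0);  /* queued work lost its ref */
	assert(toy_refs_after(toy_work_new) == 1);  /* queued work still holds one */
}
```

In this model the old flow leaves the already-queued work item without a
reference, while the patched flow drops only the reference held by the
current run and keeps the back-off request pending.
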
Signed-off-by: Alex Elder
Reviewed-by: Sage Weil

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 159aa8b..cad0d17 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2300,10 +2300,11 @@ restart:
 			mutex_unlock(&con->mutex);
 			return;
 		} else {
-			con->ops->put(con);
 			dout("con_work %p FAILED to back off %lu\n",
 			     con, con->delay);
+			set_bit(CON_FLAG_BACKOFF, &con->flags);
 		}
+		goto done;
 	}
 
 	if (con->state == CON_STATE_STANDBY) {
-- 
cgit v0.10.2

From 9bd952615a42d7e2ce3fa2c632e808e804637a1a Mon Sep 17 00:00:00 2001
From: Sage Weil
Date: Wed, 24 Oct 2012 16:12:58 -0700
Subject: libceph: avoid NULL kref_put when osd reset races with alloc_msg

The ceph_con_in_msg_alloc() method drops con->mutex while it
allocates a message.  If that races with a timeout that resends a
zillion messages and resets the connection, and the ->alloc_msg()
method returns a NULL message, it will call ceph_msg_put(NULL) and
BUG.

Fix by only calling put if msg is non-NULL.

Fixes http://tracker.newdream.net/issues/3142

Signed-off-by: Sage Weil

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index cad0d17..3ef1759 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2750,7 +2750,8 @@ static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
 	msg = con->ops->alloc_msg(con, hdr, skip);
 	mutex_lock(&con->mutex);
 	if (con->state != CON_STATE_OPEN) {
-		ceph_msg_put(msg);
+		if (msg)
+			ceph_msg_put(msg);
 		return -EAGAIN;
 	}
 	con->in_msg = msg;
-- 
cgit v0.10.2

From 52eb5a900a9863a8b77a895f770e5d825c8e02c6 Mon Sep 17 00:00:00 2001
From: David Zafman
Date: Thu, 18 Oct 2012 14:01:43 -0700
Subject: ceph: fix dentry reference leak in encode_fh()

Call to d_find_alias() needs a corresponding dput().

This fixes http://tracker.newdream.net/issues/3271

Signed-off-by: David Zafman
Reviewed-by: Sage Weil

diff --git a/fs/ceph/export.c b/fs/ceph/export.c
index 8e1b60e..8628870 100644
--- a/fs/ceph/export.c
+++ b/fs/ceph/export.c
@@ -90,6 +90,8 @@ static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len,
 		*max_len = handle_length;
 		type = 255;
 	}
+	if (dentry)
+		dput(dentry);
 	return type;
 }
 
-- 
cgit v0.10.2